    Develop Stream 2022-11-14 (#300) · 11cf251f
    Nol Moonen authored
    
    
    * First attempt to port. Committing to switch remote machine
    
    * Unnecessary includes removed, now using CUDA events for time measurement and Readme.md and CMakeLists.txt updated. Benchmark output compared with legacy output and no significant differences were found.
    
    * Engine CURAND_RNG_PSEUDO_MT19937 removed (it was not in legacy benchmark)
    
    * Fix format
    
    * gitlab-ci fix
    
    * Small fixes. Changed benchmark_curand_generate back to legacy and created benchmark_curand_host_api with new implementation
    
    * Format fix
    
    * Bracket fix
    
    * Line fix
    
    * Order parameter added. rocrand_set_ordering added for mrg32k3a engine
    
    * Fixed some comments, removed OrderingType and modified rocrand.hpp
    
    * add nvcc-specific hiprand cmake variable
    
    * Line break
    
    * Fixed ordering options documentation and added support for the remaining generators
    
    * Copyright year updated
    
    * Added delegating constructor for rocrand_generator_type, added incorrect order value check in rocrand_set_ordering and fixed some copydoc
    
    * Minor fixes
    
    * Parameters order fix
    
    * Reordering members of rocrand_generator_type
    
    * implement mt19937
    
    * Changed arguments order in generators constructors
    
    * fix incorrect free call
    
    * small fix in head and tail processing
    
    * fix freeing wrong pointer in test
    
    * restore merge error
    
    * remove unused variable
    
    * minor performance improvement
    
    * formatting, workaround for warp size on nvidia
    
    * review comments round 1
    
    * review comments round 2
    
    * review comments round 3
    
    * skipping test now checks for subsequence size that is not multiple of state size
    
    * review comments round 4
    
    * review comments round 5
    
    * add mt19937 to benchmark_curand_host_api
    
    * small readability improvement
    
    * add mt19937 to readme
    
    * review comments round 6
    
    * Workaround stream hang
    
    * remove workaround stream hang
    
    * add rocrand kernel benchmark using google benchmark
    
    * review comments round 1
    
    * fix incorrect name for 64-bits generators
    
    * replace deprecated built-in hip variables
    
    * hip-cpu compatibility
    
    * fix kernel benchmark for scrambled sobol
    
    * fix sobol64 test type
    
    * add curand kernel benchmark using google benchmark
    
    * review comments round 1
    
    * Remove the Python2 test case from CI
    
    * Port rocRAND to HIP-CPU
    
    * suppress warning on integer conversion
    
    * Add draft of rocrand_threefry.h
    
    * Update draft of ThreeFry
    
    * Finish implementing ThreeFry
    
    * Add kernel benchmark for ThreeFry
    
    * Add a test for ThreeFry
    
    * Add uint2_64
    
    * Threefry bug fixes
    
    * Add a test to generate raw binary data of PRNG engine to use with Dieharder
    
    * Fix engine and some improvements
    
    * Update threefry2x64 and add threefry2x32, threefry4x32 and threefry4x64
    
    * Extend benchmarks
    * Add test kernel threefry
    * Enable normal tests
    * Enable generate threefry tests
    * Bugfix for lognormal and normal
    * Add to benchmark_rocrand_host_api.cpp and rocrand_kernel.h
    * Deal with more than 2^32 numbers at discard state
    * Disable trivially copyable test
    * Updated Changelog and Readme
    * Rename threefry and add caching
    * Rename state and macro
    * Implement 4 different threefry
    * Clang-format
    * Update python tests
    
    * Update the threefry prngs and tests
    
    * 64 bit partial support
    * 2 state generate kernel incorrect
    
    * Fix unit tests
    
    * clang-format
    
    * resolve -Wimplicit-const-int-float-conversion
    
    * fix overflow in mean calculation
    
    * fix bug in discard and discard subsequence, update tests accordingly, clarify documentation
    
    * fix another occurrence of overflow in mean calculation, improve first instance
    
    * use proper divisor in continuity test for 64-bits threefry generator
    
    * add ordering to threefry
    
    * unify generate_64 and generate_long_long
    
    * enable threefry host tests, remove unnecessary compile option that gave problems
    
    * small fix in comments, use expf instead of exp where possible
    
    * fix 64-bits poisson distribution interface, benchmarks, and tests
    
    * run clang-format
    
    * minor improvements
    
    * disable python test for 32-bits poisson from 64-bits generator
    
    * fix merge problems
    
    * add to device api benchmark
    
    * remove 64-bits poisson
    
    * hip-cpu compatibility
    
    * fix wrong type in test
    
    * review comments round 1
    
    * undo loop unroll
    
    * disable 4x64 test for hip-cpu
    
    * simplify core for loop
    
    * add missing default dimension definition
    
    * fix poisson benchmark name
    
    * apply benchmark format changes to threefry
    
    * parity test infrastructure
    
    * parity test generation/running
    
    * report results in csv
    
    * rocrand_set_ordering for mt19937
    
    * make parity test standalone
    
    * formatting
    
    * parity test ci
    
    * remove todo comment
    
    * fix typo
    
    * improve rocrand docs
    
    * fix versions
    
    * move source to source directory, split up c/c++ api reference into separate sections
    
    * python docs: fix crossreferencing, constructor docs
    
    * put generator types docs in programmer's guide
    
    * review fixes
    
    * improve incremental compilation time
    
    The large sobol constant arrays now live in a separate compilation unit
    instead of the header. They are therefore no longer recompiled when
    something is altered in the main library, which should reduce the time it
    takes to incrementally compile rocrand after a modification to the core
    library.
    
    * ordering documentation
    
    * add curand compatibility table
    
    * use clang-15 to compile hip-cpu
    
    * fix linking errors on windows
    Co-authored-by: Beatriz Navidad Vilches <beatriz@streamhpc.com>
    Co-authored-by: Mátyás Aradi <matyas@streamhpc.com>
    Co-authored-by: Théo Battrel <theo@streamhpc.com>
    Co-authored-by: Istvan Kiss <istvan@streamhpc.com>
    Co-authored-by: Robin Voetter <robin@streamhpc.com>