Nol Moonen authored
* First attempt to port. Committing to switch remote machine
* Unnecessary includes removed, now using CUDA events for time measurement and Readme.md and CMakeLists.txt updated. Benchmark output compared with legacy output and no significant differences were found.
* Engine CURAND_RNG_PSEUDO_MT19937 removed (it was not in legacy benchmark)
* Fix format
* gitlab-ci fix
* Small fixes. Changed benchmark_curand_generate back to legacy and created benchmark_curand_host_api with new implementation
* Format fix
* Bracket fix
* Line fix
* Order parameter added. rocrand_set_ordering added for mrg32k3a engine
* Fixed some comments, removed OrderingType and modified rocrand.hpp
* add nvcc-specific hiprand cmake variable
* Line break
* Fixed ordering options documentation and added support for the generators left
* Copyright year updated
* Added delegating constructor for rocrand_generator_type, added incorrect order value check in rocrand_set_ordering and fixed some copydoc
* Minor fixes
* Parameters order fix
* Reordering members of rocrand_generator_type
* implement mt19937
* Changed arguments order in generators constructors
* fix incorrect free call
* small fix in head and tail processing
* fix freeing wrong pointer in test
* restore merge error
* remove unused variable
* minor performance improvement
* formatting, workaround for warp size on nvidia
* review comments round 1
* review comments round 2
* review comments round 3
* skipping test now checks for subsequence size that is not multiple of state size
* review comments round 4
* review comments round 5
* add mt19937 to benchmark_curand_host_api
* small readability improvement
* add mt19937 to readme
* review comments round 6
* Workaround stream hang
* remove workaround stream hang
* add rocrand kernel benchmark using google benchmark
* review comments round 1
* fix incorrect name for 64-bits generators
* replace deprecated built-in hip variables
* hip-cpu compatibility
* fix kernel benchmark for scrambled sobol
* fix sobol64 test type
* add curand kernel benchmark using google benchmark
* review comments round 1
* Remove the Python2 test case from CI
* Port rocRAND to HIP-CPU
* suppress warning on integer conversion
* Add draft of rocrand_threefry.h
* Update draft of ThreeFry
* Finish implementing ThreeFry
* Add kernel benchmark for ThreeFry
* Add a test for ThreeFry
* Add uint2_64
* Threefry bug fixes
* Add a test to generate raw binary data of PRNG engine to use with Dieharder
* Fix engine and some improvements
* Update threefry2x64 and add threefry2x32, threefry4x32 and threefry4x64
* Extend benchmarks
* Add test kernel threefry
* Enable normal tests
* Enable generate threefry tests
* Bugfix for lognormal and normal
* Add to benchmark_rocrand_host_api.cpp and rocrand_kernel.h
* Deal with more than 2^32 numbers at discard state
* Disable trivially copyable test
* Updated Changelog and Readme
* Rename threefry and add caching
* Rename state and macro
* Implement 4 different threefry
* Clang-format
* Update python tests
* Update the threefry prngs and tests
* 64 bit partial support
* 2 state generate kernel incorrect
* Fix unit tests
* clang-format
* resolve -Wimplicit-const-int-float-conversion
* fix overflow in mean calculation
* fix bug in discard and discard subsequence, update tests accordingly, clarify documentation
* fix another occurrence of overflow in mean calculation, improve first instance
* use proper divisor in continuity test for 64-bits threefry generator
* add ordering to threefry
* unify generate_64 and generate_long_long
* enable threefry host tests, remove unnecessary compile option that gave problems
* small fix in comments, use expf instead of exp where possible
* fix 64-bits poisson distribution interface, benchmarks, and tests
* run clang-format
* minor improvements
* disable python test for 32-bits poisson from 64-bits generator
* fix merge problems
* add to device api benchmark
* remove 64-bits poisson
* hip-cpu compatibility
* fix wrong type in test
* review comments round 1
* undo loop unroll
* disable 4x64 test for hip-cpu
* simplify core for loop
* add missing default dimension definition
* fix poisson benchmark name
* apply benchmark format changes to threefry
* parity test infrastructure
* parity test generation/running
* report results in csv
* rocrand_set_ordering for mt19937
* make parity test standalone
* formatting
* parity test ci
* remove todo comment
* fix typo
* improve rocrand docs
* fix versions
* move source to source directory, split up c/c++ api reference into separate sections
* python docs: fix crossreferencing, constructor docs
* put generator types docs in programmer's guide
* review fixes
* improve incremental compilation time

  Instead of putting the large sobol constant arrays in the header, they are now put in a separate compilation unit. This prevents recompilation of those files when something is altered in the main library, and so should reduce the time it takes to incrementally compile rocrand after a modification to the core library.
* ordering documentation
* add curand compatibility table
* use clang-15 to compile hip-cpu
* fix linking errors on windows

Co-authored-by: Beatriz Navidad Vilches <beatriz@streamhpc.com>
Co-authored-by: Mátyás Aradi <matyas@streamhpc.com>
Co-authored-by: Théo Battrel <theo@streamhpc.com>
Co-authored-by: Istvan Kiss <istvan@streamhpc.com>
Co-authored-by: Robin Voetter <robin@streamhpc.com>
11cf251f