Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Develop stream 2024-07-01 #512

Merged
merged 69 commits into from
Aug 8, 2024

Conversation

Naraenda
Copy link
Member

@Naraenda Naraenda commented Jul 1, 2024

This PR brings various updates, intended for ROCm 6.3.

It contains the following merge commits:

  • 0350e0f perf: improve performance on rocm 6.2.
  • ade3f6b fix: performance regression introduced with hip graph support.
  • c90e74c docs: add documentation for hip graph support
  • fa13dd7 feat: hip graph support for poisson distribution
  • 4224428 feat: host implementation for mt19937
  • 615fc5a perf: use alias method instead of binary search in cdf
  • 8f7e2af docs: reference GPU_TARGETS instead of AMDGPU_TARGET
  • 6807269 fix: fix host generators not always supporting large sizes
  • 50dab4f refactor: remove deprecated internal headers
  • 80c16d4 test: improve accuracy of poisson histogram test
  • 1a44b48 docs: fix styling

This PR does not contain fixes for the recent performance regressions, we'll either add them here or create a new PR for those.

@Naraenda Naraenda self-assigned this Jul 1, 2024
@Naraenda
Copy link
Member Author

Naraenda commented Jul 1, 2024

@samjwu The clang format CI step seems to not handle external PRs properly, I think this might also be happening in other repositories.

@Naraenda Naraenda marked this pull request as ready for review July 1, 2024 15:10
@Naraenda
Copy link
Member Author

Naraenda commented Jul 3, 2024

@stanleytsang-amd Let me know if I should squash the commits into their respective merge commits.

@stanleytsang-amd
Copy link
Collaborator

stanleytsang-amd commented Jul 4, 2024

@samjwu The clang format CI step seems to not handle external PRs properly, I think this might also be happening in other repositories.
@samjwu Can you take a look at the clang-format CI?

Run git fetch origin develop
From https://github.com/ROCm/rocRAND

  • branch develop -> FETCH_HEAD
  • [new branch] develop -> origin/develop
    fatal: couldn't find remote ref develop_stream_20240701
    Error: Process completed with exit code 128.

@Naraenda
Copy link
Member Author

Naraenda commented Jul 9, 2024

Rebased to get the clang-format ci fixes.

mfep and others added 15 commits July 24, 2024 07:55
rocm-docs-core distributes headers and stylesheets for doxygen for
embedding its HTML output into sphinx. These mostly fix dark-theme
and other minor visual issues when doxygen output is used this way.
docs(api reference): rocm-docs-core headers and stylesheets in doxyfile

See merge request amd/libraries/rocRAND!326
…ream'

Improve accuracy of Poisson histogram test

Closes ROCm#240

See merge request amd/libraries/rocRAND!327
…tream'

Resolve "Remove deprecated internal headers"

Closes ROCm#341

See merge request amd/libraries/rocRAND!330
AMDGPU_TARGETS doesn't pick up updates correctly (needs cache clean) whereas GPU_TARGETS does. Every other doc and CI too refers to GPU_TARGETS.
Resolve "Some host generators might not support large sizes due to min / max"

See merge request amd/libraries/rocRAND!329
mfep and others added 20 commits July 24, 2024 07:56
Recent changes required for HIP graph support added a new path with
approximation of Poisson with normal distribution when lambda is large.
However, the decision whether to use the alias/CDF methods or the
approximation is made in the kernel for every generated value even
though lambda is the same.
This change moves it to host side: depending on lambda the kernel is
launched with one of two distributions (poisson_distribution or
poisson_distribution_huge).
Resolve "Document HIP Graph support"

Closes ROCm#360

See merge request amd/libraries/rocRAND!335
…p_stream'

Fix performance regression of Poisson distribution introduced by HIP graph support

Closes ROCm#366

See merge request amd/libraries/rocRAND!336
hipcc from ROCm 6.2 does not add `-mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false`
by default.
Improve performance on ROCm 6.2

See merge request amd/libraries/rocRAND!337
ci(.gitlab-ci.yml): replace 'ROCM_PATH' variable with 'env:HIP_PATH' as the former is unintuitive

See merge request amd/libraries/rocRAND!339
Remove unused FindTestU01.cmake

See merge request amd/libraries/rocRAND!342
…te' into 'develop_stream'

Resolve "Add check for nullptr data when calling host generate."

Closes ROCm#369

See merge request amd/libraries/rocRAND!341
operator >> has higher precedence than operator &.
This bug causes very low quality in crush tests.
…ream'

Fix threefry2x64 and threefry4x64

Closes ROCm#371

See merge request amd/libraries/rocRAND!343
@Naraenda
Copy link
Member Author

Rebased and added threefry fixes by @ex-rzr

Copy link
Collaborator

@umfranzw umfranzw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've finished working my way through this one, and I think it looks good. All our CI checks are also passing now.
@stanleytsang-amd if you have no objections, I'm ok with this one going in.

Copy link
Collaborator

@stanleytsang-amd stanleytsang-amd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the packaging version is updated.

CMakeLists.txt Show resolved Hide resolved
@stanleytsang-amd stanleytsang-amd merged commit a296589 into ROCm:develop Aug 8, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants