[JAX] GEMM custom op #1855
Open
denera wants to merge 41 commits into NVIDIA:main from denera:jax/nvte-cublas-gemm-op
Commits
cf1774c  added XLA FFI custom op for TE/common nvte_cublas_gemm (denera)
da0709a  minor unit test cleanup (denera)
e5b933c  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
92dec51  FP8 tests passing on Blackwell but MXFP8 outputs NaN (denera)
50d319b  Merge branch 'jax/nvte-cublas-gemm-op' of github.com:denera/Transform… (denera)
9eba586  reverted dense and fuseddense changes, FP8 test passing on Hopper and… (denera)
b80e284  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
a7aa2f4  MXFP8 issue traced to scale factor padding with NaNs instead of zeros (denera)
1be8773  padding scale with 2^-127 instead of nans (phu0ngng)
75008de  fix bug on rhs_scale_inv usage (phu0ngng)
5b0c1f5  cleanup E8M0 type converter use it in gemm.cpp (phu0ngng)
b49d586  segfault fixed, passing all unittests on Blackwell (denera)
b760460  merge with main (phu0ngng)
bd9bca3  fix for fuseddense tests (phu0ngng)
8fcb1bb  fix workspace alignment (phu0ngng)
b2b4159  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
ae4828c  fixed GemmPrimitive custom partitioning to match jax.nn.scaled_matmul (denera)
17d7a51  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
ddaaab9  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
44e5b81  moved reshape of encoder output in encoder examples to make custom pa… (denera)
a281c97  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
b8ca0b1  added helper functions for padding and unpadding block scales, change… (denera)
3ee96ba  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
7187582  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
0230a5e  updated shardy rules for all custom ops to decouple block scale rules… (denera)
dedf5e9  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
875f401  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
995fb11  fixed linting errors (denera)
cb3613d  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
d9e55a9  changed unit test use_jax_gemm option to be a context to preserve ext… (denera)
9c2a56c  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
e850ab5  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
4db82a0  fixed typo in test utils (denera)
14da5c8  added sequence-first input warnings (denera)
d5cb233  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
8cbe0e2  fixed datasets version for JAX examples (denera)
3c82160  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
66eab76  reverting modification to force_1x_quantization decision (denera)
9781ebf  corrected gemm function syntax in unit tests (denera)
bb174bb  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
8532ad0  Merge remote-tracking branch 'upstream/main' into jax/nvte-cublas-gem… (denera)
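
Note on commits a7aa2f4 and 1be8773: the sketch below illustrates why the fill value used when padding MXFP8 block scales matters. It is not code from this PR. It assumes the scales are stored as E8M0 bytes (8 exponent bits, bias 127, with 0xFF as the only NaN encoding), so a fill byte of 0x00 decodes to 2^-127 while 0xFF decodes to NaN. The helper names `e8m0_to_float` and `pad_scales` and the 4x4 padded shape are hypothetical and chosen only for illustration; the actual alignment used by the GEMM is not stated on this page.

```python
import math

E8M0_BIAS = 127
E8M0_NAN = 0xFF   # the only NaN encoding in E8M0
E8M0_MIN = 0x00   # decodes to 2**-127, the smallest E8M0 value


def e8m0_to_float(byte: int) -> float:
    """Decode one E8M0 scale byte (exponent only, no sign, no mantissa)."""
    if byte == E8M0_NAN:
        return math.nan
    return math.ldexp(1.0, byte - E8M0_BIAS)


def pad_scales(scales, padded_rows, padded_cols, fill=E8M0_MIN):
    """Pad a 2D list of E8M0 bytes to (padded_rows, padded_cols) with `fill`."""
    rows = [list(r) + [fill] * (padded_cols - len(r)) for r in scales]
    rows += [[fill] * padded_cols for _ in range(padded_rows - len(rows))]
    return rows


# Hypothetical 2x2 block of scales: 1.0, 2.0 / 0.5, 1.0
scales = [[127, 128], [126, 127]]

good = pad_scales(scales, 4, 4)                # padding decodes to 2**-127
bad = pad_scales(scales, 4, 4, fill=E8M0_NAN)  # padding decodes to NaN

decoded_good = [[e8m0_to_float(b) for b in row] for row in good]
decoded_bad = [[e8m0_to_float(b) for b in row] for row in bad]
```

With the 0x00 fill every padded entry is a tiny but finite scale, whereas the 0xFF fill leaves NaN entries in the padded region that can propagate if a kernel reads them.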