Fix avx512vbmi swizzle_dyn implementation #431

Merged: 1 commit into rust-lang:master on Aug 27, 2024

Conversation

@cvijdea-bd (Contributor) commented Aug 25, 2024

With target_feature = "avx512vbmi", swizzle_dyn with N = 32 did not set output lanes to 0 when the input index was out of range. It used _mm256_permutexvar_epi8 (vpermb), which, unlike _mm256_shuffle_epi8 (vpshufb), does not zero those lanes.

This PR fixes the problem and adds the avx512vbmi implementation for N = 64.
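To make the mismatch concrete, here is a minimal scalar sketch of the two behaviours for N = 32. It is not the code from this PR: the function names are hypothetical, the models use plain arrays rather than intrinsics, and the zeroing variant only illustrates one possible way to restore the expected result (clearing lanes whose index is out of range), which may differ from what the commit actually does.

```rust
/// Scalar model of the contract described above for `swizzle_dyn`:
/// a lane whose index is out of range (>= N) is set to 0.
/// (Hypothetical helper, for illustration only.)
fn swizzle_dyn_model<const N: usize>(bytes: [u8; N], idxs: [u8; N]) -> [u8; N] {
    let mut out = [0u8; N];
    for i in 0..N {
        let j = idxs[i] as usize;
        if j < N {
            out[i] = bytes[j];
        }
    }
    out
}

/// Scalar model of vpermb on a 32-byte vector: only the low 5 bits of
/// each index are used, so out-of-range indices wrap instead of zeroing.
fn vpermb32_model(bytes: [u8; 32], idxs: [u8; 32]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = bytes[(idxs[i] & 0x1f) as usize];
    }
    out
}

/// Assumed repair strategy: permute as vpermb does, then zero every lane
/// whose index was >= 32. This models a zero-masked permute.
fn vpermb32_zeroing_model(bytes: [u8; 32], idxs: [u8; 32]) -> [u8; 32] {
    let permuted = vpermb32_model(bytes, idxs);
    let mut out = [0u8; 32];
    for i in 0..32 {
        if idxs[i] < 32 {
            out[i] = permuted[i];
        }
    }
    out
}
```

In this model, an index of 32 makes vpermb32_model return element 0 of the input, whereas swizzle_dyn_model returns 0 for that lane. vpermb32_zeroing_model agrees with swizzle_dyn_model for every index.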

@calebzulawski (Member)

Looks good to me! Thanks! FYI @workingjubilee

@calebzulawski calebzulawski merged commit f6519c5 into rust-lang:master Aug 27, 2024
57 checks passed
@workingjubilee (Member)

Thanks!
