Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix build options for native test lib to not enable avx-512 #61229

Merged
merged 2 commits into from
Nov 5, 2021

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Nov 4, 2021

Just enable AVX2, so that we can run the lib on any of our test machines.

Should fix intermittent failures we're seeing in Interop/PInvoke/Generics/GenericsTest on linux x64, where the core dumps I've looked at show machines faulting on an AVX-512 instruction in libGenericsNative!GetVector128BOut:

00007f2f`30b98b20 c5f990442450    kmovb   k0,byte ptr [rsp+50h]

Presumably some of our test machines can't handle these instructions.

…-512

Just enable AVX2, so that we can run the lib on any of our test machines.
@ghost
Copy link

ghost commented Nov 4, 2021

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

Just enable AVX2, so that we can run the lib on any of our test machines.

Should fix intermittent failures we're seeing in Interop/PInvoke/Generics/GenericsTest on linux x64.

Author: AndyAyersMS
Assignees: -
Labels:

area-Infrastructure-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

Not sure if this will alters what the test does, but we need to generate code more predictably.

cc @krwq @elinor-fung @jkoritzinsky @AaronRobinsonMSFT

@AndyAyersMS
Copy link
Member Author

Pinging a few more folks as it would be nice to get this merged (or find the "correct" fix).

@tannergooding @agocke @jkotas

# We need -march=native so we can detect if AVX2 is present.
# ARM does not like that option too and it make no sense to have this detection there.
add_compile_options(-march=native)
add_compile_options(-mavx2)
Copy link
Member

@tannergooding tannergooding Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may still fail on some hardware. Notably, I believe we have some Mac Minis that support AVX but not AVX2 AFAIR.

For GCC/Clang, I believe we can use __attribute__((target("avx2"))) instead to opt specific methods into AVX2 support: https://godbolt.org/z/sq9zv119Y

  • The sample shows that GetZero1 has no error but GetZero2 does. Commenting out GetZero2 shows that GetZero1 successfully compiles despite using AVX2 instructions due to the annotation.

The C# side should already be guarding and therefore preventing the method from being called if Avx2.IsSupported returns false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we could just use -mavx perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's still potentially problematic and will prevent users who are on older hardware from running the tests.

I don't think it would be complex to do the find/replace here either. The files that use __m256/__m256i are well defined and the pattern of the names is consistent, so something like STDMETHODCALLTYPE ([a-zA-Z]*256[a-zA-Z]*)\( with STDMETHODCALLTYPE ENABLE_AVX $1( or similar should work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're seeing a lot of CI failures from this so would be nice to get a simple fix in place soon.

If we update the native side to always use AVX2, we'd also need to modify the C# side to guard things with IsSupported, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're seeing a lot of CI failures from this so would be nice to get a simple fix in place soon.

I can push up the small fix to the CPP files.

If we update the native side to always use AVX

We can pick AVX2 (__attribute__((target("avx2")))) or AVX (__attribute__((target("avx")))) depending on what's actually needed and it only impacts the methods that we mark with the attribute. The other methods continue targeting SSE2 (as if you didn't specify -m* at all).

we'd also need to modify the C# side to guard things with IsSupported, right?

The C# side is already guarding these P/Invokes with the relevant IsSupported

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C# side is already guarding these P/Invokes with the relevant IsSupported

I believe you, but I don't see where ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C# side is already guarding these P/Invokes with the relevant IsSupported

Hmm, actually its apparently not. Its not a problem today because P/Invokes involving Vector64/128/256 by value are blocked and so the code in question isn't executed.

Given GCC/Clang don't allow methods to be compiled using __m256, __m256d, or __m256i` without targeting AVX, we're always going to have a "problem" here once they do get enabled.

Copy link
Member

@tannergooding tannergooding Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

patch.txt

Either way, you can download the above and run git apply patch.txt and it should contain all the changes needed to support a new ENABLE_AVX macro and putting it on the right methods so GCC/Clang stop complaining.

We'll have to log an issue and consider how P/Invoke for Vector256<T> where AVX2 isn't supported works before enabling it in the future (its blocked today due to the Windows x64 calling convention bug #9578; it works elsewhere).

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Nov 5, 2021

Still think we should merge this now and do the more general thing later. My tests and @krwq's are all green.

@tannergooding
Copy link
Member

Still think we should merge this now and do the more general thing later. My tests and @krwq's are all green.

The change works, so LGTM. I can just put up my proposed alternative (#61229 (comment)) in a PR just after this is merged.

@AndyAyersMS AndyAyersMS merged commit 1c7100b into dotnet:main Nov 5, 2021
@AndyAyersMS AndyAyersMS deleted the FixInteropLibLLVMx64ArchOption branch November 5, 2021 18:54
@agocke
Copy link
Member

agocke commented Nov 5, 2021

Thanks for fixing this -- I'm not quite sure what it does though. Does this mean we're now using avx instead of avx2? And the build succeeds because all the build machines we're using support avx, but not necessarily avx2?

@tannergooding
Copy link
Member

Thanks for fixing this -- I'm not quite sure what it does though. Does this mean we're now using avx instead of avx2? And the build succeeds because all the build machines we're using support avx, but not necessarily avx2?

Yeah, that's basically it. I've put up #61259 as an alternative improvement that opts only the methods that need it into AVX instead.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants