Skip to content

Disable sqrtf vectorization for GCC 15 #18975

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kmilos
Copy link
Contributor

@kmilos kmilos commented Jun 21, 2025

Alternative (more maintainable?) workaround for #18729

Copy link
Collaborator

@ralfbrown ralfbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the Bugzilla thread linked in the other issue, I think we should just disable the declarations for all of the functions under GCC, as only removing the declartion for the one currently causing problems leaves latent issus for the future.

@jenshannoschwalm
Copy link
Collaborator

I fully agree with @ralfbrown

@kmilos
Copy link
Contributor Author

kmilos commented Jun 21, 2025

Sure. This PR is just to undo the previous workaround and have a better/cleaner starting point. As mentioned in the bug report, someone also needs to check what's hapenning w/ Clang+glibc as it also defines __GNUC__.

@TurboGit TurboGit added this to the 5.2.1 milestone Jun 21, 2025
@TurboGit TurboGit added bugfix pull request fixing a bug priority: high core features are broken and not usable at all, software crashes release notes: pending labels Jun 21, 2025
@TurboGit
Copy link
Member

@jenshannoschwalm @ralfbrown @kmilos
Are you all ok for having this in 5.2.1 and move to the full fix for 5.4? Any other proposal?

@jenshannoschwalm
Copy link
Collaborator

I would prefer @ralfbrown suggestion in #18729 leaving optimizing to whats available.

#if defined(_OPENMP) && !defined(_WIN32) && !defined(__GNUC__)
// GCC has declarations in standard headers if and only if the architecture and compile options support vectorization, so
// don't try to override that here - causes dynamic-link errors at startup.

@kmilos
Copy link
Contributor Author

kmilos commented Jun 23, 2025

// GCC has declarations ...

Clang also defines __GNUC__

Again, the original bug seems to be glibc specific, not GCC specific. On top of that, there is musl, BSD libc, etc., and all those combine w/ GCC and Clang. I cannot be 100% confident what you propose is the "final" fix, as I don't have access to all of those platforms, nor the time to test all. If you and @ralfbrown are, please feel free to follow up w/ direct code suggestions, commits to a fork of my branch, or a subsequent PR (my preference after merging this one).

@kmilos
Copy link
Contributor Author

kmilos commented Jun 23, 2025

FWIW, these are available in libmvec.so shipped w/ glibc on Fedora 42:

_ZGVdN8vv_atan2f
_ZGVcN8v_sinhf
_ZGVdN4v_atan
_ZGVbN4v_acosf
_ZGVcN8v_erfcf
_ZGVdN8v_sinf
_ZGVdN8vv_powf
_ZGVcN8v_expf
_ZGVeN8v_tan
_ZGVbN2v_exp
_ZGVeN16v_sinf
_ZGVdN4v_log
_ZGVcN8v_tanhf
_ZGVdN8v_log2f
_ZGVbN4v_erff
_ZGVcN8v_log10f
_ZGVbN2vvv_sincos
_ZGVdN8v_log1pf
_ZGVbN2vv_pow
_ZGVcN4vvv_sincos
_ZGVdN4v_asin
_ZGVeN8v_erf
_ZGVbN4v_cosf
_ZGVcN4vv_pow
_ZGVcN4v_atanh
_ZGVeN16v_expm1f
_ZGVeN16v_log2f
_ZGVeN16vvv_sincosf
_ZGVdN8v_erff
_ZGVeN8v_sin
_ZGVcN8v_atanf
_ZGVdN8v_cbrtf
_ZGVeN16v_erff
_ZGVdN8v_log10f
_ZGVdN4v_cos
_ZGVcN4v_expm1
_ZGVbN4v_expf
_ZGVdN8v_exp2f
_ZGVdN8v_cosf
_ZGVeN8vv_pow
_ZGVeN16v_cosf
_ZGVeN8vv_atan2
_ZGVcN4v_exp
_ZGVbN4vv_powf
_ZGVeN16v_cbrtf
_ZGVbN2v_atanh
_ZGVbN2v_log
GLIBC_2.22
_ZGVdN8v_expf
_ZGVbN4v_log2f
_ZGVeN16v_exp2f
_ZGVeN16v_expf
_ZGVbN2v_expm1
_ZGVbN2vv_atan2
_ZGVbN2v_log2
_ZGVeN8v_asinh
_ZGVeN16v_asinhf
_ZGVeN16vv_atan2f
_ZGVcN4v_log2
_ZGVbN4vv_atan2f
_ZGVdN4vv_pow
_ZGVbN4v_cbrtf
_ZGVbN2v_cos
_ZGVeN16v_atanhf
GLIBC_2.35
_ZGVeN8v_log2
_ZGVbN4v_exp2f
_ZGVeN8v_log10
_ZGVcN8vvv_sincosf
_ZGVcN8v_acoshf
_ZGVcN4v_log
_ZGVbN4v_exp10f
_ZGVeN8v_acosh
_ZGVeN16v_log1pf
_ZGVcN4vv_hypot
_ZGVcN4v_asinh
_ZGVcN8v_asinf
_ZGVeN8v_exp10
_ZGVdN8v_acoshf
_ZGVdN4v_log2
_ZGVcN4v_cos
_ZGVeN16v_log10f
_ZGVdN4v_tan
_ZGVdN4vv_hypot
_ZGVbN4v_expm1f
_ZGVbN2v_asinh
_ZGVcN4v_log10
_ZGVcN8v_coshf
_ZGVeN8v_log1p
_ZGVdN4v_erf
_ZGVdN8v_sinhf
_ZGVcN4v_acosh
_ZGVdN8v_erfcf
_ZGVbN2v_tanh
_ZGVcN8v_acosf
_ZGVdN4v_sin
_ZGVcN4v_tanh
_ZGVbN2v_log10
_ZGVdN8v_tanhf
_ZGVcN4v_exp10
_ZGVeN16v_sinhf
_ZGVeN16v_erfcf
_ZGVbN2v_acosh
_ZGVbN2v_tan
_ZGVbN4v_asinhf
_ZGVeN8v_tanh
_ZGVcN8v_logf
_ZGVdN4v_atanh
_ZGVeN16v_tanhf
_ZGVcN8vv_hypotf
_ZGVdN8v_atanf
_ZGVbN2v_erf
_ZGVbN2v_exp10
_ZGVdN4v_expm1
_ZGVbN4v_atanhf
_ZGVcN4v_log1p
_ZGVeN8v_exp
_ZGVdN8vv_hypotf
_ZGVbN4v_sinhf
_ZGVbN4v_erfcf
_ZGVbN2v_sin
_ZGVbN2v_acos
_ZGVeN16v_atanf
_ZGVbN2v_cbrt
_ZGVcN4v_acos
_ZGVbN4v_tanhf
_ZGVcN8vv_powf
_ZGVeN16v_acoshf
_ZGVcN4v_cbrt
_ZGVdN4v_tanh
_ZGVbN4v_logf
_ZGVbN2v_log1p
_ZGVbN4v_log1pf
_ZGVcN8v_log2f
_ZGVcN4v_tan
_ZGVcN8v_tanf
_ZGVeN8v_acos
_ZGVeN8v_cbrt
_ZGVdN8v_logf
_ZGVeN16vv_powf
_ZGVbN4v_atanf
_ZGVeN16v_logf
_ZGVcN4v_erf
_ZGVbN4v_log10f
_ZGVcN8v_exp10f
_ZGVcN8v_cbrtf
_ZGVbN2v_sinh
_ZGVcN4v_sin
_ZGVeN8v_log
_ZGVcN4vv_atan2
_ZGVcN8v_exp2f
_ZGVbN2v_exp2
_ZGVcN4v_sinh
_ZGVcN4v_exp2
_ZGVbN4v_tanf
_ZGVdN4v_acos
_ZGVbN2v_erfc
_ZGVdN8v_exp10f
_ZGVdN4v_cbrt
_ZGVeN8v_sinh
_ZGVdN4v_asinh
_ZGVeN16vv_hypotf
_ZGVdN4vv_atan2
_ZGVeN8vv_hypot
_ZGVcN4v_erfc
_ZGVeN8v_cos
_ZGVdN8v_asinf
_ZGVeN8v_exp2
_ZGVcN8v_expm1f
_ZGVbN4vv_hypotf
_ZGVdN8v_tanf
_ZGVbN2v_cosh
_ZGVdN4vvv_sincos
_ZGVeN16v_tanf
_ZGVbN2vv_hypot
_ZGVcN4v_cosh
_ZGVeN8v_erfc
_ZGVeN16v_asinf
_ZGVdN4v_log10
_ZGVdN8v_coshf
_ZGVbN2v_atan
_ZGVdN8v_expm1f
_ZGVdN4v_sinh
_ZGVeN8v_cosh
_ZGVcN4v_atan
_ZGVdN4v_acosh
_ZGVdN4v_exp2
_ZGVdN8v_acosf
_ZGVeN16v_coshf
_ZGVcN8v_sinf
_ZGVcN8v_asinhf
_ZGVdN8vvv_sincosf
_ZGVbN4v_asinf
_ZGVbN2v_asin
_ZGVdN4v_exp10
_ZGVbN4v_acoshf
_ZGVdN4v_exp
_ZGVeN8v_atan
_ZGVdN4v_erfc
_ZGVeN16v_acosf
_ZGVcN8v_atanhf
_ZGVcN4v_asin
_ZGVeN8vvv_sincos
_ZGVbN4vvv_sincosf
_ZGVdN8v_asinhf
_ZGVdN4v_cosh
_ZGVcN8v_erff
_ZGVeN8v_atanh
_ZGVbN4v_coshf
_ZGVcN8vv_atan2f
_ZGVdN4v_log1p
_ZGVeN8v_asin
_ZGVbN4v_sinf
_ZGVcN8v_cosf
_ZGVeN8v_expm1
_ZGVdN8v_atanhf
_ZGVeN16v_exp10f
_ZGVcN8v_log1pf

fabsf needs to be excluded as well...

@jenshannoschwalm
Copy link
Collaborator

I am absolutely no compiler expert but from what i understood we should trust the compiler using available functions for vectorizing if possible. Not rying to enforce something not being there, otherwise lot's of secific test cases.
In other words, gcc should not be treated different from clang. Is this understanding wrong?

@kmilos
Copy link
Contributor Author

kmilos commented Jun 23, 2025

I am absolutely no compiler expert but from what i understood we should trust the compiler using available functions for vectorizing if possible.

I guess, because it should know what C lib/math lib it's built with?

gcc should not be treated different from clang. Is this understanding wrong?

I suspect they are not able to auto-vectorize everything identically, so that's why I say I'm not 100% sure.

@jenshannoschwalm
Copy link
Collaborator

Not sure either but it's likely the very best guess?

@kmilos
Copy link
Contributor Author

kmilos commented Jun 23, 2025

it's likely the very best guess?

For me quite the opposite - as we've learned even the same compiler vectorizes differently between its versions, and also depending on (unknown) options..

@ralfbrown
Copy link
Collaborator

For me quite the opposite - as we've learned even the same compiler vectorizes differently between its versions, and also depending on (unknown) options..

Which is precisely why we shouldn't try to force the use of vectorized stdlib functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix pull request fixing a bug priority: high core features are broken and not usable at all, software crashes release notes: pending
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants