Build the x86 STL with /arch:SSE2 instead of /arch:IA32 #4741

Merged: 3 commits, Jun 21, 2024


StephanTLavavej
Member

Fixes #3118. Fixes #3922. See /arch (x86) on Microsoft Learn.

On x86, the STL (and indeed the entire VCRedist) was historically built with /arch:IA32 because it had to be capable of running on ancient OSes and potato chips. Now, Win7 / Server 2008 R2 are unsupported and no longer receiving security updates - but before they reached end-of-life, even they were patched to require SSE2. That was KB4103718 in May 2018, over 6 years ago. (Note: I am well aware of the single exception that paid security updates for the highly obscure Windows Embedded POSReady 7 will end in Oct 2024. More on that in another PR, but the point here is that even Windows 7 requires SSE2 now.)

The STL can now begin assuming unconditional support for SSE2. The compiler now defaults to /arch:SSE2, so all we need to do is remove /arch:IA32.

Why make this change? It slightly simplifies our build system and may slightly improve performance (although I don't expect it to be observable, so the PR label is honorary). It also means that our separately compiled code will be exercising the same compiler codepaths used by the vast majority of x86 builds everywhere. If the status quo were reversed and we were currently building with /arch:SSE2, we would never want to change to /arch:IA32.

This affects the VCRedist, but (1) VS 2022 17.12 will be an "unlocked" long-term support release, and (2) there are no coordinated header changes, so we don't need to worry here.

Note that although we can now assume that SSE2 is unconditionally available (as it has always been for x64), we aren't taking advantage of that in manually vectorized algorithms. See #4536 - attempting to maintain distinct codepaths for SSE2 and SSE4.2 was extremely difficult and we no longer take that risk.

We can also drop test coverage that disables SSE2 (in a partial, simulated way), because we'll never run on such processors.
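For context on what "such processors" means in practice, here is a minimal sketch of a runtime SSE2 check. It is not the test coverage being removed and not code from the STL; the function name `cpu_reports_sse2` is hypothetical. It relies on the documented CPUID convention that leaf 1 reports SSE2 in bit 26 of EDX, using `__cpuid` from `<intrin.h>` on MSVC and `__get_cpuid` from `<cpuid.h>` on GCC and Clang.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>    // __get_cpuid
#endif

// Hypothetical helper: returns true if the processor reports SSE2 support.
// Returns false on non-x86 targets, where the question does not apply.
bool cpu_reports_sse2() {
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4] = {};              // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);              // leaf 1: processor info and feature bits
    return (regs[3] & (1 << 26)) != 0;   // EDX bit 26 = SSE2
#elif defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;              // CPUID leaf 1 not available
    }
    return (edx & (1u << 26)) != 0;      // EDX bit 26 = SSE2
#else
    return false;
#endif
}
```

On any x64 processor this returns true, since SSE2 is part of the x64 baseline. After this change, x86 code linked against the separately compiled STL can rely on the same guarantee without performing such a check.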

Finally, I don't think we need to bother testing GH_000935_complex_numerical_accuracy with /arch:IA32. The STL's headers aren't blocking the option, so while users must be running SSE2-capable processors, they can still limit their own codegen to IA32 (unless and until the compiler deprecates and removes the option, and I am aware of no such plans). However, I think this option is sufficiently obscure that we don't need to bother testing it, and we haven't had any bugs involving it either.

StephanTLavavej merged commit 881bcad into microsoft:main on Jun 21, 2024
39 checks passed
StephanTLavavej deleted the ia32 branch on June 21, 2024, 19:09
Labels
performance Must go faster
Development

Successfully merging this pull request may close these issues.

Should we require SSE2?
Support for CPUs which do not have SSE2 extensions?