-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RyuJIT does not optimize multiple integer equality comparisons #81668
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak Issue DetailsDescriptionCurrently, RyuJIT does not optimize a sequence of integer equality comparisons all leading to the same path. static bool IsVowel(char c)
{
return c == 'a' ||
c == 'e' ||
c == 'i' ||
c == 'o' ||
c == 'u' ||
c == 'A' ||
c == 'E' ||
c == 'I' ||
c == 'O' ||
c == 'U';
} Today, that will be compiled to a long sequence of instructions, doing each comparison and jumping if equal. push rbp
mov rbp, rsp
movzx rax, di
cmp eax, 97
je SHORT G_M5371_IG05
cmp eax, 101
je SHORT G_M5371_IG05
cmp eax, 105
je SHORT G_M5371_IG05
cmp eax, 111
je SHORT G_M5371_IG05
cmp eax, 117
je SHORT G_M5371_IG05
cmp eax, 65
je SHORT G_M5371_IG05
cmp eax, 69
je SHORT G_M5371_IG05
cmp eax, 73
je SHORT G_M5371_IG05
cmp eax, 79
je SHORT G_M5371_IG05
cmp eax, 85
sete al
movzx rax, al
pop rbp
ret
G_M5371_IG05:
mov eax, 1
pop rbp
ret However there is an optimization that native compilers will usually do here. Because the values being compared here are not too far apart, it's possible to construct a bit mask where the bit corresponding to the number of each vowel character is set to 1. Then one can just subtract the value being checked ( This is the code that GCC produces using the technique above for similar code written in C, with O1 optimization level: lea ecx, [rdi-65]
cmp cl, 52
ja .L3
movabs rax, 4575140898685201
shr rax, cl
and eax, 1
ret
.L3:
mov eax, 0
ret It's worth noting that it produces slightly different code at O2 (using Comparison using Godbolt compiler explorer: https://godbolt.org/z/1efMGq3rx This pattern is pretty common, so it would be nice to optimize it. For example, when working with textures, I often need to write code like: if (format == TextureFormat.D24UnormS8Uint ||
format == TextureFormat.D32FloatS8Uint ||
format == TextureFormat.D24UnormX8 ||
format == TextureFormat.D16Unorm ||
format == TextureFormat.D32Float)
{
// Do something with the depth texture...
}
else
{
// Do something with the color texture...
} Configuration.NET 7.0.100, AFAIK it affects all platforms. Regression?No, to my knowledge, this optimization was never implemented on .NET JIT compiler.
|
Duplicates #8418 NOTE: this is probably easier to implement in Roslyn/post-IL process to emit CEE_SWITCH opcode which is lowered into a bit-test or jump-table by JIT. But we can also do it in JIT at some point |
Sorry, I did not know how this optimization is called, so I didn't know what exactly to search for. static bool IsVowelSwitch(char c)
{
switch (c)
{
case 'a':
case 'e':
case 'i':
case 'o':
case 'u':
case 'A':
case 'E':
case 'I':
case 'O':
case 'U':
return true;
}
return false;
} Looking at the IL, it seems that Roslyn is generating those comparisons, I wonder why. |
Roslyn is a bit conservative for switch opcode - if the input values have too big gaps (in value) - it gives up. E.g. this one works: https://sharplab.io/#v2:C4LglgNgPgAgTARgLACgYGYAE9MGFMDeqmJ2CAbJgEYD2NEmAkgM4BqNA7gKYQDKHYYAGMAFgApRAQwBOmIQEpipIilJrMzAcJGYJi1epIrDhoZOZdMAcklWQSk6TMXrVOw8dzzlq0PcHPLxcrABN/QJJnHy5wiKjrEVjA+KswJIiSGAB2TGBpAFcuAG4PNQBfVFLSbMwAM0kICxKAzAqUMqA=== |
Closed as a dup of #8418 |
Description
Currently, RyuJIT does not optimize a sequence of integer equality comparisons all leading to the same path.
For example, let's say one wants to implement a function that determines if a given
char
is a vowel, and returnbool
.One could naively implement it like so:
Today, that will be compiled to a long sequence of instructions, doing each comparison and jumping if equal.
However there is an optimization that native compilers will usually do here. Because the values being compared here are not too far apart, it's possible to construct a bit mask where the bit corresponding to the number of each vowel character is set to 1. Then one can just subtract the value being checked (
c
) by the lowest possible value (in this caseA
which is the vowel which the lowest value). If the result is negative, we already know it's out of range, and we can return false. If it's positive, and lower or equal to the highest value (in this caseu
), we can then check if the respective bit is set on the mask. If it's set we return true, otherwise false.This is the code that GCC produces using the technique above for similar code written in C, with O1 optimization level:
It's worth noting that it produces slightly different code at O2 (using
bt
instructions instead of theshr
andand
).Other compilers like clang and MSVC produces similar code with optimizations enabled.
Comparison using Godbolt compiler explorer: https://godbolt.org/z/1efMGq3rx
RyuJIT codegen on Godbolt compiler explorer: https://godbolt.org/z/eqxnPaKGr
Above I also included a
IsVowelOptimized
function where I manually optimized it using the technique above, which produces more or less the ideal code, which I would expect to be produced for theIsVowel
function.This pattern is pretty common, so it would be nice to optimize it. For example, when working with textures, I often need to write code like:
Configuration
.NET 7.0.100, AFAIK it affects all platforms.
Regression?
No, to my knowledge, this optimization was never implemented on .NET JIT compiler.
The text was updated successfully, but these errors were encountered: