Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider detecting and optimizing common range check patterns #13347

Closed
VSadov opened this issue Jun 21, 2017 · 14 comments
Closed

Consider detecting and optimizing common range check patterns #13347

VSadov opened this issue Jun 21, 2017 · 14 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Comments

@VSadov
Copy link
Member

VSadov commented Jun 21, 2017

It is very common to see code such as:
(num >= '0' && num <= '9')

The efficient way to emit such checks is:
(uint)(num - '0') <= ('9' '- '0')

That often comes up in code reviews (see:dotnet/corefxlab#1616 )

It should not be too hard to handle this kind of strength reduction in th ecompiler.

@VSadov VSadov changed the title Consider detecting and optimizing common range check paterns Consider detecting and optimizing common range check patterns Jun 21, 2017
@jkotas
Copy link
Member

jkotas commented Jun 21, 2017

Or do this in the dedicated IL optimizer: dotnet/roslyn#15929

@sharwell
Copy link
Member

sharwell commented Jun 21, 2017

If you have gaps in the range but the total span can fit in a mask for the integer type, you can use this (showing the case for [0-46-9]):

((1 << (num - '0')) & 0x3DF) != 0

We used this in antlr/antlr4#89.

@mikedn
Copy link
Contributor

mikedn commented Jun 22, 2017

If you have gaps in the range but the total span can fit in a mask for the integer type

And of course, it works well if it fits in a long and you're running on 64 bit. The trouble with doing IL optimizations is that you can't be sure what CPU the code will end up running on.

@alrz
Copy link

alrz commented Jun 24, 2017

Is this really going to be in the compiler or we should regex the heck out of the source?

@mikedn
Copy link
Contributor

mikedn commented Jun 24, 2017

we should regex the heck out of the source?

LOL. IMO it should be in the JIT compiler.

@alrz
Copy link

alrz commented Jun 25, 2017

It's interesting that gcc does this no matter how you write it,

bool f(char c) {
   switch (c) { case '0': case '1': return true; }
   return false;
}
bool f(char c) {
   return c == '0' || c == '1';
}

try on https://godbolt.org.

@mikedn
Copy link
Contributor

mikedn commented Jun 25, 2017

It's interesting that gcc does this no matter how you write it,

Yes, and all other C/C++ compilers. These optimizations (range check, the sparse range check mentioned by @sharwell and some forms of switches like the one in your example) work together. The compiler is supposed to recognize all these is var x in set of constants/ranges patterns and generate efficient code.

@gafter
Copy link
Member

gafter commented Nov 19, 2017

We are considering adding a range pattern, e.g. (num is '0'..'9'). If we do that, we would generate "optimized" code for the range pattern from the start.

@GSPP
Copy link

GSPP commented Aug 30, 2019

MSVC can emit a bit test x86 instruction for this. That would be available to the JIT or as an intrinsic to the C# compiler (assuming that it is OK to take a dependency on something like that).

@mikedn
Copy link
Contributor

mikedn commented Aug 30, 2019

FWIW the JIT already emits BT for certain switch instructions:

            switch (i)
            {
            case 3:
            case 5:
            case 8:
            case 9:
            case 12:
            case 13:
                Console.WriteLine("BT was here!");
                break;
            default:
                Console.WriteLine("BT was not here!");
                break;
            }

generates:

G_M15779_IG02:
       83C1FD               add      ecx, -3
       83F90A               cmp      ecx, 10
       7722                 ja       SHORT G_M15779_IG05
       B865060000           mov      eax, 0x665
       0FA3C8               bt       eax, ecx
       7318                 jae      SHORT G_M15779_IG05
G_M15779_IG03:
       48B99831FA1D62020000 mov      rcx, 0x2621DFA3198
       488B09               mov      rcx, gword ptr [rcx]
       E85FFCFFFF           call     System.Console:WriteLine(ref)
       90                   nop
G_M15779_IG04:
       4883C428             add      rsp, 40
       C3                   ret
G_M15779_IG05:
       48B9A031FA1D62020000 mov      rcx, 0x2621DFA31A0
       488B09               mov      rcx, gword ptr [rcx]
       E847FCFFFF           call     System.Console:WriteLine(ref)
       90                   nop
G_M15779_IG06:

@gafter
Copy link
Member

gafter commented Aug 31, 2019

Moving to the coreclr so it can be done for all languages.

@gafter gafter transferred this issue from dotnet/roslyn Aug 31, 2019
@mikedn
Copy link
Contributor

mikedn commented Sep 2, 2019

Duplicate of #8418

@BruceForstall
Copy link
Member

Closing as dup

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@EgorBo
Copy link
Member

EgorBo commented Apr 4, 2020

@gafter @BruceForstall actually it makes sense to do in Roslyn too

static bool Foo1(char num) => (num >= '0' && num <= '9');

RyuJIT imports it as 3 basic-blocks:

*  JTRUE     void  
\--*  LT        int   
   +--*  LCL_VAR   ushort V00 arg0         
   \--*  CNS_INT   int    48


*  RETURN    int   
\--*  CNS_INT   int    0


*  RETURN    int   
\--*  EQ        int   
   +--*  GT        int   
   |  +--*  LCL_VAR   ushort V00 arg0         
   |  \--*  CNS_INT   int    57
   \--*  CNS_INT   int    0

And if you optimize it in roslyn it will be a single-bb during RyuJIT import phase, e.g.:

static bool Foo2(char num) => (uint)(num - '0') <= ('9' - '0');
*  RETURN    int   
\--*  EQ        int   
   +--*  GT        int   
   |  +--*  SUB       int   
   |  |  +--*  LCL_VAR   ushort V00 arg0         
   |  |  \--*  CNS_INT   int    48
   |  \--*  CNS_INT   int    9
   \--*  CNS_INT   int    0

RyuJIT's inliner simply gives up if a method has more than 5 basic-blocks.

Just check this example: EgorBo@b7f15b7

Also it'd make IL code (= dll size) smaller as a bonus.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

9 participants