Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize division by constant after AND or RSZ #55778

Merged
merged 2 commits into from
Oct 4, 2021
Merged

Conversation

pentp
Copy link
Contributor

@pentp pentp commented Jul 15, 2021

If the dividend operand is AND or RSZ with a constant then the number of input bits can be reduced which helps generate more optimal magic numbers.
Also adopted a couple more functions to use unsigned division.

Diff example:

 G_M59169_IG02:
        48B8FFFFFFFFFFFFFF3F mov      rax, 0x3FFFFFFFFFFFFFFF
        488BD0               mov      rdx, rax
        482311               and      rdx, qword ptr [rcx]
        48B94B598638D6C56D34 mov      rcx, 0x346DC5D63886594B
        488BC2               mov      rax, rdx
        48F7E1               mul      rdx:rax, rcx
        488BCA               mov      rcx, rdx
        48C1E90B             shr      rcx, 11
-       48BACFF753E3A59BC420 mov      rdx, 0x20C49BA5E353F7CF
+       48BAF0A7C64B37894100 mov      rdx, 0x4189374BC6A7F0
        488BC1               mov      rax, rcx
-       48C1E803             shr      rax, 3
        48F7E2               mul      rdx:rax, rdx
-       48C1EA04             shr      rdx, 4
        4869C2E8030000       imul     rax, rdx, 0x3E8
        482BC8               sub      rcx, rax
        8BC1                 mov      eax, ecx
-						;; bbWeight=1    PerfScore 13.75
+						;; bbWeight=1    PerfScore 12.75
 G_M59169_IG03:
        C3                   ret      
 						;; bbWeight=1    PerfScore 1.00
 
-; Total bytes of code 76, prolog size 0, PerfScore 22.35, instruction count 17, allocated bytes for code 76 (MethodHash=439618de) for method DateTime:get_Millisecond():int:this
+; Total bytes of code 68, prolog size 0, PerfScore 20.55, instruction count 15, allocated bytes for code 68 (MethodHash=439618de) for method DateTime:get_Millisecond():int:this

@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@marek-safar marek-safar added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 16, 2021
@pentp
Copy link
Contributor Author

pentp commented Jul 28, 2021

Moved inlining changes to separate PR #56466

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks good. Thank you!
Did you run superpmi and can you share the diff numbers across benchmarks/libraries/asp?

src/coreclr/jit/lower.cpp Show resolved Hide resolved
src/coreclr/jit/utils.h Outdated Show resolved Hide resolved
@@ -28,7 +28,7 @@ namespace System
// Maps to December 31 year 9999. The value calculated from "new DateTime(9999, 12, 31).Ticks / TimeSpan.TicksPerDay"
private const int MaxDayNumber = 3_652_058;

private static int DayNumberFromDateTime(DateTime dt) => (int)(dt.Ticks / TimeSpan.TicksPerDay);
private static int DayNumberFromDateTime(DateTime dt) => (int)((ulong)dt.Ticks / TimeSpan.TicksPerDay);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this cast help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It changes the division from signed to unsigned.

Copy link
Contributor

@SingleAccretion SingleAccretion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible source of reducing bits would be small types (don't think it is worth doing it now though, you could leave a TODO for someone to take a second look if/after #55186 is merged).

src/coreclr/jit/lower.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/lower.cpp Show resolved Hide resolved
@pentp
Copy link
Contributor Author

pentp commented Oct 4, 2021

PMI diffs for System.Private.CoreLib:

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 5563217
Total bytes of diff: 5563112
Total bytes of delta: -105 (-0,00% of base)
Total relative delta: -0,67
    diff is an improvement.
    relative diff is an improvement.
Detailed diffs for win-x64
Top method improvements (bytes):
         -16 (-0,88% of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextTransitionTimeValue():TransitionTime:this
          -9 (-0,13% of base) : System.Private.CoreLib.dasm - DateTimeFormat:FormatCustomized(DateTime,ReadOnlySpan`1,DateTimeFormatInfo,TimeSpan,StringBuilder):StringBuilder
          -8 (-9,52% of base) : System.Private.CoreLib.dasm - Calendar:GetMilliseconds(DateTime):double:this
          -8 (-10,53% of base) : System.Private.CoreLib.dasm - DateTime:get_Millisecond():int:this
          -8 (-9,30% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Millisecond():int:this
          -8 (-0,40% of base) : System.Private.CoreLib.dasm - TimeZoneInfo:TransitionTimeFromTimeZoneInformation(byref,byref,bool):bool
          -4 (-6,06% of base) : System.Private.CoreLib.dasm - Calendar:GetMinute(DateTime):int:this
          -4 (-6,06% of base) : System.Private.CoreLib.dasm - Calendar:GetSecond(DateTime):int:this
          -4 (-5,80% of base) : System.Private.CoreLib.dasm - DateTime:get_Minute():int:this
          -4 (-5,80% of base) : System.Private.CoreLib.dasm - DateTime:get_Second():int:this
          -4 (-0,54% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,int,Calendar,TimeSpan):this
          -4 (-0,44% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,int,TimeSpan):this
          -4 (-0,45% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,TimeSpan):this
          -4 (-5,06% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Minute():int:this
          -4 (-5,06% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Second():int:this
          -4 (-0,38% of base) : System.Private.CoreLib.dasm - EastAsianLunisolarCalendar:ToDateTime(int,int,int,int,int,int,int,int):DateTime:this
          -4 (-0,51% of base) : System.Private.CoreLib.dasm - GregorianCalendar:ToDateTime(int,int,int,int,int,int,int,int):DateTime:this
          -4 (-0,40% of base) : System.Private.CoreLib.dasm - HebrewCalendar:HebrewToGregorian(int,int,int,int,int,int,int):DateTime

Top method improvements (percentages):
          -8 (-10,53% of base) : System.Private.CoreLib.dasm - DateTime:get_Millisecond():int:this
          -8 (-9,52% of base) : System.Private.CoreLib.dasm - Calendar:GetMilliseconds(DateTime):double:this
          -8 (-9,30% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Millisecond():int:this
          -4 (-6,06% of base) : System.Private.CoreLib.dasm - Calendar:GetMinute(DateTime):int:this
          -4 (-6,06% of base) : System.Private.CoreLib.dasm - Calendar:GetSecond(DateTime):int:this
          -4 (-5,80% of base) : System.Private.CoreLib.dasm - DateTime:get_Minute():int:this
          -4 (-5,80% of base) : System.Private.CoreLib.dasm - DateTime:get_Second():int:this
          -4 (-5,06% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Minute():int:this
          -4 (-5,06% of base) : System.Private.CoreLib.dasm - DateTimeOffset:get_Second():int:this
         -16 (-0,88% of base) : System.Private.CoreLib.dasm - StringSerializer:GetNextTransitionTimeValue():TransitionTime:this
          -4 (-0,54% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,int,Calendar,TimeSpan):this
          -4 (-0,51% of base) : System.Private.CoreLib.dasm - GregorianCalendar:ToDateTime(int,int,int,int,int,int,int,int):DateTime:this
          -4 (-0,45% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,TimeSpan):this
          -4 (-0,44% of base) : System.Private.CoreLib.dasm - DateTimeOffset:.ctor(int,int,int,int,int,int,int,TimeSpan):this
          -4 (-0,40% of base) : System.Private.CoreLib.dasm - HebrewCalendar:HebrewToGregorian(int,int,int,int,int,int,int):DateTime
          -8 (-0,40% of base) : System.Private.CoreLib.dasm - TimeZoneInfo:TransitionTimeFromTimeZoneInformation(byref,byref,bool):bool
          -4 (-0,38% of base) : System.Private.CoreLib.dasm - EastAsianLunisolarCalendar:ToDateTime(int,int,int,int,int,int,int,int):DateTime:this
          -9 (-0,13% of base) : System.Private.CoreLib.dasm - DateTimeFormat:FormatCustomized(DateTime,ReadOnlySpan`1,DateTimeFormatInfo,TimeSpan,StringBuilder):StringBuilder

18 total methods with Code Size differences (18 improved, 0 regressed), 26172 unchanged.

@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Oct 4, 2021
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak kunalspathak merged commit 7afb662 into dotnet:main Oct 4, 2021
@pentp pentp deleted the idiv branch October 5, 2021 07:59
@ghost ghost locked as resolved and limited conversation to collaborators Nov 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants