Optimize lsra for MinOpts #96386

Merged
merged 40 commits into from
Jan 26, 2024

kunalspathak
Member

@kunalspathak kunalspathak commented Jan 1, 2024

  • Created separate *Minimal() methods and moved the code specific to enregisterLocalVars=false into them
  • Run only the REG_ORDER and REG_NUM heuristics when enregisterLocalVars=false

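A rough sketch of what register selection limited to ordering heuristics could look like. It assumes that REG_ORDER means "prefer registers by a fixed preference order" and REG_NUM means "break remaining ties by lowest register number". With a total order there are no ties left, so both collapse into one loop here. `RegMask`, `kRegOrder`, `kNoReg`, and `selectMinimal` are hypothetical names for illustration, not the JIT's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; these are not the JIT's real types.
using RegMask = std::uint64_t;
constexpr int kNoReg = -1;

// Hypothetical preference order: earlier entries are preferred.
constexpr int kRegOrder[] = {0, 1, 2, 8, 9, 10, 11, 3, 6, 7};

// Pick the first register in kRegOrder that is both a candidate and free.
// A full allocator weighs many more heuristics (spill cost, liveness, ...);
// this sketch deliberately does not.
inline int selectMinimal(RegMask candidates, RegMask busy)
{
    const RegMask freeCandidates = candidates & ~busy;
    for (int reg : kRegOrder)
    {
        assert(reg >= 0 && reg < 64);
        if ((freeCandidates & (RegMask{1} << reg)) != 0)
        {
            return reg;
        }
    }
    return kNoReg; // nothing free: the caller would have to spill
}
```

The point of such a path is that it does a bounded amount of work per reference, at the cost of sometimes picking a worse register than the full heuristics would.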
@ghost ghost assigned kunalspathak Jan 1, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 1, 2024
@ghost

ghost commented Jan 1, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: kunalspathak
Assignees: kunalspathak
Labels:

area-CodeGen-coreclr

Milestone: -

@kunalspathak
Member Author

Need to find out what is going on with arm64, but initial results are very motivating!

image

image

@kunalspathak
Member Author

I tried to investigate the windows-arm64 TP regression

image

It turns out that most of the slowdown comes from 2 method contexts of benchmarks.run.windows.arm64.checked.mch: 2999 and 15658. In those, the gains from faster register allocation are offset by the extra work the emitter has to do to produce more instructions. Below is the pin difference for just those 2 method contexts.

Base: 414060, Diff: 433023, +4.5798%

??$allocateReg@$0A@@LinearScan@@AEAA?AW4_regNumber_enum@@PEAVInterval@@PEAVRefPosition@@@Z : 24193  : +942.46% : 13.83% : +5.8429%
??$allocateRegistersForMinOpt@$0A@@LinearScan@@QEAAXXZ                                     : 22858  : NA       : 13.07% : +5.5205%
?emitOutputInstr@emitter@@IEAA_KPEAUinsGroup@@PEAUinstrDesc@1@PEAPEAE@Z                    : 8062   : +61.79%  : 4.61%  : +1.9471%
?lvaFrameAddress@Compiler@@QEAAHHPEA_N@Z                                                   : 6491   : +575.44% : 3.71%  : +1.5676%
?emitAllocAnyInstr@emitter@@AEAAPEAX_KW4emitAttr@@@Z                                       : 2664   : +45.73%  : 1.52%  : +0.6434%
?emitIns_S_R@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@HH@Z              : 2568   : +143.87% : 1.47%  : +0.6202%
?rsSpillTree@RegSet@@AEAAXW4_regNumber_enum@@PEAUGenTree@@I@Z                              : 2544   : NA       : 1.45%  : +0.6144%
?emitIns_R_S@emitter@@QEAAXW4instruction@@W4emitAttr@@W4_regNumber_enum@@HH@Z              : 2391   : +127.05% : 1.37%  : +0.5775%
?IsMultiRegNode@GenTree@@QEBA_NXZ                                                          : 1988   : +177.03% : 1.14%  : +0.4801%
?emitInsCode@emitter@@AEAAIW4instruction@@W4insFormat@1@@Z                                 : 1496   : +65.99%  : 0.86%  : +0.3613%
?idCodeSize@instrDesc@emitter@@QEBAIXZ                                                     : 1290   : +45.94%  : 0.74%  : +0.3115%
?updateMaxSpill@LinearScan@@QEAAXPEAVRefPosition@@@Z                                       : 1267   : +54.61%  : 0.72%  : +0.3060%
?resetAllRegistersState@LinearScan@@AEAAXXZ                                                : 1255   : NA       : 0.72%  : +0.3031%
?unassignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVRefPosition@@@Z                       : 1232   : +47.26%  : 0.70%  : +0.2975%
memset                                                                                     : 1218   : +44.71%  : 0.70%  : +0.2942%
?initLclVarAddr@emitLclVarAddr@@QEAAXHI@Z                                                  : 1075   : +139.97% : 0.61%  : +0.2596%
?emitSizeOfInsDsc@emitter@@AEBA_KPEAUinstrDesc@1@@Z                                        : 1056   : +106.88% : 0.60%  : +0.2550%
?genUnspillRegIfNeeded@CodeGen@@IEAAXPEAUGenTree@@@Z                                       : 996    : +101.01% : 0.57%  : +0.2405%
?insertCopyOrReload@LinearScan@@QEAAXPEAUBasicBlock@@PEAUGenTree@@IPEAVRefPosition@@@Z     : 935    : NA       : 0.53%  : +0.2258%
?emitEndCodeGen@emitter@@QEAAIPEAVCompiler@@_N11IPEAI2PEAPEAX33@Z                          : 903    : +26.23%  : 0.52%  : +0.2181%
?updateAssignedInterval@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                   : 777    : +35.41%  : 0.44%  : +0.1877%
?appendToCurIG@emitter@@IEAAXPEAUinstrDesc@1@@Z                                            : 731    : +52.18%  : 0.42%  : +0.1765%
?rsUnspillInPlace@RegSet@@AEAAPEAVTempDsc@@PEAUGenTree@@W4_regNumber_enum@@I@Z             : 711    : NA       : 0.41%  : +0.1717%
?assignPhysReg@LinearScan@@AEAAXPEAVRegRecord@@PEAVInterval@@@Z                            : 693    : +36.21%  : 0.40%  : +0.1674%
?emitFindOffset@emitter@@IEBAIPEBUinsGroup@@I@Z                                            : 672    : +72.41%  : 0.38%  : +0.1623%
?insEncodeDatasizeLS@emitter@@CAIIW4emitAttr@@@Z                                           : 598    : +120.56% : 0.34%  : +0.1444%
?tmpRlsTemp@RegSet@@QEAAXPEAVTempDsc@@@Z                                                   : 588    : NA       : 0.34%  : +0.1420%
?TryGetUse@Range@LIR@@QEAA_NPEAUGenTree@@PEAVUse@2@@Z                                      : 539    : NA       : 0.31%  : +0.1302%
?genCodeForTreeNode@CodeGen@@IEAAXPEAUGenTree@@@Z                                          : 451    : +9.59%   : 0.26%  : +0.1089%
?emitGCregDeadUpd@emitter@@QEAAXW4_regNumber_enum@@PEAE@Z                                  : 402    : +33.67%  : 0.23%  : +0.0971%
??$resolveRegisters@$0A@@LinearScan@@QEAAXXZ                                               : 396    : +4.76%   : 0.23%  : +0.0956%
?allocateMemory@ArenaAllocator@@QEAAPEAX_K@Z                                               : 368    : +3.28%   : 0.21%  : +0.0889%
?emitNewInstrCns@emitter@@AEAAPEAUinstrDesc@1@W4emitAttr@@_J@Z                             : 357    : +87.50%  : 0.20%  : +0.0862%
?tmpSlot@RegSet@@CAII@Z                                                                    : 286    : NA       : 0.16%  : +0.0691%
?setIntervalAsSpilled@LinearScan@@AEAAXPEAVInterval@@@Z                                    : 252    : NA       : 0.14%  : +0.0609%
?gcMakeRegPtrTable@GCInfo@@QEAAXPEAVGcInfoEncoder@@IIW4MakeRegPtrMode@1@PEAI@Z             : 244    : +10.95%  : 0.14%  : +0.0589%
?ins_Load@CodeGenInterface@@QEAA?AW4instruction@@W4var_types@@_N@Z                         : 237    : +179.55% : 0.14%  : +0.0572%
?lvaAllocateTemps@Compiler@@QEAAHH_N@Z                                                     : 221    : +153.47% : 0.13%  : +0.0534%
?TryGetUse@GenTree@@QEAA_NPEAU1@PEAPEAPEAU1@@Z                                             : 212    : NA       : 0.12%  : +0.0512%
?getWeight@LinearScan@@AEAANPEAVRefPosition@@@Z                                            : 189    : +10.41%  : 0.11%  : +0.0456%
?recordMaxSpill@LinearScan@@QEAAXXZ                                                        : 182    : +55.83%  : 0.10%  : +0.0440%
?UpdateLifeVar@?$TreeLifeUpdater@$00@@AEAAXPEAUGenTree@@PEAUGenTreeLclVarCommon@@@Z        : -175   : -21.74%  : 0.10%  : -0.0423%
?getMatchingConstants@LinearScan@@AEAA_K_KPEAVInterval@@PEAVRefPosition@@@Z                : -180   : -100.00% : 0.10%  : -0.0435%
?processBlockEndAllocation@LinearScan@@AEAAXPEAUBasicBlock@@@Z                             : -1896  : -100.00% : 1.08%  : -0.4579%
??$allocateRegisters@$0A@@LinearScan@@QEAAXXZ                                              : -30030 : -100.00% : 17.17% : -7.2526%
??$select@$0A@@RegisterSelection@LinearScan@@QEAA_KPEAVInterval@@PEAVRefPosition@@@Z       : -45417 : -100.00% : 25.97% : -10.9687%
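As a sanity check on how to read the listing above: the headline +4.5798% and the last column of each row appear to be plain ratios against the base total of 414060. The helper names below are hypothetical, and the reading of the last column is an inference from the numbers, not something the tool output states.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Relative change of `diff` versus `base`, in percent.
inline double percentChange(std::int64_t base, std::int64_t diff)
{
    assert(base != 0);
    return 100.0 * static_cast<double>(diff - base) / static_cast<double>(base);
}

// One function's instruction delta as a share of the base total, in percent.
inline double contributionPercent(std::int64_t delta, std::int64_t baseTotal)
{
    assert(baseTotal != 0);
    return 100.0 * static_cast<double>(delta) / static_cast<double>(baseTotal);
}
```

For example, `percentChange(414060, 433023)` reproduces the overall figure, and `contributionPercent(24193, 414060)` reproduces the last column of the allocateReg row.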

If I exclude those 2 contexts from the benchmarks.run.windows collection, then locally I do see TP gains that match those on other platforms:

image

2999 diff and 15658 diff

@kunalspathak
Member Author

There was a bug where I was almost always assigning a busy register. That made us generate a lot of spills, and emitting the spill code increased the TP cost. I fixed that in 7f29c4e. The new numbers are even better: on Arm64 in particular, we see almost 13% TP improvement. Diffs:

image

image

@kunalspathak
Member Author

Yes, so here is the breakdown:

FullOpts:

  1. Blue - All x64 platforms are showing a regression of around 17K+
  2. Green - All arm64 platforms are showing a regression of around 32K+ (acceptable because of arm64's larger code size)

MinOpts:

  1. Yellow - All arm64 platforms are showing a MinOpts regression of around 145K+
  2. Purple - The Linux/x64 platform is showing a MinOpts improvement of 464K+
image

Looking in more detail, I checked what percentage of contexts is affected in each configuration. For x64 it is around 28% (both windows/linux) and for arm64 around 15-16% (both windows/linux), so we at least see that the share of affected contexts is the same for a given architecture.

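| OS | Arch | Total contexts | Contexts with diffs | % with diffs |
|---|---|---|---|---|
| windows | x64 | 2384370 | 666648 | 28% |
| linux | x64 | 2546536 | 716232 | 28% |
| windows | arm64 | 2329306 | 366025 | 16% |
| linux | arm64 | 2495514 | 385512 | 15% |
| windows | x86 | 2244188 | 505186 | 23% |
| linux | arm | 2239583 | 671422 | 30% |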

Linux/x64 analysis

Values are mostly in rax (because of REG_ORDER), and I see that instructions are more compact with it. Here is an example from the benchmarks.run.linux collection:

-  400FB67F4C          movzx    rdi, byte  ptr [rdi+0x4C]
+ 0FB6404C             movzx    rax, byte  ptr [rax+0x4C]
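To make the size difference concrete, the encodings in the diff can be taken apart with a few helpers. This is only a sketch with hypothetical helper names. It shows that the first encoding is 5 bytes and starts with 0x40, which falls in the REX prefix range 0x40..0x4F, while the second is 4 bytes with no prefix. In both cases the ModRM byte uses the same register for the reg field and the base. Why the prefix is emitted in the rdi case is not established here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Convert a hex string such as "0FB6404C" into raw bytes.
inline std::vector<std::uint8_t> parseHex(const std::string& hex)
{
    assert(hex.size() % 2 == 0);
    std::vector<std::uint8_t> bytes;
    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        bytes.push_back(static_cast<std::uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));
    }
    return bytes;
}

// True if the byte lies in the x64 REX prefix range (0x40..0x4F).
inline bool isRexPrefix(std::uint8_t b)
{
    return (b & 0xF0) == 0x40;
}

// The three fields of an x86 ModRM byte.
struct ModRM
{
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

inline ModRM decodeModRM(std::uint8_t b)
{
    return {(b >> 6) & 3u, (b >> 3) & 7u, b & 7u};
}
```

Decoding 0x7F gives mod=1, reg=7, rm=7 (rdi with an 8-bit displacement), and 0x40 in the ModRM position of the second encoding gives mod=1, reg=0, rm=0 (rax with an 8-bit displacement).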

In some other places, since the value is mostly in rax, I see the cwde instruction used instead of movsx, and cwde is smaller in size:

image image

That could justify the improvements on linux, but if you want I can dig further.

@jakobbotsch
Member

jakobbotsch commented Jan 19, 2024

IMO, since the change is about MinOpts, it should be no-diff for FullOpts. How does the change end up affecting codegen for FullOpts in the cases where we have no tracking variables? Isn't that unexpected? And if it's expected, can the new logic be made to run only if OptimizationDisabled() && !enregisterLocalVar or something?

Edit: It makes sense to me that we'd still want the optimized heuristics for LIR temps, even when we don't have any locals to enregister, so I think we should base the use of the fast and less optimal heuristics around OptimizationDisabled() instead.

@kunalspathak
Copy link
Member Author

kunalspathak commented Jan 20, 2024

> IMO, since the change is about MinOpts, it should be no-diff for FullOpts.

This change is not just for MinOpts; it applies whenever we decide that enregistering local vars is not needed. That is fairly common in MinOpts/Tier0/MethodImplOptions.NoOptimization, but it also happens in FullOpts when there are no tracking variables.

> How does the change end up affecting codegen for FullOpts in the cases where we have no tracking variables? Isn't that unexpected?

We turn off enregisterLocalVars if we don't have any tracking variable and hence this change is affecting exactly and only those scenarios for FullOpts.

    // but we don't want to spend time on the lclVar parts of LinearScan
    // if we have no tracked locals.
    if (enregisterLocalVars && (compiler->lvaTrackedCount == 0))
    {
        enregisterLocalVars = false;
    }

> And if it's expected, can the new logic be made to run only if OptimizationDisabled() && !enregisterLocalVar or something? Edit: It makes sense to me that we'd still want the optimized heuristics for LIR temps, even when we don't have any locals to enregister, so I think we should base the use of the fast and less optimal heuristics around OptimizationDisabled() instead.

I am not sure I follow what you are suggesting.

@jakobbotsch
Member

> We turn off enregisterLocalVars if we don't have any tracking variable and hence this change is affecting exactly and only those scenarios for FullOpts.

The end result seems to be that we are regressing CQ in FullOpts for a throughput optimization. That doesn't seem right.

enregisterLocalVars does not seem like the right check to me. Sure if there are no local vars then there are also no local vars to enregister, even in FullOpts, but that doesn't mean there are no other intervals that should get good register allocation.

> I am not sure I follow what you are suggesting.

Call allocateRegistersMinimal() only if opts.OptimizationDisabled(). Optionally assert that enregisterLocalVars is false in that case. Keep allocating all intervals with the full set of heuristics in FullOpts.
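The suggested gating can be sketched as follows. This is a simplified illustration, not the JIT's code: `Options`, `AllocPath`, and `chooseAllocPath` are hypothetical stand-ins, and only the names OptimizationDisabled and enregisterLocalVars come from the discussion.

```cpp
#include <cassert>

// Hypothetical stand-in for the compiler options object.
struct Options
{
    bool optimizationDisabled;
    bool OptimizationDisabled() const { return optimizationDisabled; }
};

enum class AllocPath
{
    Minimal, // fast path with few heuristics
    Full     // full set of heuristics
};

// Key the fast path on "optimizations disabled" rather than on
// enregisterLocalVars, as proposed in the review comment.
inline AllocPath chooseAllocPath(const Options& opts, bool enregisterLocalVars)
{
    if (opts.OptimizationDisabled())
    {
        // Optional sanity check from the suggestion.
        assert(!enregisterLocalVars);
        return AllocPath::Minimal;
    }
    // FullOpts keeps the full heuristics even when there are no tracked
    // locals, so other intervals still get good register allocation.
    return AllocPath::Full;
}
```

With this shape, a FullOpts method that happens to have no tracked locals still takes the full path, which is what keeps the change no-diff for FullOpts.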

@BruceForstall
Member

If @jakobbotsch's suggestion restricts the change to MinOpts, then this change could be only about MinOpts CQ + TP. @kunalspathak, if you think a version of this change applies to FullOpts, then perhaps that could be made as a follow-up, to isolate the conversation about the FullOpts impact to that PR.

@kunalspathak
Member Author

> Call allocateRegistersMinimal() only if opts.OptimizationDisabled(). Optionally assert that enregisterLocalVars is false in that case. Keep allocating all intervals with the full set of heuristics in FullOpts.

Ah, I see. Yes, I was trying out a similar change locally to eliminate the FullOpts impact.

@kunalspathak kunalspathak changed the title from "Experiment: Optimize lsra when enregisterLocalVars is false" to "Optimize lsra for MinOpts and enregisterLocalVars is false" Jan 20, 2024
@kunalspathak
Member Author

superpmi failures are: OSError: [Errno 28] No space left on device

/azp run runtime-coreclr superpmi-replay

@kunalspathak
Member Author

/azp run runtime-coreclr superpmi-replay

Azure Pipelines successfully started running 1 pipeline(s).

@kunalspathak kunalspathak changed the title from "Optimize lsra for MinOpts and enregisterLocalVars is false" to "Optimize lsra for MinOpts" Jan 22, 2024
@kunalspathak
Member Author

image
image
image

@kunalspathak
Member Author

Assembly diffs

Assembly diffs for linux/arm64 ran on windows/x64

Diffs are based on ${\color{orange}2,491,707}$ contexts (${\color{orange}1,001,189}$ MinOpts, ${\color{orange}1,490,518}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}3,807 (0.15\%)}$

Overall (${\color{red}+148,204}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm64.checked.mch 14,270,080 +0
benchmarks.run_pgo.linux.arm64.checked.mch 83,058,256 ${\color{red}+1,980}$
benchmarks.run_tiered.linux.arm64.checked.mch 21,983,044 ${\color{red}+588}$
coreclr_tests.run.linux.arm64.checked.mch 510,170,492 ${\color{red}+144,156}$
libraries.crossgen2.linux.arm64.checked.mch 55,689,604 +0
libraries.pmi.linux.arm64.checked.mch 75,931,904 +0
libraries_tests.run.linux.arm64.Release.mch 379,668,404 ${\color{red}+1,372}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch 162,138,568 ${\color{red}+108}$
realworld.run.linux.arm64.checked.mch 15,784,696 +0
MinOpts (${\color{red}+148,204}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm64.checked.mch 218,508 +0
benchmarks.run_pgo.linux.arm64.checked.mch 26,552,600 ${\color{red}+1,980}$
benchmarks.run_tiered.linux.arm64.checked.mch 17,400,224 ${\color{red}+588}$
coreclr_tests.run.linux.arm64.checked.mch 349,264,408 ${\color{red}+144,156}$
libraries.crossgen2.linux.arm64.checked.mch 1,636 +0
libraries.pmi.linux.arm64.checked.mch 119,984 +0
libraries_tests.run.linux.arm64.Release.mch 214,980,496 ${\color{red}+1,372}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch 13,480,568 ${\color{red}+108}$
realworld.run.linux.arm64.checked.mch 575,816 +0

Assembly diffs for linux/x64 ran on windows/x64

Diffs are based on ${\color{orange}2,542,496}$ contexts (${\color{orange}1,013,312}$ MinOpts, ${\color{orange}1,529,184}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}4,042 (0.16\%)}$

Overall (${\color{green}-464,740}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.x64.checked.mch 13,063,008 ${\color{green}-5}$
benchmarks.run_pgo.linux.x64.checked.mch 64,981,453 ${\color{green}-15,309}$
benchmarks.run_tiered.linux.x64.checked.mch 17,221,305 ${\color{green}-11,296}$
coreclr_tests.run.linux.x64.checked.mch 439,635,202 ${\color{green}-246,288}$
libraries.crossgen2.linux.x64.checked.mch 38,636,779 ${\color{green}-4}$
libraries.pmi.linux.x64.checked.mch 59,927,414 ${\color{green}-13}$
libraries_tests.run.linux.x64.Release.mch 328,845,908 ${\color{green}-136,481}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch 129,732,524 ${\color{green}-54,318}$
realworld.run.linux.x64.checked.mch 13,174,701 ${\color{green}-1,026}$
smoke_tests.nativeaot.linux.x64.checked.mch 4,191,403 +0
MinOpts (${\color{green}-464,740}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.x64.checked.mch 230,051 ${\color{green}-5}$
benchmarks.run_pgo.linux.x64.checked.mch 18,706,882 ${\color{green}-15,309}$
benchmarks.run_tiered.linux.x64.checked.mch 13,644,734 ${\color{green}-11,296}$
coreclr_tests.run.linux.x64.checked.mch 310,584,524 ${\color{green}-246,288}$
libraries.crossgen2.linux.x64.checked.mch 1,202 ${\color{green}-4}$
libraries.pmi.linux.x64.checked.mch 112,870 ${\color{green}-13}$
libraries_tests.run.linux.x64.Release.mch 183,459,665 ${\color{green}-136,481}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch 10,657,089 ${\color{green}-54,318}$
realworld.run.linux.x64.checked.mch 387,256 ${\color{green}-1,026}$
smoke_tests.nativeaot.linux.x64.checked.mch 911 +0

Assembly diffs for osx/arm64 ran on windows/x64

Diffs are based on ${\color{orange}2,257,766}$ contexts (${\color{orange}928,286}$ MinOpts, ${\color{orange}1,329,480}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}2,387 (0.11\%)}$

Overall (${\color{red}+150,496}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.osx.arm64.checked.mch 34,138,564 ${\color{red}+1,480}$
benchmarks.run_tiered.osx.arm64.checked.mch 15,442,448 ${\color{red}+316}$
coreclr_tests.run.osx.arm64.checked.mch 486,102,776 ${\color{red}+148,116}$
libraries.crossgen2.osx.arm64.checked.mch 55,570,716 +0
libraries.pmi.osx.arm64.checked.mch 79,907,992 +0
libraries_tests.run.osx.arm64.Release.mch 314,762,740 ${\color{red}+476}$
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch 159,436,088 ${\color{red}+108}$
realworld.run.osx.arm64.checked.mch 15,065,616 +0
MinOpts (${\color{red}+150,496}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.osx.arm64.checked.mch 16,484,956 ${\color{red}+1,480}$
benchmarks.run_tiered.osx.arm64.checked.mch 11,499,424 ${\color{red}+316}$
coreclr_tests.run.osx.arm64.checked.mch 332,741,972 ${\color{red}+148,116}$
libraries.crossgen2.osx.arm64.checked.mch 1,628 +0
libraries.pmi.osx.arm64.checked.mch 121,128 +0
libraries_tests.run.osx.arm64.Release.mch 203,635,640 ${\color{red}+476}$
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch 13,136,596 ${\color{red}+108}$
realworld.run.osx.arm64.checked.mch 568,396 +0

Assembly diffs for windows/arm64 ran on windows/x64

Diffs are based on ${\color{orange}2,326,811}$ contexts (${\color{orange}934,686}$ MinOpts, ${\color{orange}1,392,125}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}2,495 (0.11\%)}$

Overall (${\color{red}+148,732}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.windows.arm64.checked.mch 45,619,340 ${\color{red}+1,376}$
benchmarks.run_tiered.windows.arm64.checked.mch 15,264,644 ${\color{red}+368}$
coreclr_tests.run.windows.arm64.checked.mch 496,117,708 ${\color{red}+144,188}$
libraries.crossgen2.windows.arm64.checked.mch 58,913,600 +0
libraries.pmi.windows.arm64.checked.mch 79,525,232 +0
libraries_tests.run.windows.arm64.Release.mch 319,612,456 ${\color{red}+2,688}$
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch 169,012,984 ${\color{red}+112}$
realworld.run.windows.arm64.checked.mch 15,917,176 +0
MinOpts (${\color{red}+148,732}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.windows.arm64.checked.mch 16,205,148 ${\color{red}+1,376}$
benchmarks.run_tiered.windows.arm64.checked.mch 11,172,336 ${\color{red}+368}$
coreclr_tests.run.windows.arm64.checked.mch 339,719,232 ${\color{red}+144,188}$
libraries.crossgen2.windows.arm64.checked.mch 1,636 +0
libraries.pmi.windows.arm64.checked.mch 119,984 +0
libraries_tests.run.windows.arm64.Release.mch 203,904,856 ${\color{red}+2,688}$
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch 13,136,180 ${\color{red}+112}$
realworld.run.windows.arm64.checked.mch 568,424 +0

Assembly diffs for windows/x64 ran on windows/x64

Diffs are based on ${\color{orange}2,495,240}$ contexts (${\color{orange}977,958}$ MinOpts, ${\color{orange}1,517,282}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}3,546 (0.14\%)}$

Overall (${\color{green}-4,578}$ bytes)
Collection Base size (bytes) Diff size (bytes)
aspnet.run.windows.x64.checked.mch 42,070,891 ${\color{green}-3,161}$
benchmarks.run.windows.x64.checked.mch 8,666,396 ${\color{red}+2}$
benchmarks.run_pgo.windows.x64.checked.mch 35,349,995 ${\color{green}-8,059}$
benchmarks.run_tiered.windows.x64.checked.mch 12,686,993 ${\color{green}-10,190}$
coreclr_tests.run.windows.x64.checked.mch 393,136,663 ${\color{red}+155,469}$
libraries.crossgen2.windows.x64.checked.mch 39,411,196 ${\color{green}-2}$
libraries.pmi.windows.x64.checked.mch 61,131,115 ${\color{red}+2}$
libraries_tests.run.windows.x64.Release.mch 279,572,690 ${\color{green}-126,077}$
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 133,740,839 ${\color{green}-12,568}$
realworld.run.windows.x64.checked.mch 14,188,290 ${\color{red}+6}$
smoke_tests.nativeaot.windows.x64.checked.mch 5,085,035 +0
MinOpts (${\color{green}-4,578}$ bytes)
Collection Base size (bytes) Diff size (bytes)
aspnet.run.windows.x64.checked.mch 14,658,725 ${\color{green}-3,161}$
benchmarks.run.windows.x64.checked.mch 361 ${\color{red}+2}$
benchmarks.run_pgo.windows.x64.checked.mch 14,238,248 ${\color{green}-8,059}$
benchmarks.run_tiered.windows.x64.checked.mch 9,174,699 ${\color{green}-10,190}$
coreclr_tests.run.windows.x64.checked.mch 273,514,441 ${\color{red}+155,469}$
libraries.crossgen2.windows.x64.checked.mch 1,189 ${\color{green}-2}$
libraries.pmi.windows.x64.checked.mch 113,519 ${\color{red}+2}$
libraries_tests.run.windows.x64.Release.mch 175,492,237 ${\color{green}-126,077}$
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 10,810,850 ${\color{green}-12,568}$
realworld.run.windows.x64.checked.mch 389,705 ${\color{red}+6}$
smoke_tests.nativeaot.windows.x64.checked.mch 909 +0

Details here

Assembly diffs for linux/arm ran on windows/x86

Diffs are based on ${\color{orange}2,232,466}$ contexts (${\color{orange}824,262}$ MinOpts, ${\color{orange}1,408,204}$ FullOpts).

${\color{blue}MISSED}$ contexts: ${\color{blue}77,883 (3.37\%)}$

Overall (${\color{green}-852,700}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm.checked.mch 14,172,892 ${\color{green}-42}$
benchmarks.run_pgo.linux.arm.checked.mch 58,450,272 ${\color{green}-48,252}$
benchmarks.run_tiered.linux.arm.checked.mch 17,364,030 ${\color{green}-32,012}$
coreclr_tests.run.linux.arm.checked.mch 321,914,498 ${\color{green}-199,178}$
libraries.crossgen2.linux.arm.checked.mch 37,768,318 +0
libraries.pmi.linux.arm.checked.mch 49,500,180 ${\color{green}-280}$
libraries_tests.run.linux.arm.Release.mch 240,715,714 ${\color{green}-533,380}$
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch 92,863,410 ${\color{green}-24,910}$
realworld.run.linux.arm.checked.mch 13,608,080 ${\color{green}-14,646}$
MinOpts (${\color{green}-852,700}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm.checked.mch 334,862 ${\color{green}-42}$
benchmarks.run_pgo.linux.arm.checked.mch 11,222,492 ${\color{green}-48,252}$
benchmarks.run_tiered.linux.arm.checked.mch 7,194,320 ${\color{green}-32,012}$
coreclr_tests.run.linux.arm.checked.mch 212,840,102 ${\color{green}-199,178}$
libraries.crossgen2.linux.arm.checked.mch 1,230 +0
libraries.pmi.linux.arm.checked.mch 106,504 ${\color{green}-280}$
libraries_tests.run.linux.arm.Release.mch 122,210,132 ${\color{green}-533,380}$
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch 10,081,872 ${\color{green}-24,910}$
realworld.run.linux.arm.checked.mch 449,296 ${\color{green}-14,646}$

Assembly diffs for windows/x86 ran on windows/x86

Diffs are based on ${\color{orange}2,240,380}$ contexts (${\color{orange}789,389}$ MinOpts, ${\color{orange}1,450,991}$ FullOpts).

${\color{blue}MISSED}$ contexts: base: ${\color{blue}3,847 (0.17\%)}$, diff: ${\color{blue}49,158 (2.15\%)}$

Overall (${\color{red}+199,265}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x86.checked.mch 7,037,944 ${\color{red}+2}$
benchmarks.run_pgo.windows.x86.checked.mch 43,076,375 ${\color{red}+1,814}$
benchmarks.run_tiered.windows.x86.checked.mch 8,947,008 ${\color{red}+247}$
coreclr_tests.run.windows.x86.checked.mch 305,219,044 ${\color{red}+201,221}$
libraries.crossgen2.windows.x86.checked.mch 31,623,120 ${\color{red}+3}$
libraries.pmi.windows.x86.checked.mch 48,807,890 ${\color{green}-81}$
libraries_tests.run.windows.x86.Release.mch 162,974,883 ${\color{green}-7,140}$
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch 100,565,179 ${\color{red}+3,185}$
realworld.run.windows.x86.checked.mch 11,349,106 ${\color{red}+14}$
MinOpts (${\color{red}+199,265}$ bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x86.checked.mch 279 ${\color{red}+2}$
benchmarks.run_pgo.windows.x86.checked.mch 5,966,425 ${\color{red}+1,814}$
benchmarks.run_tiered.windows.x86.checked.mch 3,848,434 ${\color{red}+247}$
coreclr_tests.run.windows.x86.checked.mch 198,156,671 ${\color{red}+201,221}$
libraries.crossgen2.windows.x86.checked.mch 1,057 ${\color{red}+3}$
libraries.pmi.windows.x86.checked.mch 95,314 ${\color{green}-81}$
libraries_tests.run.windows.x86.Release.mch 77,656,212 ${\color{green}-7,140}$
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch 7,273,446 ${\color{red}+3,185}$
realworld.run.windows.x86.checked.mch 295,700 ${\color{red}+14}$

Details here

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

Overall (${\color{green}-4.82\%}$ to ${\color{green}-0.04\%}$)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch ${\color{green}-0.09\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{green}-1.12\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{green}-4.82\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{green}-4.00\%}$
libraries.crossgen2.linux.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{green}-0.05\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{green}-2.41\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{green}-0.30\%}$
realworld.run.linux.arm64.checked.mch ${\color{green}-0.13\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{green}-0.04\%}$
MinOpts (${\color{green}-10.51\%}$ to ${\color{green}-6.50\%}$)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch ${\color{green}-8.44\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{green}-9.02\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{green}-9.30\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{green}-9.24\%}$
libraries.crossgen2.linux.arm64.checked.mch ${\color{green}-8.09\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{green}-9.87\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{green}-9.06\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{green}-8.69\%}$
realworld.run.linux.arm64.checked.mch ${\color{green}-10.51\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{green}-6.50\%}$
FullOpts (${\color{green}-0.07\%}$ to ${\color{green}-0.03\%}$)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch ${\color{green}-0.05\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{green}-0.03\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{green}-0.05\%}$
libraries.crossgen2.linux.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{green}-0.04\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{green}-0.04\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{green}-0.05\%}$
realworld.run.linux.arm64.checked.mch ${\color{green}-0.04\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{green}-0.04\%}$

Throughput diffs for linux/x64 ran on windows/x64

Overall (${\color{green}-2.72\%}$ to ${\color{green}-0.05\%}$)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch ${\color{green}-0.09\%}$
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-0.56\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-2.72\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-2.70\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-0.08\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-0.06\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-1.34\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-0.21\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-0.10\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-0.05\%}$
MinOpts (${\color{green}-6.56\%}$ to ${\color{green}-4.49\%}$)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch ${\color{green}-5.03\%}$
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-5.63\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-5.71\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-6.24\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-4.49\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-5.88\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-5.66\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-5.15\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-6.56\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-4.77\%}$
FullOpts (${\color{green}-0.08\%}$ to ${\color{green}-0.04\%}$)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch ${\color{green}-0.06\%}$
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-0.04\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-0.06\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-0.08\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-0.06\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-0.05\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-0.07\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-0.06\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-0.05\%}$

Throughput diffs for osx/arm64 ran on windows/x64

Overall (${\color{green}-4.32\%}$ to ${\color{green}-0.04\%}$)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_pgo.osx.arm64.checked.mch ${\color{green}-1.94\%}$
benchmarks.run_tiered.osx.arm64.checked.mch ${\color{green}-4.32\%}$
coreclr_tests.run.osx.arm64.checked.mch ${\color{green}-3.96\%}$
libraries.crossgen2.osx.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.osx.arm64.checked.mch ${\color{green}-0.05\%}$
libraries_tests.run.osx.arm64.Release.mch ${\color{green}-3.05\%}$
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch ${\color{green}-0.30\%}$
realworld.run.osx.arm64.checked.mch ${\color{green}-0.13\%}$
MinOpts (${\color{green}-10.55\%}$ to ${\color{green}-7.99\%}$)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch ${\color{green}-9.05\%}$
benchmarks.run_pgo.osx.arm64.checked.mch ${\color{green}-9.34\%}$
benchmarks.run_tiered.osx.arm64.checked.mch ${\color{green}-9.62\%}$
coreclr_tests.run.osx.arm64.checked.mch ${\color{green}-9.13\%}$
libraries.crossgen2.osx.arm64.checked.mch ${\color{green}-7.99\%}$
libraries.pmi.osx.arm64.checked.mch ${\color{green}-9.86\%}$
libraries_tests.run.osx.arm64.Release.mch ${\color{green}-9.09\%}$
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch ${\color{green}-8.68\%}$
realworld.run.osx.arm64.checked.mch ${\color{green}-10.55\%}$
FullOpts (${\color{green}-0.07\%}$ to ${\color{green}-0.03\%}$)
Collection PDIFF
benchmarks.run.osx.arm64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_pgo.osx.arm64.checked.mch ${\color{green}-0.03\%}$
benchmarks.run_tiered.osx.arm64.checked.mch ${\color{green}-0.03\%}$
coreclr_tests.run.osx.arm64.checked.mch ${\color{green}-0.06\%}$
libraries.crossgen2.osx.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.osx.arm64.checked.mch ${\color{green}-0.04\%}$
libraries_tests.run.osx.arm64.Release.mch ${\color{green}-0.04\%}$
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch ${\color{green}-0.05\%}$
realworld.run.osx.arm64.checked.mch ${\color{green}-0.04\%}$

Throughput diffs for windows/arm64 ran on windows/x64

Overall (${\color{green}-4.16\%}$ to ${\color{green}-0.04\%}$)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_pgo.windows.arm64.checked.mch ${\color{green}-1.27\%}$
benchmarks.run_tiered.windows.arm64.checked.mch ${\color{green}-4.16\%}$
coreclr_tests.run.windows.arm64.checked.mch ${\color{green}-4.01\%}$
libraries.crossgen2.windows.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.windows.arm64.checked.mch ${\color{green}-0.05\%}$
libraries_tests.run.windows.arm64.Release.mch ${\color{green}-2.98\%}$
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch ${\color{green}-0.28\%}$
realworld.run.windows.arm64.checked.mch ${\color{green}-0.13\%}$
smoke_tests.nativeaot.windows.arm64.checked.mch ${\color{green}-0.04\%}$
MinOpts (${\color{green}-10.55\%}$ to ${\color{green}-6.32\%}$)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch ${\color{green}-9.05\%}$
benchmarks.run_pgo.windows.arm64.checked.mch ${\color{green}-9.25\%}$
benchmarks.run_tiered.windows.arm64.checked.mch ${\color{green}-9.57\%}$
coreclr_tests.run.windows.arm64.checked.mch ${\color{green}-9.25\%}$
libraries.crossgen2.windows.arm64.checked.mch ${\color{green}-8.09\%}$
libraries.pmi.windows.arm64.checked.mch ${\color{green}-9.86\%}$
libraries_tests.run.windows.arm64.Release.mch ${\color{green}-9.07\%}$
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch ${\color{green}-8.68\%}$
realworld.run.windows.arm64.checked.mch ${\color{green}-10.55\%}$
smoke_tests.nativeaot.windows.arm64.checked.mch ${\color{green}-6.32\%}$
FullOpts (${\color{green}-0.07\%}$ to ${\color{green}-0.03\%}$)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch ${\color{green}-0.04\%}$
benchmarks.run_pgo.windows.arm64.checked.mch ${\color{green}-0.03\%}$
benchmarks.run_tiered.windows.arm64.checked.mch ${\color{green}-0.03\%}$
coreclr_tests.run.windows.arm64.checked.mch ${\color{green}-0.06\%}$
libraries.crossgen2.windows.arm64.checked.mch ${\color{green}-0.07\%}$
libraries.pmi.windows.arm64.checked.mch ${\color{green}-0.04\%}$
libraries_tests.run.windows.arm64.Release.mch ${\color{green}-0.04\%}$
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch ${\color{green}-0.05\%}$
realworld.run.windows.arm64.checked.mch ${\color{green}-0.04\%}$
smoke_tests.nativeaot.windows.arm64.checked.mch ${\color{green}-0.04\%}$

Throughput diffs for windows/x64 ran on windows/x64

Overall (${\color{green}-2.54\%}$ to ${\color{green}-0.08\%}$)
Collection PDIFF
aspnet.run.windows.x64.checked.mch ${\color{green}-0.66\%}$
benchmarks.run.windows.x64.checked.mch ${\color{green}-0.08\%}$
benchmarks.run_pgo.windows.x64.checked.mch ${\color{green}-0.83\%}$
benchmarks.run_tiered.windows.x64.checked.mch ${\color{green}-2.22\%}$
coreclr_tests.run.windows.x64.checked.mch ${\color{green}-2.54\%}$
libraries.crossgen2.windows.x64.checked.mch ${\color{green}-0.12\%}$
libraries.pmi.windows.x64.checked.mch ${\color{green}-0.10\%}$
libraries_tests.run.windows.x64.Release.mch ${\color{green}-1.64\%}$
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch ${\color{green}-0.25\%}$
realworld.run.windows.x64.checked.mch ${\color{green}-0.13\%}$
smoke_tests.nativeaot.windows.x64.checked.mch ${\color{green}-0.08\%}$
MinOpts (${\color{green}-6.49\%}$ to ${\color{green}-4.45\%}$)
Collection PDIFF
aspnet.run.windows.x64.checked.mch ${\color{green}-5.50\%}$
benchmarks.run.windows.x64.checked.mch ${\color{green}-4.89\%}$
benchmarks.run_pgo.windows.x64.checked.mch ${\color{green}-5.70\%}$
benchmarks.run_tiered.windows.x64.checked.mch ${\color{green}-5.72\%}$
coreclr_tests.run.windows.x64.checked.mch ${\color{green}-5.95\%}$
libraries.crossgen2.windows.x64.checked.mch ${\color{green}-4.45\%}$
libraries.pmi.windows.x64.checked.mch ${\color{green}-5.72\%}$
libraries_tests.run.windows.x64.Release.mch ${\color{green}-5.74\%}$
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch ${\color{green}-5.52\%}$
realworld.run.windows.x64.checked.mch ${\color{green}-6.49\%}$
smoke_tests.nativeaot.windows.x64.checked.mch ${\color{green}-4.70\%}$
FullOpts (${\color{green}-0.12\%}$ to ${\color{green}-0.07\%}$)
Collection PDIFF
aspnet.run.windows.x64.checked.mch ${\color{green}-0.07\%}$
benchmarks.run.windows.x64.checked.mch ${\color{green}-0.08\%}$
benchmarks.run_pgo.windows.x64.checked.mch ${\color{green}-0.07\%}$
benchmarks.run_tiered.windows.x64.checked.mch ${\color{green}-0.07\%}$
coreclr_tests.run.windows.x64.checked.mch ${\color{green}-0.11\%}$
libraries.crossgen2.windows.x64.checked.mch ${\color{green}-0.12\%}$
libraries.pmi.windows.x64.checked.mch ${\color{green}-0.09\%}$
libraries_tests.run.windows.x64.Release.mch ${\color{green}-0.08\%}$
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch ${\color{green}-0.11\%}$
realworld.run.windows.x64.checked.mch ${\color{green}-0.10\%}$
smoke_tests.nativeaot.windows.x64.checked.mch ${\color{green}-0.08\%}$

Details here

Throughput diffs for linux/arm64 ran on linux/x64

Overall (${\color{green}-5.32\%}$ to ${\color{red}+0.04\%}$)
Collection PDIFF
libraries.crossgen2.linux.arm64.checked.mch ${\color{red}+0.04\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{green}-2.61\%}$
realworld.run.linux.arm64.checked.mch ${\color{green}-0.07\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{green}-4.33\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{green}-1.17\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{red}+0.03\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{green}-5.32\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{green}-0.24\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{red}+0.04\%}$
MinOpts (${\color{green}-11.63\%}$ to ${\color{green}-6.46\%}$)
Collection PDIFF
libraries.crossgen2.linux.arm64.checked.mch ${\color{green}-9.09\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{green}-9.98\%}$
realworld.run.linux.arm64.checked.mch ${\color{green}-11.63\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{green}-9.71\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{green}-9.97\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{green}-11.20\%}$
benchmarks.run.linux.arm64.checked.mch ${\color{green}-9.20\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{green}-10.35\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{green}-10.06\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{green}-6.46\%}$
FullOpts (${\color{red}+0.02\%}$ to ${\color{red}+0.05\%}$)
Collection PDIFF
libraries.crossgen2.linux.arm64.checked.mch ${\color{red}+0.04\%}$
libraries_tests.run.linux.arm64.Release.mch ${\color{red}+0.03\%}$
realworld.run.linux.arm64.checked.mch ${\color{red}+0.03\%}$
coreclr_tests.run.linux.arm64.checked.mch ${\color{red}+0.05\%}$
benchmarks.run_pgo.linux.arm64.checked.mch ${\color{red}+0.03\%}$
libraries.pmi.linux.arm64.checked.mch ${\color{red}+0.03\%}$
benchmarks.run.linux.arm64.checked.mch ${\color{red}+0.05\%}$
benchmarks.run_tiered.linux.arm64.checked.mch ${\color{red}+0.02\%}$
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch ${\color{red}+0.05\%}$
smoke_tests.nativeaot.linux.arm64.checked.mch ${\color{red}+0.04\%}$

Throughput diffs for linux/x64 ran on linux/x64

Overall (${\color{green}-2.87\%}$ to ${\color{green}-0.15\%}$)
Collection PDIFF
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-0.66\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-2.87\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-0.22\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-0.17\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-0.34\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-0.20\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-0.15\%}$
benchmarks.run.linux.x64.checked.mch ${\color{green}-0.20\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-1.45\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-2.80\%}$
MinOpts (${\color{green}-6.69\%}$ to ${\color{green}-4.65\%}$)
Collection PDIFF
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-5.91\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-5.96\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-4.86\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-6.32\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-5.60\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-6.69\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-4.65\%}$
benchmarks.run.linux.x64.checked.mch ${\color{green}-5.12\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-5.81\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-6.04\%}$
FullOpts (${\color{green}-0.22\%}$ to ${\color{green}-0.12\%}$)
Collection PDIFF
benchmarks.run_pgo.linux.x64.checked.mch ${\color{green}-0.13\%}$
benchmarks.run_tiered.linux.x64.checked.mch ${\color{green}-0.12\%}$
libraries.crossgen2.linux.x64.checked.mch ${\color{green}-0.22\%}$
libraries.pmi.linux.x64.checked.mch ${\color{green}-0.16\%}$
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch ${\color{green}-0.19\%}$
realworld.run.linux.x64.checked.mch ${\color{green}-0.16\%}$
smoke_tests.nativeaot.linux.x64.checked.mch ${\color{green}-0.15\%}$
benchmarks.run.linux.x64.checked.mch ${\color{green}-0.17\%}$
libraries_tests.run.linux.x64.Release.mch ${\color{green}-0.14\%}$
coreclr_tests.run.linux.x64.checked.mch ${\color{green}-0.18\%}$

Details here

Throughput diffs for linux/arm ran on windows/x86

Overall (${\color{green}-4.15\%}$ to ${\color{green}-0.08\%}$)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch ${\color{green}-0.18\%}$
benchmarks.run_pgo.linux.arm.checked.mch ${\color{green}-0.72\%}$
benchmarks.run_tiered.linux.arm.checked.mch ${\color{green}-2.03\%}$
coreclr_tests.run.linux.arm.checked.mch ${\color{green}-4.15\%}$
libraries.crossgen2.linux.arm.checked.mch ${\color{green}-0.15\%}$
libraries.pmi.linux.arm.checked.mch ${\color{green}-0.08\%}$
libraries_tests.run.linux.arm.Release.mch ${\color{green}-2.51\%}$
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch ${\color{green}-0.47\%}$
realworld.run.linux.arm.checked.mch ${\color{green}-0.15\%}$
MinOpts (${\color{green}-11.56\%}$ to ${\color{green}-9.30\%}$)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch ${\color{green}-9.64\%}$
benchmarks.run_pgo.linux.arm.checked.mch ${\color{green}-10.51\%}$
benchmarks.run_tiered.linux.arm.checked.mch ${\color{green}-10.52\%}$
coreclr_tests.run.linux.arm.checked.mch ${\color{green}-9.88\%}$
libraries.crossgen2.linux.arm.checked.mch ${\color{green}-9.83\%}$
libraries.pmi.linux.arm.checked.mch ${\color{green}-9.30\%}$
libraries_tests.run.linux.arm.Release.mch ${\color{green}-10.61\%}$
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch ${\color{green}-9.92\%}$
realworld.run.linux.arm.checked.mch ${\color{green}-11.56\%}$
FullOpts (${\color{green}-0.15\%}$ to ${\color{green}-0.05\%}$)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch ${\color{green}-0.09\%}$
benchmarks.run_pgo.linux.arm.checked.mch ${\color{green}-0.05\%}$
benchmarks.run_tiered.linux.arm.checked.mch ${\color{green}-0.09\%}$
coreclr_tests.run.linux.arm.checked.mch ${\color{green}-0.11\%}$
libraries.crossgen2.linux.arm.checked.mch ${\color{green}-0.15\%}$
libraries.pmi.linux.arm.checked.mch ${\color{green}-0.07\%}$
libraries_tests.run.linux.arm.Release.mch ${\color{green}-0.05\%}$
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch ${\color{green}-0.09\%}$
realworld.run.linux.arm.checked.mch ${\color{green}-0.06\%}$

Throughput diffs for windows/x86 ran on windows/x86

Overall (${\color{green}-1.68\%}$ to ${\color{red}+0.02\%}$)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch ${\color{red}+0.02\%}$
benchmarks.run_pgo.windows.x86.checked.mch ${\color{green}-0.12\%}$
benchmarks.run_tiered.windows.x86.checked.mch ${\color{green}-0.59\%}$
coreclr_tests.run.windows.x86.checked.mch ${\color{green}-1.68\%}$
libraries.crossgen2.windows.x86.checked.mch ${\color{red}+0.01\%}$
libraries.pmi.windows.x86.checked.mch ${\color{red}+0.02\%}$
libraries_tests.run.windows.x86.Release.mch ${\color{green}-0.74\%}$
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch ${\color{green}-0.07\%}$
realworld.run.windows.x86.checked.mch ${\color{green}-0.01\%}$
MinOpts (${\color{green}-5.03\%}$ to ${\color{green}-3.05\%}$)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch ${\color{green}-3.15\%}$
benchmarks.run_pgo.windows.x86.checked.mch ${\color{green}-3.99\%}$
benchmarks.run_tiered.windows.x86.checked.mch ${\color{green}-3.90\%}$
coreclr_tests.run.windows.x86.checked.mch ${\color{green}-5.03\%}$
libraries.crossgen2.windows.x86.checked.mch ${\color{green}-3.05\%}$
libraries.pmi.windows.x86.checked.mch ${\color{green}-4.50\%}$
libraries_tests.run.windows.x86.Release.mch ${\color{green}-4.24\%}$
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch ${\color{green}-3.69\%}$
realworld.run.windows.x86.checked.mch ${\color{green}-4.85\%}$
FullOpts (${\color{red}+0.01\%}$ to ${\color{red}+0.02\%}$)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch ${\color{red}+0.02\%}$
benchmarks.run_pgo.windows.x86.checked.mch ${\color{red}+0.02\%}$
benchmarks.run_tiered.windows.x86.checked.mch ${\color{red}+0.02\%}$
coreclr_tests.run.windows.x86.checked.mch ${\color{red}+0.01\%}$
libraries.crossgen2.windows.x86.checked.mch ${\color{red}+0.01\%}$
libraries.pmi.windows.x86.checked.mch ${\color{red}+0.02\%}$
libraries_tests.run.windows.x86.Release.mch ${\color{red}+0.02\%}$
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch ${\color{red}+0.01\%}$
realworld.run.windows.x86.checked.mch ${\color{red}+0.02\%}$

Details here

@kunalspathak
Member Author

@BruceForstall @jakobbotsch - can you please take another look? Thanks!

Comment on lines 5006 to 5009
#ifdef DEBUG
// Validate the current state just after we've freed the registers. This ensures that any pending
// freed registers will have had their state updated to reflect the intervals they were holding.
for (regNumber reg = REG_FIRST; reg < AVAILABLE_REG_COUNT; reg = REG_NEXT(reg))
Member

Is it possible/straightforward to factor this validation into a method and use it from both allocateRegisters and allocateRegistersMinimal?

A related question -- what are the fundamental differences between the two, except the reduced set of heuristics used in the minimal version?
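
As a rough illustration of the factoring asked about above, the validation loop could live in one helper that both allocation paths call. This is only a sketch: `ToyLinearScan`, `ToyInterval`, `kRegCount`, `freeRegister`, `assignRegister` and `validateRegisterState` are hypothetical stand-ins, not the JIT's actual types or methods, and the real check in `allocateRegisters` inspects more state than this.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the allocator's bookkeeping; these are not the JIT's real types.
constexpr std::size_t kRegCount = 4;

struct ToyInterval
{
    int assignedReg = -1; // register index this interval occupies, or -1 if none
};

struct ToyLinearScan
{
    std::array<ToyInterval*, kRegCount> regToInterval{}; // nullptr means the register is free

    // Shared validation: every occupied register must hold an interval that
    // agrees it is assigned to that register.
    bool validateRegisterState() const
    {
        for (std::size_t reg = 0; reg < kRegCount; reg++)
        {
            const ToyInterval* held = regToInterval[reg];
            if ((held != nullptr) && (held->assignedReg != static_cast<int>(reg)))
            {
                return false;
            }
        }
        return true;
    }

    void freeRegister(std::size_t reg)
    {
        if (regToInterval[reg] != nullptr)
        {
            regToInterval[reg]->assignedReg = -1;
            regToInterval[reg]              = nullptr;
        }
    }

    void assignRegister(ToyInterval& interval, std::size_t reg)
    {
        regToInterval[reg]   = &interval;
        interval.assignedReg = static_cast<int>(reg);
    }

    // Both paths free the register, run the same check, then assign.
    // A real "full" path would do additional heuristic work here.
    void allocateRegisters(ToyInterval& interval, std::size_t reg)
    {
        freeRegister(reg);
        assert(validateRegisterState());
        assignRegister(interval, reg);
    }

    void allocateRegistersMinimal(ToyInterval& interval, std::size_t reg)
    {
        freeRegister(reg);
        assert(validateRegisterState());
        assignRegister(interval, reg);
    }
};
```

With this shape the debug-only check is written once, so the two allocation loops cannot drift apart in what they validate.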

Member

@jakobbotsch jakobbotsch left a comment

I've skimmed the code. I'm assuming allocateRegistersMinimal was copied mostly from allocateRegisters. I had a question about what the differences between them are, apart from the use of a reduced set of heuristics -- I'm assuming it wouldn't be straightforward to unify them to avoid code duplication? (For example, by the use of another template parameter)

The trade-offs in CQ vs TP look just fine to me. We should definitely be taking this kind of trade-off in MinOpts, and the TP improvements are amazing.

@kunalspathak
Member Author

kunalspathak commented Jan 25, 2024

> I'm assuming allocateRegistersMinimal was copied mostly from allocateRegisters.

Pretty much. To easily spot the difference, I extracted out here:

> I'm assuming it wouldn't be straightforward to unify them to avoid code duplication? (For example, by the use of another template parameter)

Correct, that was the direction I started with, but I quickly noticed that it scatters the template parameter everywhere and makes it very hard to see what gets executed for MinOpts. In the future, it will be good to focus on the *Minimal() APIs to further simplify the algorithm.

> apart from the use of a reduced set of heuristics

Yes, that's the main change, along with removing the code for things that never happen in MinOpts; for example, we don't see RefTypeExpUse or RefTypeDummyDef there.
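
To make the trade-off discussed above concrete, here is a toy sketch of the two shapes: a single selector with a `bool` template parameter, and a separate minimal selector. Everything in it is hypothetical (`RegMask`, `selectRegister`, `selectRegisterMinimal`, `lowestRegister`, and the "preferred registers" heuristic standing in for the extra work of the full allocator); it is not the LSRA code from this PR.

```cpp
#include <cassert>
#include <cstdint>

using RegMask = std::uint32_t; // hypothetical: bit N set means register N is a candidate

// Toy tie-break: pick the lowest-numbered register in the mask.
inline int lowestRegister(RegMask candidates)
{
    assert(candidates != 0);
    int reg = 0;
    while ((candidates & (static_cast<RegMask>(1) << reg)) == 0)
    {
        reg++;
    }
    return reg;
}

// Shape 1: one function, with a template parameter guarding every piece of
// work that the minimal path skips. The compiler can fold the constant
// condition away, but the reader has to filter the branches mentally.
template <bool minimal>
int selectRegister(RegMask candidates, RegMask preferred)
{
    if (!minimal)
    {
        // Stand-in for the extra heuristics of the full allocator:
        // narrow to preferred registers when any of them is available.
        if ((candidates & preferred) != 0)
        {
            candidates &= preferred;
        }
    }
    return lowestRegister(candidates);
}

// Shape 2: a separate minimal function that contains only what the
// minimal path executes.
inline int selectRegisterMinimal(RegMask candidates)
{
    return lowestRegister(candidates);
}
```

In a function this small the template reads fine; the concern raised in the reply is that in a large allocation loop the guarded branches multiply, which is where a separate minimal method becomes easier to follow.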

@ryujit-bot

Diff results for #96386

Assembly diffs

Assembly diffs for linux/arm64 ran on windows/x64

Diffs are based on 2,498,771 contexts (1,011,240 MinOpts, 1,487,531 FullOpts).

MISSED contexts: 6,580 (0.26%)

Overall (+147,632 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm64.checked.mch 15,908,516 +0
benchmarks.run_pgo.linux.arm64.checked.mch 78,672,796 +1,492
benchmarks.run_tiered.linux.arm64.checked.mch 29,592,944 +620
coreclr_tests.run.linux.arm64.checked.mch 508,895,360 +143,476
libraries.crossgen2.linux.arm64.checked.mch 55,781,064 +0
libraries.pmi.linux.arm64.checked.mch 74,608,176 +0
libraries_tests.run.linux.arm64.Release.mch 383,838,152 +1,936
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch 163,129,488 +108
realworld.run.linux.arm64.checked.mch 15,368,676 +0
MinOpts (+147,632 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm64.checked.mch 346,256 +0
benchmarks.run_pgo.linux.arm64.checked.mch 23,982,040 +1,492
benchmarks.run_tiered.linux.arm64.checked.mch 24,621,220 +620
coreclr_tests.run.linux.arm64.checked.mch 348,829,192 +143,476
libraries.crossgen2.linux.arm64.checked.mch 1,636 +0
libraries.pmi.linux.arm64.checked.mch 119,984 +0
libraries_tests.run.linux.arm64.Release.mch 215,421,676 +1,936
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch 13,479,996 +108
realworld.run.linux.arm64.checked.mch 574,152 +0

Assembly diffs for linux/x64 ran on windows/x64

Diffs are based on 2,505,340 contexts (977,766 MinOpts, 1,527,574 FullOpts).

MISSED contexts: 6,922 (0.28%)

Overall (-485,333 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.x64.checked.mch 16,148,228 -5
benchmarks.run_pgo.linux.x64.checked.mch 70,126,965 -18,248
benchmarks.run_tiered.linux.x64.checked.mch 15,918,176 -11,382
coreclr_tests.run.linux.x64.checked.mch 402,653,853 -237,129
libraries.crossgen2.linux.x64.checked.mch 38,689,357 -4
libraries.pmi.linux.x64.checked.mch 58,919,195 -13
libraries_tests.run.linux.x64.Release.mch 339,819,743 -163,282
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch 130,995,704 -54,244
realworld.run.linux.x64.checked.mch 12,767,093 -1,026
smoke_tests.nativeaot.linux.x64.checked.mch 4,192,959 +0
MinOpts (-485,333 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.x64.checked.mch 264,173 -5
benchmarks.run_pgo.linux.x64.checked.mch 21,310,577 -18,248
benchmarks.run_tiered.linux.x64.checked.mch 12,281,125 -11,382
coreclr_tests.run.linux.x64.checked.mch 279,538,531 -237,129
libraries.crossgen2.linux.x64.checked.mch 1,202 -4
libraries.pmi.linux.x64.checked.mch 112,870 -13
libraries_tests.run.linux.x64.Release.mch 183,511,120 -163,282
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch 10,658,448 -54,244
realworld.run.linux.x64.checked.mch 386,910 -1,026
smoke_tests.nativeaot.linux.x64.checked.mch 949 +0

Assembly diffs for osx/arm64 ran on windows/x64

Diffs are based on 2,229,922 contexts (927,360 MinOpts, 1,302,562 FullOpts).

MISSED contexts: 6,095 (0.27%)

Overall (+150,252 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.osx.arm64.checked.mch 34,339,712 +952
benchmarks.run_tiered.osx.arm64.checked.mch 15,547,160 +328
coreclr_tests.run.osx.arm64.checked.mch 483,275,704 +147,732
libraries.crossgen2.osx.arm64.checked.mch 55,660,852 +0
libraries.pmi.osx.arm64.checked.mch 78,453,800 +0
libraries_tests.run.osx.arm64.Release.mch 311,227,792 +1,132
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch 161,150,176 +108
realworld.run.osx.arm64.checked.mch 14,659,812 +0
MinOpts (+150,252 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.osx.arm64.checked.mch 16,407,348 +952
benchmarks.run_tiered.osx.arm64.checked.mch 11,499,248 +328
coreclr_tests.run.osx.arm64.checked.mch 331,292,952 +147,732
libraries.crossgen2.osx.arm64.checked.mch 1,628 +0
libraries.pmi.osx.arm64.checked.mch 121,128 +0
libraries_tests.run.osx.arm64.Release.mch 201,319,184 +1,132
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch 13,136,524 +108
realworld.run.osx.arm64.checked.mch 563,956 +0

Assembly diffs for windows/arm64 ran on windows/x64

Diffs are based on 2,308,445 contexts (929,692 MinOpts, 1,378,753 FullOpts).

MISSED contexts: 6,353 (0.27%)

Overall (+148,112 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.windows.arm64.checked.mch 46,475,100 +988
benchmarks.run_tiered.windows.arm64.checked.mch 15,251,468 +384
coreclr_tests.run.windows.arm64.checked.mch 494,449,448 +143,628
libraries.crossgen2.windows.arm64.checked.mch 59,008,088 +0
libraries.pmi.windows.arm64.checked.mch 78,059,240 +0
libraries_tests.run.windows.arm64.Release.mch 308,519,220 +3,000
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch 169,413,408 +112
realworld.run.windows.arm64.checked.mch 15,463,108 +0
MinOpts (+148,112 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run_pgo.windows.arm64.checked.mch 16,058,416 +988
benchmarks.run_tiered.windows.arm64.checked.mch 11,172,728 +384
coreclr_tests.run.windows.arm64.checked.mch 338,587,172 +143,628
libraries.crossgen2.windows.arm64.checked.mch 1,636 +0
libraries.pmi.windows.arm64.checked.mch 119,984 +0
libraries_tests.run.windows.arm64.Release.mch 201,005,952 +3,000
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch 13,136,540 +112
realworld.run.windows.arm64.checked.mch 563,980 +0

Assembly diffs for windows/x64 ran on windows/x64

Diffs are based on 2,366,385 contexts (928,740 MinOpts, 1,437,645 FullOpts).

MISSED contexts: 6,816 (0.29%)

Overall (-62,154 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x64.checked.mch 8,536,477 +2
benchmarks.run_pgo.windows.x64.checked.mch 35,464,350 -13,500
benchmarks.run_tiered.windows.x64.checked.mch 12,551,598 -12,300
coreclr_tests.run.windows.x64.checked.mch 392,239,828 +139,386
libraries.crossgen2.windows.x64.checked.mch 39,446,742 -2
libraries.pmi.windows.x64.checked.mch 60,117,403 +2
libraries_tests.run.windows.x64.Release.mch 276,943,385 -163,110
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 135,762,558 -12,638
realworld.run.windows.x64.checked.mch 13,729,153 +6
smoke_tests.nativeaot.windows.x64.checked.mch 5,087,013 +0
MinOpts (-62,154 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x64.checked.mch 361 +2
benchmarks.run_pgo.windows.x64.checked.mch 13,997,605 -13,500
benchmarks.run_tiered.windows.x64.checked.mch 9,105,849 -12,300
coreclr_tests.run.windows.x64.checked.mch 272,531,567 +139,386
libraries.crossgen2.windows.x64.checked.mch 1,189 -2
libraries.pmi.windows.x64.checked.mch 113,519 +2
libraries_tests.run.windows.x64.Release.mch 173,720,676 -163,110
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 10,806,556 -12,638
realworld.run.windows.x64.checked.mch 386,603 +6
smoke_tests.nativeaot.windows.x64.checked.mch 947 +0

Details here


Assembly diffs for linux/arm ran on windows/x86

Diffs are based on 2,230,528 contexts (825,130 MinOpts, 1,405,398 FullOpts).

MISSED contexts: 77,529 (3.36%)

Overall (-868,478 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm.checked.mch 15,056,962 -54
benchmarks.run_pgo.linux.arm.checked.mch 63,327,810 -49,248
benchmarks.run_tiered.linux.arm.checked.mch 17,395,270 -32,052
coreclr_tests.run.linux.arm.checked.mch 321,199,404 -200,016
libraries.crossgen2.linux.arm.checked.mch 36,758,116 +0
libraries.pmi.linux.arm.checked.mch 48,738,124 -280
libraries_tests.run.linux.arm.Release.mch 244,888,660 -547,278
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch 93,389,302 -24,904
realworld.run.linux.arm.checked.mch 13,304,600 -14,646
MinOpts (-868,478 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.linux.arm.checked.mch 383,466 -54
benchmarks.run_pgo.linux.arm.checked.mch 11,641,744 -49,248
benchmarks.run_tiered.linux.arm.checked.mch 7,192,228 -32,052
coreclr_tests.run.linux.arm.checked.mch 212,641,656 -200,016
libraries.crossgen2.linux.arm.checked.mch 1,230 +0
libraries.pmi.linux.arm.checked.mch 106,504 -280
libraries_tests.run.linux.arm.Release.mch 122,290,786 -547,278
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch 10,083,802 -24,904
realworld.run.linux.arm.checked.mch 449,686 -14,646

Assembly diffs for windows/x86 ran on windows/x86

Diffs are based on 2,246,531 contexts (794,865 MinOpts, 1,451,666 FullOpts).

MISSED contexts: base: 7,010 (0.30%), diff: 52,597 (2.29%)

Overall (+190,955 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x86.checked.mch 6,960,783 +2
benchmarks.run_pgo.windows.x86.checked.mch 44,966,304 +504
benchmarks.run_tiered.windows.x86.checked.mch 8,898,260 +195
coreclr_tests.run.windows.x86.checked.mch 305,383,476 +201,068
libraries.crossgen2.windows.x86.checked.mch 31,679,639 +3
libraries.pmi.windows.x86.checked.mch 48,041,556 -81
libraries_tests.run.windows.x86.Release.mch 167,902,249 -13,935
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch 101,249,179 +3,185
realworld.run.windows.x86.checked.mch 11,011,414 +14
MinOpts (+190,955 bytes)
Collection Base size (bytes) Diff size (bytes)
benchmarks.run.windows.x86.checked.mch 279 +2
benchmarks.run_pgo.windows.x86.checked.mch 5,930,678 +504
benchmarks.run_tiered.windows.x86.checked.mch 3,850,704 +195
coreclr_tests.run.windows.x86.checked.mch 198,326,162 +201,068
libraries.crossgen2.windows.x86.checked.mch 1,057 +3
libraries.pmi.windows.x86.checked.mch 95,314 -81
libraries_tests.run.windows.x86.Release.mch 78,712,933 -13,935
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch 7,275,039 +3,185
realworld.run.windows.x86.checked.mch 295,700 +14

Details here


Throughput diffs
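
For readers unfamiliar with the PDIFF column: as far as I understand it, each value is the relative change in instructions executed by the JIT between the base and diff builds, so negative numbers are throughput improvements. The sketch below assumes exactly that definition; `pdiffPercent` is a hypothetical helper, not part of SuperPMI. It reproduces the figure quoted earlier in this conversation ("Base: 414060, Diff: 433023, +4.5798%").

```cpp
#include <cassert>
#include <cstdint>

// Relative change, in percent, from a base instruction count to a diff instruction count.
// Negative values mean the diff JIT executed fewer instructions than the base JIT.
double pdiffPercent(std::uint64_t baseInstructions, std::uint64_t diffInstructions)
{
    assert(baseInstructions != 0);
    const double delta = static_cast<double>(diffInstructions) - static_cast<double>(baseInstructions);
    return 100.0 * delta / static_cast<double>(baseInstructions);
}
```

Calling `pdiffPercent(414060, 433023)` gives roughly 4.58, matching the regression reported for the two method contexts investigated earlier.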

Throughput diffs for linux/arm64 ran on windows/x64

Overall (-5.32% to -0.04%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch -0.11%
benchmarks.run_pgo.linux.arm64.checked.mch -1.07%
benchmarks.run_tiered.linux.arm64.checked.mch -5.32%
coreclr_tests.run.linux.arm64.checked.mch -3.98%
libraries.crossgen2.linux.arm64.checked.mch -0.07%
libraries.pmi.linux.arm64.checked.mch -0.05%
libraries_tests.run.linux.arm64.Release.mch -2.35%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch -0.29%
realworld.run.linux.arm64.checked.mch -0.13%
smoke_tests.nativeaot.linux.arm64.checked.mch -0.04%
MinOpts (-10.39% to -6.49%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch -8.36%
benchmarks.run_pgo.linux.arm64.checked.mch -8.98%
benchmarks.run_tiered.linux.arm64.checked.mch -9.07%
coreclr_tests.run.linux.arm64.checked.mch -9.16%
libraries.crossgen2.linux.arm64.checked.mch -8.01%
libraries.pmi.linux.arm64.checked.mch -9.80%
libraries_tests.run.linux.arm64.Release.mch -8.97%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch -8.60%
realworld.run.linux.arm64.checked.mch -10.39%
smoke_tests.nativeaot.linux.arm64.checked.mch -6.49%
FullOpts (-0.07% to -0.03%)
Collection PDIFF
benchmarks.run.linux.arm64.checked.mch -0.05%
benchmarks.run_pgo.linux.arm64.checked.mch -0.04%
benchmarks.run_tiered.linux.arm64.checked.mch -0.03%
coreclr_tests.run.linux.arm64.checked.mch -0.05%
libraries.crossgen2.linux.arm64.checked.mch -0.07%
libraries.pmi.linux.arm64.checked.mch -0.04%
libraries_tests.run.linux.arm64.Release.mch -0.04%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch -0.05%
realworld.run.linux.arm64.checked.mch -0.04%
smoke_tests.nativeaot.linux.arm64.checked.mch -0.04%

Throughput diffs for linux/x64 ran on windows/x64

Overall (-2.61% to -0.05%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -0.09%
benchmarks.run_pgo.linux.x64.checked.mch -0.61%
benchmarks.run_tiered.linux.x64.checked.mch -2.55%
coreclr_tests.run.linux.x64.checked.mch -2.61%
libraries.crossgen2.linux.x64.checked.mch -0.08%
libraries.pmi.linux.x64.checked.mch -0.06%
libraries_tests.run.linux.x64.Release.mch -1.28%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -0.20%
realworld.run.linux.x64.checked.mch -0.10%
smoke_tests.nativeaot.linux.x64.checked.mch -0.05%
MinOpts (-6.57% to -4.49%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -5.03%
benchmarks.run_pgo.linux.x64.checked.mch -5.61%
benchmarks.run_tiered.linux.x64.checked.mch -5.72%
coreclr_tests.run.linux.x64.checked.mch -6.14%
libraries.crossgen2.linux.x64.checked.mch -4.49%
libraries.pmi.linux.x64.checked.mch -5.89%
libraries_tests.run.linux.x64.Release.mch -5.66%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -5.15%
realworld.run.linux.x64.checked.mch -6.57%
smoke_tests.nativeaot.linux.x64.checked.mch -4.76%
FullOpts (-0.08% to -0.04%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -0.06%
benchmarks.run_pgo.linux.x64.checked.mch -0.04%
benchmarks.run_tiered.linux.x64.checked.mch -0.04%
coreclr_tests.run.linux.x64.checked.mch -0.06%
libraries.crossgen2.linux.x64.checked.mch -0.08%
libraries.pmi.linux.x64.checked.mch -0.06%
libraries_tests.run.linux.x64.Release.mch -0.05%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -0.07%
realworld.run.linux.x64.checked.mch -0.06%
smoke_tests.nativeaot.linux.x64.checked.mch -0.05%

Throughput diffs for osx/arm64 ran on windows/x64

Overall (-4.23% to -0.05%)
Collection PDIFF
benchmarks.run_pgo.osx.arm64.checked.mch -1.91%
benchmarks.run_tiered.osx.arm64.checked.mch -4.23%
coreclr_tests.run.osx.arm64.checked.mch -3.94%
libraries.crossgen2.osx.arm64.checked.mch -0.07%
libraries.pmi.osx.arm64.checked.mch -0.05%
libraries_tests.run.osx.arm64.Release.mch -3.02%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch -0.29%
realworld.run.osx.arm64.checked.mch -0.14%
MinOpts (-10.43% to -7.92%)
Collection PDIFF
benchmarks.run_pgo.osx.arm64.checked.mch -9.25%
benchmarks.run_tiered.osx.arm64.checked.mch -9.54%
coreclr_tests.run.osx.arm64.checked.mch -9.03%
libraries.crossgen2.osx.arm64.checked.mch -7.92%
libraries.pmi.osx.arm64.checked.mch -9.79%
libraries_tests.run.osx.arm64.Release.mch -9.00%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch -8.60%
realworld.run.osx.arm64.checked.mch -10.43%
FullOpts (-0.07% to -0.03%)
Collection PDIFF
benchmarks.run_pgo.osx.arm64.checked.mch -0.03%
benchmarks.run_tiered.osx.arm64.checked.mch -0.03%
coreclr_tests.run.osx.arm64.checked.mch -0.06%
libraries.crossgen2.osx.arm64.checked.mch -0.07%
libraries.pmi.osx.arm64.checked.mch -0.04%
libraries_tests.run.osx.arm64.Release.mch -0.04%
libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch -0.05%
realworld.run.osx.arm64.checked.mch -0.04%
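
For anyone reading these tables: the range in each group header (for example `MinOpts (-10.43% to -7.92%)`) appears to be the minimum and maximum of that group's PDIFF column. That is an assumption drawn from the numbers posted here, not a statement about how the SuperPMI report is actually generated. A minimal Python sketch that checks it against the osx/arm64 MinOpts rows above; the helper name `pdiff_range` is hypothetical:

```python
# PDIFF values (percent) copied from the osx/arm64 MinOpts table above.
minopts_pdiff = {
    "benchmarks.run_pgo.osx.arm64.checked.mch": -9.25,
    "benchmarks.run_tiered.osx.arm64.checked.mch": -9.54,
    "coreclr_tests.run.osx.arm64.checked.mch": -9.03,
    "libraries.crossgen2.osx.arm64.checked.mch": -7.92,
    "libraries.pmi.osx.arm64.checked.mch": -9.79,
    "libraries_tests.run.osx.arm64.Release.mch": -9.00,
    "libraries_tests_no_tiered_compilation.run.osx.arm64.Release.mch": -8.60,
    "realworld.run.osx.arm64.checked.mch": -10.43,
}


def pdiff_range(pdiffs):
    """Format the (min to max) summary for one group of PDIFF values.

    Hypothetical helper: assumes the header range is simply the smallest
    and largest PDIFF in the group, each printed with an explicit sign.
    """
    values = list(pdiffs.values())
    return f"({min(values):+.2f}% to {max(values):+.2f}%)"


print("MinOpts", pdiff_range(minopts_pdiff))
```

The explicit sign in the format string matters because some groups further down (windows/x86 and linux/arm64 FullOpts) report small positive diffs, which the headers print with a leading `+`.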

Throughput diffs for windows/arm64 ran on windows/x64

Overall (-4.15% to -0.04%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch -0.04%
benchmarks.run_pgo.windows.arm64.checked.mch -1.23%
benchmarks.run_tiered.windows.arm64.checked.mch -4.15%
coreclr_tests.run.windows.arm64.checked.mch -3.98%
libraries.crossgen2.windows.arm64.checked.mch -0.07%
libraries.pmi.windows.arm64.checked.mch -0.05%
libraries_tests.run.windows.arm64.Release.mch -3.07%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch -0.28%
realworld.run.windows.arm64.checked.mch -0.13%
smoke_tests.nativeaot.windows.arm64.checked.mch -0.04%
MinOpts (-10.42% to -6.32%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch -8.96%
benchmarks.run_pgo.windows.arm64.checked.mch -9.17%
benchmarks.run_tiered.windows.arm64.checked.mch -9.48%
coreclr_tests.run.windows.arm64.checked.mch -9.16%
libraries.crossgen2.windows.arm64.checked.mch -8.01%
libraries.pmi.windows.arm64.checked.mch -9.80%
libraries_tests.run.windows.arm64.Release.mch -8.99%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch -8.60%
realworld.run.windows.arm64.checked.mch -10.42%
smoke_tests.nativeaot.windows.arm64.checked.mch -6.32%
FullOpts (-0.07% to -0.03%)
Collection PDIFF
benchmarks.run.windows.arm64.checked.mch -0.04%
benchmarks.run_pgo.windows.arm64.checked.mch -0.03%
benchmarks.run_tiered.windows.arm64.checked.mch -0.03%
coreclr_tests.run.windows.arm64.checked.mch -0.05%
libraries.crossgen2.windows.arm64.checked.mch -0.07%
libraries.pmi.windows.arm64.checked.mch -0.04%
libraries_tests.run.windows.arm64.Release.mch -0.04%
libraries_tests_no_tiered_compilation.run.windows.arm64.Release.mch -0.05%
realworld.run.windows.arm64.checked.mch -0.04%
smoke_tests.nativeaot.windows.arm64.checked.mch -0.04%

Throughput diffs for windows/x64 ran on windows/x64

Overall (-2.54% to -0.08%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch -0.08%
benchmarks.run_pgo.windows.x64.checked.mch -0.83%
benchmarks.run_tiered.windows.x64.checked.mch -2.24%
coreclr_tests.run.windows.x64.checked.mch -2.54%
libraries.crossgen2.windows.x64.checked.mch -0.12%
libraries.pmi.windows.x64.checked.mch -0.10%
libraries_tests.run.windows.x64.Release.mch -1.66%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch -0.24%
realworld.run.windows.x64.checked.mch -0.14%
smoke_tests.nativeaot.windows.x64.checked.mch -0.08%
MinOpts (-6.48% to -4.45%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch -4.89%
benchmarks.run_pgo.windows.x64.checked.mch -5.70%
benchmarks.run_tiered.windows.x64.checked.mch -5.72%
coreclr_tests.run.windows.x64.checked.mch -5.96%
libraries.crossgen2.windows.x64.checked.mch -4.45%
libraries.pmi.windows.x64.checked.mch -5.72%
libraries_tests.run.windows.x64.Release.mch -5.73%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch -5.56%
realworld.run.windows.x64.checked.mch -6.48%
smoke_tests.nativeaot.windows.x64.checked.mch -4.69%
FullOpts (-0.12% to -0.07%)
Collection PDIFF
benchmarks.run.windows.x64.checked.mch -0.08%
benchmarks.run_pgo.windows.x64.checked.mch -0.07%
benchmarks.run_tiered.windows.x64.checked.mch -0.07%
coreclr_tests.run.windows.x64.checked.mch -0.11%
libraries.crossgen2.windows.x64.checked.mch -0.12%
libraries.pmi.windows.x64.checked.mch -0.09%
libraries_tests.run.windows.x64.Release.mch -0.08%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch -0.11%
realworld.run.windows.x64.checked.mch -0.10%
smoke_tests.nativeaot.windows.x64.checked.mch -0.08%


Throughput diffs for linux/arm ran on windows/x86

Overall (-4.16% to -0.08%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch -0.20%
benchmarks.run_pgo.linux.arm.checked.mch -0.71%
benchmarks.run_tiered.linux.arm.checked.mch -2.03%
coreclr_tests.run.linux.arm.checked.mch -4.16%
libraries.crossgen2.linux.arm.checked.mch -0.15%
libraries.pmi.linux.arm.checked.mch -0.08%
libraries_tests.run.linux.arm.Release.mch -2.45%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch -0.46%
realworld.run.linux.arm.checked.mch -0.15%
MinOpts (-11.57% to -9.31%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch -9.63%
benchmarks.run_pgo.linux.arm.checked.mch -10.51%
benchmarks.run_tiered.linux.arm.checked.mch -10.53%
coreclr_tests.run.linux.arm.checked.mch -9.89%
libraries.crossgen2.linux.arm.checked.mch -9.84%
libraries.pmi.linux.arm.checked.mch -9.31%
libraries_tests.run.linux.arm.Release.mch -10.61%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch -9.93%
realworld.run.linux.arm.checked.mch -11.57%
FullOpts (-0.15% to -0.05%)
Collection PDIFF
benchmarks.run.linux.arm.checked.mch -0.10%
benchmarks.run_pgo.linux.arm.checked.mch -0.06%
benchmarks.run_tiered.linux.arm.checked.mch -0.09%
coreclr_tests.run.linux.arm.checked.mch -0.11%
libraries.crossgen2.linux.arm.checked.mch -0.15%
libraries.pmi.linux.arm.checked.mch -0.07%
libraries_tests.run.linux.arm.Release.mch -0.05%
libraries_tests_no_tiered_compilation.run.linux.arm.Release.mch -0.09%
realworld.run.linux.arm.checked.mch -0.06%

Throughput diffs for windows/x86 ran on windows/x86

Overall (-1.69% to +0.02%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch +0.02%
benchmarks.run_pgo.windows.x86.checked.mch -0.12%
benchmarks.run_tiered.windows.x86.checked.mch -0.60%
coreclr_tests.run.windows.x86.checked.mch -1.69%
libraries.crossgen2.windows.x86.checked.mch +0.01%
libraries.pmi.windows.x86.checked.mch +0.02%
libraries_tests.run.windows.x86.Release.mch -0.73%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch -0.06%
realworld.run.windows.x86.checked.mch -0.01%
MinOpts (-5.04% to -3.05%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch -3.15%
benchmarks.run_pgo.windows.x86.checked.mch -3.99%
benchmarks.run_tiered.windows.x86.checked.mch -3.90%
coreclr_tests.run.windows.x86.checked.mch -5.04%
libraries.crossgen2.windows.x86.checked.mch -3.05%
libraries.pmi.windows.x86.checked.mch -4.50%
libraries_tests.run.windows.x86.Release.mch -4.25%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch -3.69%
realworld.run.windows.x86.checked.mch -4.85%
FullOpts (+0.01% to +0.02%)
Collection PDIFF
benchmarks.run.windows.x86.checked.mch +0.02%
benchmarks.run_pgo.windows.x86.checked.mch +0.02%
benchmarks.run_tiered.windows.x86.checked.mch +0.02%
coreclr_tests.run.windows.x86.checked.mch +0.01%
libraries.crossgen2.windows.x86.checked.mch +0.01%
libraries.pmi.windows.x86.checked.mch +0.02%
libraries_tests.run.windows.x86.Release.mch +0.01%
libraries_tests_no_tiered_compilation.run.windows.x86.Release.mch +0.01%
realworld.run.windows.x86.checked.mch +0.02%


Throughput diffs for linux/arm64 ran on linux/x64

Overall (-5.82% to +0.04%)
Collection PDIFF
smoke_tests.nativeaot.linux.arm64.checked.mch +0.04%
benchmarks.run_tiered.linux.arm64.checked.mch -5.82%
benchmarks.run_pgo.linux.arm64.checked.mch -1.12%
libraries.crossgen2.linux.arm64.checked.mch +0.04%
libraries.pmi.linux.arm64.checked.mch +0.03%
realworld.run.linux.arm64.checked.mch -0.07%
libraries_tests.run.linux.arm64.Release.mch -2.55%
benchmarks.run.linux.arm64.checked.mch -0.01%
coreclr_tests.run.linux.arm64.checked.mch -4.32%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch -0.24%
MinOpts (-11.62% to -6.54%)
Collection PDIFF
smoke_tests.nativeaot.linux.arm64.checked.mch -6.54%
benchmarks.run_tiered.linux.arm64.checked.mch -10.03%
benchmarks.run_pgo.linux.arm64.checked.mch -10.07%
libraries.crossgen2.linux.arm64.checked.mch -9.09%
libraries.pmi.linux.arm64.checked.mch -11.21%
realworld.run.linux.arm64.checked.mch -11.62%
libraries_tests.run.linux.arm64.Release.mch -9.97%
benchmarks.run.linux.arm64.checked.mch -9.19%
coreclr_tests.run.linux.arm64.checked.mch -9.71%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch -10.06%
FullOpts (+0.02% to +0.06%)
Collection PDIFF
smoke_tests.nativeaot.linux.arm64.checked.mch +0.04%
benchmarks.run_tiered.linux.arm64.checked.mch +0.02%
benchmarks.run_pgo.linux.arm64.checked.mch +0.03%
libraries.crossgen2.linux.arm64.checked.mch +0.04%
libraries.pmi.linux.arm64.checked.mch +0.03%
realworld.run.linux.arm64.checked.mch +0.03%
libraries_tests.run.linux.arm64.Release.mch +0.03%
benchmarks.run.linux.arm64.checked.mch +0.06%
coreclr_tests.run.linux.arm64.checked.mch +0.05%
libraries_tests_no_tiered_compilation.run.linux.arm64.Release.mch +0.04%

Throughput diffs for linux/x64 ran on linux/x64

Overall (-2.72% to -0.15%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -0.20%
libraries.pmi.linux.x64.checked.mch -0.17%
coreclr_tests.run.linux.x64.checked.mch -2.70%
libraries_tests.run.linux.x64.Release.mch -1.39%
benchmarks.run_pgo.linux.x64.checked.mch -0.71%
libraries.crossgen2.linux.x64.checked.mch -0.22%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -0.33%
realworld.run.linux.x64.checked.mch -0.20%
smoke_tests.nativeaot.linux.x64.checked.mch -0.15%
benchmarks.run_tiered.linux.x64.checked.mch -2.72%
MinOpts (-6.70% to -4.66%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -5.13%
libraries.pmi.linux.x64.checked.mch -6.33%
coreclr_tests.run.linux.x64.checked.mch -5.89%
libraries_tests.run.linux.x64.Release.mch -5.81%
benchmarks.run_pgo.linux.x64.checked.mch -5.87%
libraries.crossgen2.linux.x64.checked.mch -4.86%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -5.60%
realworld.run.linux.x64.checked.mch -6.70%
smoke_tests.nativeaot.linux.x64.checked.mch -4.66%
benchmarks.run_tiered.linux.x64.checked.mch -6.00%
FullOpts (-0.22% to -0.12%)
Collection PDIFF
benchmarks.run.linux.x64.checked.mch -0.17%
libraries.pmi.linux.x64.checked.mch -0.16%
coreclr_tests.run.linux.x64.checked.mch -0.19%
libraries_tests.run.linux.x64.Release.mch -0.13%
benchmarks.run_pgo.linux.x64.checked.mch -0.13%
libraries.crossgen2.linux.x64.checked.mch -0.22%
libraries_tests_no_tiered_compilation.run.linux.x64.Release.mch -0.19%
realworld.run.linux.x64.checked.mch -0.16%
smoke_tests.nativeaot.linux.x64.checked.mch -0.15%
benchmarks.run_tiered.linux.x64.checked.mch -0.12%


@kunalspathak kunalspathak merged commit 468447b into dotnet:main Jan 26, 2024
129 checks passed
@kunalspathak kunalspathak deleted the minopts-lsra branch January 26, 2024 15:39
@github-actions github-actions bot locked and limited conversation to collaborators Feb 26, 2024