Zero-extend int #61
Following Mendel's link to "Peanut butter": https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-6/#peanut-butter. This gives 2-3% on my machine on the same HT core. The bench with different physical cores became very unstable on my machine, jumping from 1xx to 280 (compared to my previous PR, which works as it did on the same machine). This is not related to this change; it jumps before the change as well. Adding the `ExtraWork` method, which was removed for some reason, every time is tedious, so I have not, but it makes batching behavior more stable. So these numbers need to be double-checked on different machines.
Yes, well spotted :) The associated PR is dotnet/runtime#51190 for reference's sake.
@@ -58,10 +58,13 @@ public static T Read<T>(object array, int index)
    Ldloc_0(); // load the object pointer as a byref

    Ldarg(nameof(index));
    Conv_U(); // zero extend

    Sizeof(typeof(object));
    Mul(); // index x sizeof(object)

    Call(MethodRef.PropertyGet(typeof(InternalUtil), nameof(ArrayDataOffset)));
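For readers less familiar with IL, here is a rough C# sketch of the difference between sign-extending and zero-extending an `int` index before it is used in address arithmetic. The class and method names are hypothetical and not part of this repository; the casts are only an approximation of what `conv.i` and `conv.u` do in the emitted IL.

```csharp
using System;

// Hypothetical helper, for illustration only; not part of this repository.
public static class ZeroExtendSketch
{
    // Sign-extending widening: the upper bits are copied from the sign bit.
    public static nint SignExtend(int index) => (nint)index;

    // Zero-extending widening: reinterpret as uint first, so the upper bits become zero.
    public static nuint ZeroExtend(int index) => (nuint)(uint)index;

    // Byte offset of element `index`: index x elementSize + dataOffset,
    // with every operand zero-extended to native width.
    public static nuint ByteOffset(int index, int elementSize, int dataOffset)
        => (nuint)(uint)index * (nuint)(uint)elementSize + (nuint)(uint)dataOffset;
}
```

The two conversions agree for non-negative indexes, which is the only case a bounds-checked array access cares about. On x64 the zero-extending form tends to be cheaper because 32-bit register writes already clear the upper half, so the JIT may not need a separate extension instruction.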
We could avoid the zero-extend altogether by storing `ArrayDataOffset` as `nint`.
It would make the code easier to understand as well, and it is exactly what `nint` was intended for.
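A minimal sketch of the `nint` suggestion, assuming a constant offset. The class, the property names, and the value 16 are all hypothetical placeholders, not the real `InternalUtil.ArrayDataOffset`.

```csharp
using System;

// Hypothetical stand-ins, for illustration only; the value 16 is a placeholder.
public static class OffsetSketch
{
    // Offset stored as int: it has to be widened to native width at every use.
    public static int OffsetAsInt => 16;

    // Offset stored as nint: it is already native width, so no conversion is needed.
    public static nint OffsetAsNInt => 16;

    // The int operand is implicitly converted to nint before the add.
    public static nint WithIntOffset(nint scaledIndex) => scaledIndex + OffsetAsInt;

    // Both operands are nint, so the add needs no conversion.
    public static nint WithNIntOffset(nint scaledIndex) => scaledIndex + OffsetAsNInt;
}
```

Both methods return the same result; the difference is only in whether a widening conversion appears at the use site.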
Thank you for spotting this.
Not a huge win, but this method was already quite optimized. I will add back [...]
Adding [...]
I get the same perf improvements using [...]. I will accept the PR and then apply the same trick to other [...]
The change after adding the second Conv_U for
Go ahead!
Just in time for V5.0!