Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unoptimal codegen for Vector<T> init if Vector is not a localVar. #33617

Closed
sandreenko opened this issue Mar 16, 2020 · 5 comments
Closed

Unoptimal codegen for Vector<T> init if Vector is not a localVar. #33617

sandreenko opened this issue Mar 16, 2020 · 5 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@sandreenko
Copy link
Contributor

sandreenko commented Mar 16, 2020

For a method like this:

    struct StructWithVectorField
    {
        public int a;
        public Vector<float> b;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void TestSIMDInit()
    {
        Vector<float> localVector = new Vector<float>();
        Console.WriteLine(localVector.ToString());
        StructWithVectorField structWithVectorField;
        structWithVectorField.a = 0;
        structWithVectorField.b = new Vector<float>();
        Console.WriteLine(structWithVectorField.b.ToString());
        Console.WriteLine(Vector<float>.Count);
    }

IL for these two Vector inits will look similar:

Vector<float> localVector = new Vector<float>();
->
IL_0000  12 00             ldloca.s     0x0
IL_0002  fe 15 01 00 00 1b initobj      0x1B000001 
 structWithVectorField.b = new Vector<float>();
->
IL_0024  7c 14 00 00 04    ldflda       0x4000014
IL_0029  fe 15 01 00 00 1b initobj      0x1B000001

but the generated code will be different:

1.
IN0001:        vxorps   ymm0, ymm0
IN0002:        vmovupd  ymmword ptr[V00 rsp+50H], ymm0
2.
IN0009:        vxorps   xmm0, xmm0
IN000a:        vmovdqu  xmmword ptr [V01+0x8 rsp+30H], xmm0
IN000b:        vmovdqu  xmmword ptr [V01+0x18 rsp+40H], xmm0

Of course, the first version is better (10 bytes against 16).

The problem happens because fgMorphOneAsgBlockOp doesn't do that optimization for LCL_FLD nodes, only for LCL_VAR.

fieldUnoptimalVectorT.txt

That will be probably fixed during #1231.

Note: that is why we need to pass to isBlkReqd == true to fgMorphBlockOperand for gtType != TYP_STRUCT.

The fix will cause diffs in ~10 pri1 tests like:

System.Linq.Set`1[Vector`1][System.Numerics.Vector`1[System.Single]]:Remove(System.Numerics.Vector`1[Single]):bool:this
System.Security.Principal.WindowsIdentity:RunImpersonated(Microsoft.Win32.SafeHandles.SafeAccessTokenHandle,System.Func`1[Vector`1]):System.Numerics.Vector`1[Single] (MethodHash=ffeb0926)
System.Threading.Tasks.RendezvousAwaitable`1[Vector`1][System.Numerics.Vector`1[System.Single]]:GetResult():System.Numerics.Vector`1[Single]:this (MethodHash=9b817fd3)

etc.

category:cq
theme:vector-codegen
skill-level:intermediate
cost:medium

@sandreenko sandreenko added enhancement Product code improvement that does NOT require public API changes/additions area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Mar 16, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Mar 16, 2020
@sandreenko
Copy link
Contributor Author

cc @CarolEidt That is part of what we have discussed last week, I am still preparing the main issue description.

@BruceForstall BruceForstall added this to the 5.0 milestone Apr 4, 2020
@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label Apr 4, 2020
@AndyAyersMS
Copy link
Member

@sandreenko is there work here beyond fixing #1231?

@sandreenko
Copy link
Contributor Author

It is not part #1231, but after #1231 we will have more LCL_VAR and less LCL_FLD so the number of cases where we see this issue will be smaller.

But the real fix for that was in #33665, but I got a push back and was requested to do work that I do not have time to do currently.

@AndyAyersMS
Copy link
Member

Ok, let's mark this as future.

@AndyAyersMS AndyAyersMS modified the milestones: 5.0, Future May 1, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@sandreenko sandreenko removed the JitUntriaged CLR JIT issues needing additional triage label Jan 4, 2021
@sandreenko
Copy link
Contributor Author

Fixed by #55604

@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants