crossgen2 failure in VMR when using VMR stage 2 build scenario #95768

Closed
mthalman opened this issue Dec 7, 2023 · 4 comments · Fixed by dotnet/sdk#37407

Comments

mthalman commented Dec 7, 2023

Source build has a required scenario that involves building the VMR once to produce the .NET SDK, and then rebuilding the VMR using the SDK that was just built. That scenario fails during the build of the runtime repo with the following error:

EXEC error message: An attempt was made to load a program with an incorrect format. [/repos/dotnet/src/runtime/artifacts/source-build/self/src/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj]

The full context of that MSBuild call is here:

Exec
    Assembly = Microsoft.Build.Tasks.Core, Version=15.1.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
    Parameters
        Command = /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.38-x64/publish/crossgen2 /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll --out /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net9.0/fedora.38-x64/S.P.C.tmp
    CommandLineArguments = /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.38-x64/publish/crossgen2 /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll --out /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net9.0/fedora.38-x64/S.P.C.tmp
    Failed to load System.Private.CoreLib.dll (error code 0x8007000B)
    Path: /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.38-x64/publish/System.Private.CoreLib.dll
    Errors
        EXEC error message: An attempt was made to load a program with an incorrect format. [/repos/dotnet/src/runtime/artifacts/source-build/self/src/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj]
        /repos/dotnet/src/runtime/artifacts/source-build/self/src/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj(67,5): error MSB3073: The command "/repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.38-x64/publish/crossgen2 /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll --out /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net9.0/fedora.38-x64/S.P.C.tmp" exited with code 137. [/repos/dotnet/src/runtime/artifacts/source-build/self/src/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj]
     (0x8007000B)
    Failed to create CoreCLR, HRESULT: 0x8007000B
    OutputProperties
        CrossgenExitCode = 137

Note that the output mentions this path, which does not exist on disk: /repos/dotnet/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.38-x64/publish/System.Private.CoreLib.dll
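
For readers unfamiliar with the error code in the log, here is a minimal sketch of how the HRESULT 0x8007000B breaks down into its standard fields. The function name `decode_hresult` is made up for illustration and is not part of any .NET tooling. The bit layout (severity bit, 11-bit facility, 16-bit code) is the standard HRESULT layout. Facility 7 is FACILITY_WIN32 and code 11 is ERROR_BAD_FORMAT, which is the "incorrect format" message shown above.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its standard fields (illustrative helper)."""
    hr &= 0xFFFFFFFF  # treat the value as an unsigned 32-bit integer
    return {
        "failure": bool(hr >> 31),        # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,   # 11-bit facility; 7 is FACILITY_WIN32
        "code": hr & 0xFFFF,              # low 16 bits: the underlying error code
    }


fields = decode_hresult(0x8007000B)
print(fields)
```

The decoded code only says that a binary of the wrong format was loaded. It does not say which file was wrong or why.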

This seems to be related to the changes in #92677. /cc @jkoritzinsky

Reaching this point of the VMR build requires the changes in dotnet/installer#17929.

Repro Steps

  1. Check out the main branch of the VMR
  2. Manually apply the changes from dotnet/installer#17929 (Add runtime patch to use net9.0 TFM) if they don't already exist.
  3. git add .
  4. mkdir /repos/bootstrap
  5. ./prep.sh
  6. ./build.sh --clean-while-building --online
  7. cp artifacts/x64/Release/dotnet-sdk-*.tar.gz /repos/bootstrap
  8. cp artifacts/x64/Release/Private.SourceBuilt.*.tar.gz /repos/bootstrap
  9. git clean -fdx
  10. git checkout -- .
  11. mkdir .dotnet
  12. tar -xzf /repos/bootstrap/dotnet-sdk-*.tar.gz -C .dotnet
  13. ./prep.sh --no-artifacts --no-bootstrap --no-sdk
  14. tar -xzf prereqs/packages/archive/Private.SourceBuilt.Prebuilts.*.tar.gz -C prereqs/packages/archive
  15. tar -xzf /repos/bootstrap/Private.SourceBuilt.Artifacts.*.tar.gz -C prereqs/packages/archive
  16. ./build.sh --with-packages prereqs/packages/archive --with-sdk .dotnet --online
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 7, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Dec 7, 2023
@mthalman mthalman added area-crossgen2-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 7, 2023
@jkoritzinsky

It looks like this might actually be something RID related. For some reason (still investigating), the crossgen2 targets incorrectly determine that the ReadyToRun TargetOS is windows instead of linux. I'll try to figure out why that's happening, but that would lead to this type of failure.

@jkoritzinsky

I believe I found the cause. There's a bug in dotnet/sdk#28380 that flips Windows and Linux. It's only being discovered now because we're bootstrapping with an SDK that has the change. I'll make the change locally and validate.

@jkoritzinsky

Validated that dotnet/sdk#37407 fixes this issue.

tmds commented Dec 11, 2023

> There's a bug in dotnet/sdk#28380 that flips Windows and Linux. It's only being discovered now because we're bootstrapping with an SDK that has the change.

FYI, I think we've used SDKs with this change for bootstrapping before. The bug being fixed is on a fallback branch that is taken when the RID is not found in the RID graph. I think we're discovering it now because crossgen is now using a RID graph that has not yet been updated for the non-portable RID.
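
To illustrate the kind of logic being discussed, here is a rough sketch of a target OS lookup that walks a RID graph and only falls back to inspecting the RID text when the RID is missing from the graph. This is not the SDK's implementation. The graph contents, the function names (`os_from_graph`, `os_from_rid_text`, `target_os`) and the fallback rule are all assumptions made up for the example. Only the RID `fedora.38-x64` comes from the log above.

```python
from typing import Dict, List, Optional

# Hypothetical, heavily trimmed RID graph: each RID maps to its parent RIDs.
# Note that the non-portable RID "fedora.38-x64" is deliberately absent.
RID_GRAPH: Dict[str, List[str]] = {
    "linux-x64": ["linux", "unix-x64"],
    "linux": ["unix"],
    "unix-x64": ["unix"],
    "unix": ["any"],
    "win-x64": ["win"],
    "win": ["any"],
    "any": [],
}

# Graph nodes that identify an OS, mapped to the OS name we report.
OS_ROOTS = {"linux": "linux", "win": "windows"}


def os_from_graph(rid: str, graph: Dict[str, List[str]]) -> Optional[str]:
    """Walk the RID's ancestors; return None if the RID is not in the graph."""
    if rid not in graph:
        return None
    pending = [rid]
    seen = set()
    while pending:
        current = pending.pop()
        if current in OS_ROOTS:
            return OS_ROOTS[current]
        if current not in seen:
            seen.add(current)
            pending.extend(graph.get(current, []))
    return None


def os_from_rid_text(rid: str) -> str:
    """Fallback branch, used only when the graph has no answer.

    Simplification for this sketch: anything that does not look like a
    Windows or macOS RID is treated as linux.
    """
    os_part = rid.split("-", 1)[0]
    if os_part.startswith("win"):
        return "windows"
    if os_part.startswith("osx"):
        return "osx"
    return "linux"


def target_os(rid: str, graph: Dict[str, List[str]] = RID_GRAPH) -> str:
    return os_from_graph(rid, graph) or os_from_rid_text(rid)


print(target_os("linux-x64"))      # resolved through the graph
print(target_os("fedora.38-x64"))  # not in the graph, so the fallback runs
```

In a layout like this, a mistake in the fallback branch stays hidden for as long as every RID in use is present in the graph, which fits the observation that the bug only surfaced once a non-portable RID was missing from the graph.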

@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Dec 11, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Jan 11, 2024