MigrationsRunner locks up on dotnet CLI first run #12365
Comments
I would have tried to debug this further myself, but I cannot get a debug build by compiling one: the dotnet-install script downloads an earlier version of the SDK, which, because of this issue, prevents dotnet from doing anything.
@ZeroSkill1 - the line of code at https://github.com/dotnet/sdk/blob/dfe7bcb5ffe75746f35553c9919c317384c7c97b/src/Cli/Microsoft.DotNet.Configurer/DotnetFirstTimeUseConfigurer.cs#L62 is not present in the 6.0.10 .NET SDK. However, in that version of the .NET SDK, NuGet introduced migration logic that runs only once, when a NuGet command is invoked. For example, I created an Ubuntu 22.04 WSL instance locally and was unable to reproduce the issue.
In dotnet/sdk#29662 (comment) you have mentioned that Related to this, we fixed a logic error in the migrations code that is causing abandoned mutex exceptions in NuGet/NuGet.Client#4895. This fix has not been released yet but is part of the next servicing release. Can you please paste the call stacks here where you have identified that NuGet is waiting for a mutex? That would be interesting, because NuGet waits a maximum of 1 minute to obtain the mutex. https://github.com/NuGet/NuGet.Client/blob/0966444b35d8a74fa25adf0368b983f389884377/src/NuGet.Core/NuGet.Common/Migrations/MigrationRunner.cs#L35 Can you use
Rerunning the Call stack of
Looking at the call stack posted in the above comment, maybe the while loop never breaks (I am guessing at this point). https://github.com/NuGet/NuGet.Client/blob/1e764301523e488a5ca79bdedf79df66f0a3ffbd/src/NuGet.Core/NuGet.Common/Migrations/Migration1.cs#L90-L94

    while (parent != homePath)
    {
        pathsToCheck.Add(parent);
        parent = Path.GetDirectoryName(parent);
    }

Please follow the steps and let us know your feedback so we can troubleshoot this issue.
Are you able to reproduce this issue (meaning the indefinite wait) after step no. 3?
@ZeroSkill1 - adding to the above comment, I created a new VM, executed the steps you mentioned in the issue description, and noticed that restore succeeded. Would it be possible to run this test on another Ubuntu 22.04 machine? I just want to eliminate the possibility of any environment variables causing these issues.
@kartheekp-ms After creating
Note: in Additionally, the restore does succeed on a fresh VM.
After running a diff on the environment variables of the root user and the normal user, I found the following. Simply append a forward slash to

    export HOME=/home/{username}/                # this will trigger the bug
    rm -r "$HOME/.local/share/NuGet/Migrations"  # make sure it runs migrations
    dotnet nuget --version                       # this will hang

I did not notice this before, but in my VM this slash does not exist in the
It looks like it causes an infinite loop.
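The following is a minimal shell sketch of why a trailing slash can keep a parent-walking loop like the one quoted earlier in this thread from terminating, assuming the loop compares paths as exact strings. `walk_to_home` and its step limit are hypothetical names added for illustration, and `dirname` only stands in for `Path.GetDirectoryName`; the two may behave differently at the filesystem root, but `dirname` never produces the string `/home/user/`, so the comparison can never succeed.

```shell
#!/bin/sh
# Hypothetical helper: walk from path $1 up towards home $2, comparing the
# two as plain strings. A step limit ($3) is added so the demo always ends.
walk_to_home() {
    parent=$1
    home=$2
    limit=$3
    steps=0
    while [ "$parent" != "$home" ]; do
        if [ "$steps" -ge "$limit" ]; then
            echo "gave up after $steps steps at $parent"
            return 0
        fi
        parent=$(dirname "$parent")   # stand-in for Path.GetDirectoryName
        steps=$((steps + 1))
    done
    echo "reached home after $steps steps"
}

# HOME without a trailing slash: the walk meets it exactly.
walk_to_home /home/user/.local/share/NuGet /home/user 20
# prints "reached home after 3 steps"

# HOME with a trailing slash: no parent ever equals it, so only the limit stops the loop.
walk_to_home /home/user/.local/share/NuGet /home/user/ 20
# prints "gave up after 20 steps at /"
```

Without the step limit, the second call would spin forever, which is consistent with the single-core CPU spike described in the issue.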
@ZeroSkill1 - I am able to reproduce the indefinite wait by appending "/" to the HOME environment variable when logged in as a normal user. However, I don't understand how you were able to reproduce this issue while logged in as the root user; can you please explain? EDIT - I am unable to reproduce this issue on a new VM when logged in as the root user with HOME set to "/root". That said, I am able to reproduce the indefinite wait when HOME is set to "/root/" for the root user.
I meant that the issue does not affect the root user.
We are experiencing the same thing during certain CI runs within dotnet/runtime.
@kartheekp-ms, would it be possible to use a global mutex instead?
@steveisok - The root cause of this issue is the presence of a trailing forward slash in the $HOME environment variable. AFAIK this issue is different from dotnet/runtime#80619. I added a comment, dotnet/runtime#80619 (comment), explaining why we chose a local mutex over a global mutex.
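Given that root cause, one possible workaround is to normalize HOME before invoking dotnet. The sketch below assumes the trailing slash is the only trigger; `strip_trailing_slashes` is a hypothetical helper, not part of dotnet or NuGet.

```shell
#!/bin/sh
# Hypothetical helper: print path $1 with trailing slashes removed,
# leaving a bare "/" untouched so the root directory stays valid.
strip_trailing_slashes() {
    path=$1
    while [ "$path" != "/" ] && [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s\n' "$path"
}

# Normalize HOME only when it is set and non-empty.
if [ -n "${HOME:-}" ]; then
    HOME=$(strip_trailing_slashes "$HOME")
    export HOME
fi
```

For example, `strip_trailing_slashes /home/user/` prints `/home/user`, while `strip_trailing_slashes /` prints `/`. Running this in the shell profile or at the start of a CI step, before the first `dotnet` command, should keep the migration from seeing a HOME value that ends in a slash.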
NuGet Product Used
dotnet.exe
Product Version
dotnet 6.0.10 (SDK 6.0.402)+
Worked before?
dotnet 6.0.9 (SDK 6.0.401)
Impact
I'm unable to use this version
Repro Steps & Context
Since MigrationsRunner was introduced in NuGet.Client (used in .NET 6.0.10, SDK 6.0.402+), I have not been able to use dotnet at all.

I have opened an issue over at dotnet/sdk already, and that led me to NuGet being the cause. As seen in the MigrationsRunner source, some kind of logic is executed that uses Mutexes. When dotnet runs for the first time, it executes the .Run() method, which seems to always hang and wait indefinitely(?) for the Mutex to be released, which never happens. As a result, dotnet cannot continue with its first run logic.

To reproduce:
- dotnet newer than 6.0.9 (SDK 6.0.401)
- $HOME/.dotnet (because dotnet will not execute MigrationsRunner.Run() otherwise)
- dotnet new console -o HelloWorld or even dotnet nuget
- dotnet will hang, spiking the CPU utilization on one core.

What I've tried:
- $HOME/.nuget to see if that would be an issue (perhaps due to incorrect permissions)
- dotnet binary, which showed the infinite Mutex wait

Verbose Logs
No response