Enable repeatable package restores using a lock file #5602

Closed
anangaur opened this issue Jul 14, 2017 · 72 comments

Comments

@anangaur
Member

anangaur commented Jul 14, 2017

Discussions should happen on this issue. Please link other issues with similar asks to this one.

July 2018 - Announced feature
December 2018 - Blog

@bording

bording commented Aug 10, 2017

It appears that the current specification document is incorrect about how version ranges work with direct dependencies. It says that, given the following package reference:

<PackageReference Include="My.Sample.Lib" Version="[4.0.0, 5.0.0]"/>

NuGet would pick the highest version, 5.0.0. This is not what I observe. Instead, that version range resolves to the lowest version, 4.0.0.

Floating version numbers appear to be the only case where any sort of highest version logic is applied.

I also don't see anything indicating that highest would be expected in any of the documentation I've found:

https://docs.microsoft.com/en-us/nuget/consume-packages/dependency-resolution
https://docs.microsoft.com/en-us/nuget/consume-packages/package-references-in-project-files

@anangaur
Member Author

Thanks @bording for reporting the discrepancy. This was an oversight on my end. I have corrected it in the spec.

  1. If a range is specified, NuGet resolves to the lowest available version that satisfies the range.
    E.g.

    Feed has only these versions for My.Sample.Lib: 4.0.0, 4.6.0, 5.0.0

    a. Range is specified:

    <PackageReference Include="My.Sample.Lib" Version="[4.0.0, 5.0.0]"/>

    NuGet resolves to 4.0.0 here.

    b. Range is specified (continued):

    <PackageReference Include="My.Sample.Lib" Version="[4.1.0, 5.0.0]"/>

    NuGet resolves to 4.6.0 here, because 4.1.0 is not on the feed and 4.6.0 is the lowest available version within the range.
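
The lowest-applicable rule described in this comment can be sketched in a few lines of Python. This is only an illustration, not NuGet's resolver: the helper names `parse` and `resolve_lowest` are made up here, and the sketch assumes inclusive `[min, max]` ranges and plain three-part numeric versions (no prerelease labels, no floating versions).

```python
from typing import List, Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '4.6.0' into (4, 6, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve_lowest(feed: List[str], minimum: str, maximum: str) -> Optional[str]:
    """Return the lowest feed version inside the inclusive range, or None."""
    low, high = parse(minimum), parse(maximum)
    candidates = [v for v in feed if low <= parse(v) <= high]
    return min(candidates, key=parse) if candidates else None


# The feed and ranges from the comment above.
feed = ["4.0.0", "4.6.0", "5.0.0"]
print(resolve_lowest(feed, "4.0.0", "5.0.0"))  # 4.0.0
print(resolve_lowest(feed, "4.1.0", "5.0.0"))  # 4.6.0
```

Case (b) is the interesting one: the range minimum does not exist on the feed, so the next higher available version is chosen rather than the range maximum.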

@isaacabraham

What happens if the NuGet.config file is on one machine but not on another, or is changed after the lock file is created? Does the lock file still take effect?

@anangaur
Member Author

IMO, the sources definition should not matter for a lock file. Irrespective of which source the packages come from, in the end it should be the same package that goes into the build.

To verify that it is the same package, I am thinking about putting the hash of the package in the lock file. Open to other ideas.
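
A minimal sketch of the hash idea, in Python. It assumes a SHA-512 digest (the NU1403 error quoted later in this thread mentions sha512) and base64 encoding, which is an assumption made for illustration; `content_hash` and `matches_lock` are hypothetical helper names, and the byte strings stand in for real package files.

```python
import base64
import hashlib


def content_hash(package_bytes: bytes) -> str:
    """Base64-encoded SHA-512 digest of the package content."""
    digest = hashlib.sha512(package_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def matches_lock(package_bytes: bytes, locked_hash: str) -> bool:
    """True when the downloaded package is byte-for-byte the one that was locked."""
    return content_hash(package_bytes) == locked_hash


# Stand-in for the package content recorded when the lock file was written.
original = b"pretend this is the package content"
locked = content_hash(original)

print(matches_lock(original, locked))          # True
print(matches_lock(original + b"!", locked))   # False
```

Because the check depends only on the bytes, it does not matter which source served the package, which is the point made above. It also means that two sources serving differently packed copies of the same version would fail the check.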

@isaacabraham

@anangaur Sorry, didn't explain myself very well :-) What happens in the following situation: -

  1. You "turn on" repeatable builds in NuGet.config (mechanism still TBD).
  2. Your team starts using NuGet, the lock file is generated, etc.
  3. Someone removes the flag in the NuGet.config.

What happens to the lock file? Does it still get used or is it discarded and deleted?

@anangaur
Member Author

@isaacabraham I haven't put much thought into it, but I am leaning towards generating the file by default rather than as an option.
Let me spend some time on it and then I will come back with a proposal.

@mletterle

Why not just have a command line option to create or consume a project-wide project.assets.json file at restore time?

Something like:

nuget restore path\to\something.sln -AssetsOut something.assets.json

and later (say on a build server):

nuget restore path\to\something.sln -AssetsIn something.assets.json

It would be up to the consumer to regenerate/edit the assets file and check it into source control and to configure their builds to use it.

This would seem to allow the requested functionality with the minimum amount of changes....

@anangaur
Member Author

The assets file is not just the restore graph; it serves several overloaded purposes and has a lot of content that is irrelevant to a typical lock file. For example, it contains all the warnings and errors applicable to the project restore. In addition, the assets file is not easy to understand or modify (IMO even the lock file should not be edited manually). I have seen different assets files generated for the same dependencies (the lines get reordered), so merge conflicts would be difficult to manage. The assets file in its current form also cannot handle floating versions.
However, the current idea is very similar: generate a lock file by factoring out the relevant contents of the assets file that gets generated in the obj directory. I would rather solve this issue with a lock file mechanism than reuse the assets file and then have to support backward compatibility in the long run :)

@mletterle

True. Just having a packages.config as input with specific, locked versions would be enough, and the ability to generate that file on restore would go a long way. packages.config is documented and presumably will be supported for the foreseeable future.

@ap0llo

ap0llo commented Feb 17, 2018

I would really like to have this feature in the NuGet client. Are there any updates (or perhaps even a roadmap) you can share?

@NasAmin

NasAmin commented Mar 4, 2018

This would be a great feature for nuget to support CI/CD workflows. Is there any ETA on when this feature will be available?

@anangaur
Member Author

anangaur commented Mar 20, 2018

@forki @isaacabraham Wanted to bring the conversation here :) The current proposal is to have the lock file at project level, with the intention of locking the transitive dependencies per project. However, I have heard about the problems this could cause at runtime, and I am contemplating moving it to solution level.

Currently NuGet restore does a project-level restore. Maybe that has to change if we want to bring the lock file to solution level? @jainaashish

@isaacabraham

@anangaur hi :-)

OK. I'll try to (briefly) outline some of the reasons that come to mind why I think locking dependencies at either project or solution level will be a mistake. I'm sure @forki will have his own thoughts as well.

  1. If you lock at project level, at runtime, you'll be in the same precarious position you are now vis-a-vis dependency management i.e. you won't be able to guarantee that Project A and Project B both depend on the same version of the same dependency. At runtime, things might work, they might not.

  2. If you lock at solution level, you'll still probably run into problems. I know many people that have large code bases with multiple solutions that share projects in the same repository. 99/100 times, they'll want those solutions to have the same exact dependencies for consistency and expected behaviour. They won't want to open one solution and get one set of versions, then open another solution and get another set for the same project.

Alternatively, consider the situation where you e.g. have two solutions, one for your "main" codebase and another for integration and unit tests (which I have seen before). Do you really want to maintain a separate dependency chain for both of them? What happens when they get out of sync?

  3. Remember that some developers work outside of projects - particularly in the F# community, where exploratory scripts are a key part of their development process. Coupling dependencies to projects or solutions effectively kills working with scripts (or at least makes it much more challenging), in which you don't want (or need) a solution or project file at all. Instead, you just want to specify a set of dependencies, bring them down onto the file system, and then reference them from a script.

In summary

Ask yourself this - how often in a repository do you explicitly want different versions of the same dependency? I suppose I would say this (because of my involvement with Paket), but decoupling yourself from projects and solutions will free you from all of these issues. Instead, consider pinning dependencies at the folder level (typically repository level). You then get consistency across all of your projects and solutions and scripts, because there's only one dependency graph across the whole repo.

For those cases where you need to go "outside the box" and must have different dependency graphs within a repo, consider something like Paket's Groups feature, which allows you to explicitly have distinct dependency chains (with their own section in the lock file). However, this is the exception to the rule - it doesn't happen very often.
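
To make the consistency concern concrete, here is a small Python sketch that checks whether several projects in one repository pin the same package to different versions. It is not how NuGet or Paket works internally; the project names, `Other.Lib`, and the `find_version_conflicts` helper are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_version_conflicts(projects: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each package that appears with more than one version to those versions."""
    versions = defaultdict(set)
    for dependencies in projects.values():
        for package, version in dependencies.items():
            versions[package].add(version)
    return {pkg: sorted(found) for pkg, found in versions.items() if len(found) > 1}


# Two projects locked independently (per-project lock files).
projects = {
    "ProjectA": {"My.Sample.Lib": "4.0.0", "Other.Lib": "1.2.0"},
    "ProjectB": {"My.Sample.Lib": "4.6.0", "Other.Lib": "1.2.0"},
}

print(find_version_conflicts(projects))
# {'My.Sample.Lib': ['4.0.0', '4.6.0']}
```

With a single dependency graph for the whole repository this check would be empty by construction, which is the argument being made for locking at folder or repository level.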

Just my two cents - you may well feel differently, and it's entirely your choice how you proceed. Good luck! :-)

@ap0llo

ap0llo commented Mar 20, 2018

I agree with @isaacabraham. I know of code bases for a single application split into multiple solutions with some of these solutions even overlapping, so locking dependencies neither at project level nor at solution level works in such scenarios.
Having a lock file at repository level (or an arbitrary file system subtree) sounds sensible to me.

@anangaur
Member Author

@isaacabraham Thanks for the detailed reply. I am fully in agreement with what you mentioned above. This is something required as part of the following:

https://github.com/NuGet/Home/wiki/Manage-allowed-packages-for-a-solution-%28or-globally%29

This issue is primarily about locking down the transitive dependencies (at install time) so that restores are repeatable across space and time. I do see that these two work items are related, but I am trying to separate them and attack one problem at a time.

Right now the proposal is to have NuGet generate the lock file at project level at install time, as that is how NuGet restore works, but I have been hearing a lot of voices asking for the lock file to be generated at solution level. From Twitter and from your Paket experience, I understand that's your recommendation too.

@isaacabraham

isaacabraham commented Mar 20, 2018

@anangaur Hi. Sorry - but no, that's not my recommendation really. My comment just above states:

decoupling yourself from projects and solutions will free you from all of these issues. Instead, consider pinning dependencies at the folder level (typically repository level). You then get consistency across all of your projects and solutions and scripts, because there's only one dependency graph across the whole repo.

So I hope I'm clear here - working at the level of either projects or solutions will not, in my opinion, provide a satisfactory solution that is simple to understand, consistent, and repeatable.

Again, though, I'll repeat: That's my opinion. It's entirely up to you how you proceed.

@anangaur
Member Author

Sure. Thanks for the explanation. Will keep your recommendation in mind while I iterate over it. I will update the thread as we make progress. Appreciate your input here.

@forki

forki commented Mar 21, 2018

my recommendation:

  • Start with cleaning up terminology. Define "restore", "install", "update" for yourself and try to use the terms consistently. Define what you think a "lock file" is.
  • Don't separate features that clearly belong together like locking and the question on what level to lock.
  • Look at other package managers like bundler, cargo, ... - not just npm or yarn.
  • Start to look at Paket. We applied the ideas from bundler, npm, yarn (and to some extent cargo) to the specifics of the .NET framework (with target framework monikers, assembly binding redirects, dotnet core, ...)
  • Try to understand what Paket groups are. What problems are they solving? Why did the Paket people need to introduce them? (Paket 2.0 was released Sep 2015, "groups" are old!)

@vivainio

I want to pile on "please don't introduce any functionality at solution level". Single solution typically represents a very small part of a bigger product that wants to harmonize the dll versions. Solutions should remain a superficial "ide convenience" feature instead of getting critical new responsibilities.

@anangaur
Member Author

@springy76 My apologies. My earlier suggestion was misplaced. I guess, for now, you can try the deletion or what @jainaashish suggested. We will investigate the issue and get back.

@springy76, @Flavien Can you give us more details, such as SDK versions and repro steps?

@ericnewton76

ericnewton76 commented Dec 28, 2018

EDIT: Damn, I wish I'd seen this issue sooner. The first I saw of this was the blog post on nuget.org. It's great you guys are so much more transparent about things, but I have a real bone to pick with y'all on the NuGet team.

Omg. What's the point of checking in packages.lock.json if you're not going to use it by default? We need packages to be locked down at the time the developer picks a package via install or updates it via update.

Do you guys ever check out npm? They did this several years ago via npm shrinkwrap and enhanced it over a year ago to always write out a lock file to fix the same scenario... repeatable builds... even repeatable on developers' machines. They use package-lock.json when doing npm install, and they even created an npm ci command that bypasses all package dependency resolution in favor of just restoring the exact package dependency tree.

It feels like the NuGet team is so opinionated about things that keep being proven wrong again and again. I've had my clashes with various members in the past about issues like this, and I'm continually overruled and then vindicated several months, if not a year or so, later.

I realize Nuget has some significant challenges especially with the multitude of platforms and whatnot. But this particular feature is a no-brainer guys...

Should be OPT-OUT if not desired and I can't imagine any scenario where that is legitimately the case.

Now it's baked in as OPT-IN, so everybody keeps going forward slamming their heads into their desks while the build server starts installing unwanted new versions of DLLs that are different from the ones on the developers' machines...

Here's how it's gonna go:
Developer: "Works on my machine!"
Manager: "It sure does... damn... what the heck???"
Developer 2: "Did you set that option in NuGet to only restore exactly what's in the packages.lock.json file, so that new versions of some of the transitive dependencies don't get installed?"
Developer: "...Wait... what? I thought having the lock file would do that... I already had to set something in the CSPROJ file to turn this on."

@ericnewton76

ericnewton76 commented Dec 28, 2018

And to be clear:

Package lock files should only be generated at INSTALL or UPDATE time. When a restore happens and a package lock doesn't exist (which solves the issue of old libraries), then CREATE IT. If this happens on the build server, there's nothing you can do about it. If this happens on a DEV machine, it shows up as a new file. (Except on TFS where you HAVE to remember to promote the thing... another id10tic feature) You should have to OPT-OUT of this behavior.

Secondly, I recommend that restore, like npm install, only installs exactly what's in the package lock file. The npm ci command was created to make this process even faster by not evaluating any dependencies: it simply deletes node_modules and restores EXACTLY what's in package-lock.json. For NuGet, the restore command should ALWAYS respect the lock file. If it runs on the build system and can't restore EXACTLY what's in packages.lock.json, because the exact version of a transitive dependency went unlisted or similar (hash fail maybe?), then it should fail the build. Developers can then go back to their own machine and evaluate the situation by running nuget restore themselves, which matches the packages up precisely against packages.lock.json. What I'm saying is... the same thing should happen on my developer machine as on the build server; otherwise it's nondeterministic even at check-in. A transitive dependency can update its version literally one second after check-in, and bam, the build server is pulling a newer version. And you've made this OPT-IN, so that developer may or may not end up replicating the build server's package tree. So again, what's the point?

Maybe there's more to it, but I can't imagine it... that's 99% of the intended developer workflow for a package manager in relation to the build/release phase, and it SHOULD use LOCKED package modes. You made this opt-in instead of opt-out, so the build output of a lot of projects will continue to be nondeterministic as time moves on, which is just silly.

TL;DR: RESTORE should ALWAYS respect packages.lock.json; that's why it exists. INSTALL and UPDATE should always WRITE and UPDATE the dependency versions within packages.lock.json if needed, and any changes get exposed in the VCS check-in. If you didn't intend for that to happen... DON'T CHECK IN THE packages.lock.json file! Very simple. You guys have made it mind-numbingly irritating by now having to double check a csproj setting upon install/update/restore to even write the file, AND you make it so we now have to go into every build definition and double check that this feature is turned on.

@forki

forki commented Dec 28, 2018

@ericnewton76 check out https://fsprojects.github.io/Paket/ - .NET has had this for years.

@anangaur
Member Author

@ericnewton76 Thanks for your detailed feedback. Let me try to answer your queries (as I understand them) one by one and we can delve into specifics, as required.

Why opt-in and not opt-out? Why isn't the presence of the lock file enough?

We did not want to suddenly start producing a new file with an incremental release, i.e. as part of 4.x. In addition, the lock file feature is not complete until we solve the problem of centrally managing package versions and thereby having a central lock file per repo/solution. Once that is done, we will evaluate making the lock file the default.

Btw, the presence of the lock file is enough to opt in to this feature. The property RestorePackagesWithLockFile is only needed to bootstrap into this feature. See the note section here for more details.

Install and Update should write to lock file and Restore should always restore from the lock file

Today there are various ways to install and update packages - not just through various tools but also by writing the PackageReference node directly into the csproj file:

  • Writing a PackageReference node directly into the csproj file is easy and has become very common. IntelliSense in VS and VS Code also assists you today in editing the csproj file directly to include a PackageReference node. Once it is written into the csproj, the package is installed using restore. This might sound confusing, but NuGet in the PackageReference world does not really have an install. Everything is a restore.
  • The various tools, while installing a package, write the PackageReference node into the csproj file (there are some evaluations before writing into the project file, though). Restore then actually brings the package into the project's context (the so-called install).

To summarize, there were 2 options:
Option 1: Continue with the current NuGet model of installs and restores and introduce this feature in the least breaking way possible.
Option 2: Rework the notions of installs vs. restores and break the current way of adding a package by writing a PackageReference node directly into the project file, as the restore immediately after such writes/edits of PackageReferences would fail.

We chose Option 1. One can get the Option 2 behavior by setting the locked mode to true. Once set, restore will fail if it cannot get the exact packages as mentioned in the lock file. I feel this is what you want but as default?
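
The difference between the two behaviors can be sketched as follows. This is a simplified Python model of what the comment above describes, not NuGet's implementation: `restore`, `LockedModeError`, and the dictionaries of package versions are hypothetical, and `resolved` stands for whatever the project file currently resolves to.

```python
from typing import Dict


class LockedModeError(Exception):
    """Raised when locked mode is on and the lock file cannot be honored."""


def restore(resolved: Dict[str, str], lock_file: Dict[str, str], locked_mode: bool) -> Dict[str, str]:
    """Return the lock file contents after restore, or fail in locked mode."""
    if resolved == lock_file:
        return dict(lock_file)  # nothing changed, the lock file is honored as-is
    if locked_mode:
        changed = sorted(
            pkg for pkg in set(resolved) | set(lock_file)
            if resolved.get(pkg) != lock_file.get(pkg)
        )
        raise LockedModeError("lock file out of date for: " + ", ".join(changed))
    return dict(resolved)  # Option 1 default: restore succeeds and the lock file is rewritten


lock = {"My.Sample.Lib": "4.0.0"}
# Someone hand-edited the project file and added a PackageReference.
edited = {"My.Sample.Lib": "4.0.0", "Other.Lib": "1.2.0"}

print(restore(edited, lock, locked_mode=False))
try:
    restore(edited, lock, locked_mode=True)
except LockedModeError as error:
    print("restore failed:", error)
```

In this model the default keeps hand-edited project files working, while locked mode turns the same situation into a build failure, which is the behavior requested for CI.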

nuget ci similar to npm ci

This seems like a good idea. We can definitely evaluate this option if you and others feel this would be useful and if this helps in improved performance of restore (theoretically this has promise but we would need to run a few experiments to understand the quantum of gain).

Do let me know if this answers your concerns. Happy to discuss more and/or delve deeper into any of the topics elaborated above.

@ericnewton76

ericnewton76 commented Dec 29, 2018

Thanks for the reply. Sorry if my tone seemed too adversarial. I can be passionate about this stuff and it's hard to guard my words sometimes.

We did not want to suddenly start producing a new file with an incremental release i.e. as part of 4.x. In addition, the lock file feature is not complete unless we solve the problem of centrally managing package versions

Granted, npm formally brought on package-lock.json by moving to npm v5... read their release notes and the releases after v5.0.0. It should be a very interesting read for another package manager in a slightly different ecosystem.

However, if you introduce this lock file in v4.8, then what's the difference really? It seems like MS in general is afraid to kick the major version up due to marketing concerns or something else instead of technology needs. I won't say that's what the NuGet team does, but it seems to be the norm. Nothing wrong with NuGet v10... LOL. Anybody that complains just doesn't understand how software is built.

Directly writing PackageReference node into the csproj file is so easy and has become very common

That's fine... that'll happen on the developer's machine and the RESTORE command should complain like a stuck pig. At least squeal with a warning message that says a new PackageReference was found and should be installed properly so that a proper dependency graph analysis is performed. And then go ahead and basically do an install. When nuget restore --ci is running and the same occurs, then fail the build. Call a time out. And say "this is for your own good," and point them right at these messages as to why failing the build is necessary when a package lock file exists and there are packages being installed in a build server scenario that shouldn't be there. Again, complainers about this don't understand the problem until they smack into it when a transitive dependency floats on them and breaks the runtime behavior somehow.

Problem is:
Option 1 is still non-deterministic builds.
Option 2 is better but still requires that specific opt-in that should be a default, and thus non-deterministic builds

The goal is a deterministic build. Neither of your current options solves that... they just add more configuration switches to underlying mechanisms that don't help you achieve a stable build. And when the feature releases formally, you have this extra cruft that has to be supported ad infinitum to preserve that exact behavior when it might not be true later.

Probably a difference of opinion here... you guys are trying to go for the least disruptive change for something that will 100% make people's lives better, but by not jumping in feet first, you're equivocating on a feature that is a must-have. In addition, did you try this out in-house first? Did you guys scratch your heads saying "omg! this would be bad for it to precisely match up my dependencies! and when it doesn't match, it's notifying me that my hand-edited package reference, accidentally checked in due to TFS lunacy, means something is awry!" I have a feeling it was the opposite. Please note I'm trying to be humorous here, to keep this deep subject in the realm of amusement.

You have to honestly ask yourself: would a developer who's tested precise versions of packages for days, possibly weeks, on his own machine be okay with an algorithm making decisions for him, based on inaccurate package versions from library developers, that will probably introduce inadvertent breaking changes into his runtime when he presses that build button to release to production on a Thursday night at 10pm?

nuget restore --ci or nuget ci
Just check out npm ci and some of the release notes about it. Makes perfect sense, and has legitimately locked down that crazy world of javascript package management! Quite amazing if you ask me... considering how fast things move over there too.

@rrelyea rrelyea modified the milestones: 5.0, 4.9.0 Jan 14, 2019
@rrelyea
Contributor

rrelyea commented Jan 14, 2019

This didn't get documented in the release notes for 4.9.0 because it was mistakenly in 5.0 milestone.
(working to fix)

For now, setting to 4.9.3 - will fix release notes in 4.9.0 when we ship release notes for 4.9.3 and then reset this back to 4.9.0

Will be closing this issue. Please spin off any follow up discussions in other issues.

@rrelyea rrelyea modified the milestones: 4.9.0, 4.9.3 Jan 14, 2019
@rrelyea rrelyea closed this as completed Jan 14, 2019
@rrelyea rrelyea changed the title from "[PackageReference] Enable repeatable package restores for PackageReference based projects" to "Enable repeatable package restores using a lock file" Jan 14, 2019
@Flavien

Flavien commented Jan 29, 2019

Wasn't the problem referenced above the same problem as this?

@fowl2

fowl2 commented Mar 13, 2019

are there plans for a nugetrestore --locked-mode? Without this it seems like it's impossible to use the lock file on Azure Pipelines for framework/non-dotnetcore projects.

@anangaur
Member Author

@fowl2 Can you try msbuild instead:

msbuild.exe /t:restore /p:RestoreLockedMode=true

@Flavien

Flavien commented Apr 5, 2019

I'm still struggling with this error:

error NU1403: The package System.Collections.Specialized.4.3.0 sha512 validation failed. The package is different than the last restore.

(happens with any package randomly)

I'm using NuGet 5.0.0.6 and have cleared my caches and fallback folders numerous times. I am still getting this error on my CI build no matter what I try.

Any ideas what's wrong?

@anangaur
Member Author

anangaur commented Apr 5, 2019

Can you specify the exact steps, or maybe provide a repro?

@anangaur
Member Author

anangaur commented Apr 5, 2019

It would also be good to know the sources you are using. One of the reasons could be that the sources have different packages (with different SHA) and depending on which source was used to restore, you might see failures.

@Flavien

Flavien commented Apr 6, 2019

Here is a repro: https://github.com/Flavien/nuget-lockfile-repro.

I have generated the lock file by building the project on Visual Studio (Windows) 16.0.0.0 Preview 5.0. NuGet version is 5.0.0.

When I clone this on my Ubuntu WSL and run dotnet restore --locked-mode, I get this:

flavien@LAPTOP-FLAVIEN:~/NuGet/NuGetLockFile$ dotnet restore --locked-mode
  Restore completed in 163.45 ms for /home/flavien/NuGet/NuGetLockFile/NuGetLockFile.csproj.
/home/flavien/NuGet/NuGetLockFile/NuGetLockFile.csproj : error NU1403: The package Microsoft.AspNetCore.2.2.0 sha512 validation failed. The package is different than the last restore.
  Restore failed in 5.25 sec for /home/flavien/NuGet/NuGetLockFile/NuGetLockFile.csproj.

Here is dotnet --info on my WSL install:

.NET Core SDK (reflecting any global.json):
 Version:   2.2.202
 Commit:    8a7ff6789d

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  18.04
 OS Platform: Linux
 RID:         ubuntu.18.04-x64
 Base Path:   /usr/share/dotnet/sdk/2.2.202/

Host (useful for support):
  Version: 2.2.3
  Commit:  6b8ad509b6

.NET Core SDKs installed:
  2.2.202 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.2.3 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.2.3 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.2.3 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

The only source I have on Visual Studio is: nuget.org (https://api.nuget.org/v3/index.json).
Sources on WSL:

Registered Sources:

  1.  https://www.nuget.org/api/v2/ [Enabled]
      https://www.nuget.org/api/v2/

@StingyJack
Contributor

We did not want to suddenly start producing a new file with an incremental release i.e. as part of 4.x

@anangaur - Yet it was OK to start breaking/normalizing version numbers in a minor release?

@MarkKoz

MarkKoz commented Nov 29, 2019

There seem to be discrepancies across platforms or systems. I locked on Windows 10 and can restore fine on that same system. However, restoring on Arch Linux fails for System.Collections.NonGeneric. If I restore and let it update the lock file, then I can see it produces a different hash for the aforementioned dependency. Furthermore, I use an Ubuntu agent for my CI pipeline and there System.Collections.NonGeneric fails too, but so does Microsoft.NETCore.Platforms (which is fine on Arch).

Edit: solved by following #7921 (comment)

@ghost

ghost commented Aug 2, 2021

@MarkKoz Just a heads up, the package Dotnet.ReproducibleBuilds.Isolated aims to configure these msbuild settings for you to avoid non-reproducibility issues. Among other things, it turns off that hidden nuget cache you ran into.

@MarkKoz

MarkKoz commented Aug 2, 2021

Thanks @aaronla-ms. I'm glad MS is investing in solving this issue. I'll look into adding this to my projects.
