Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache tolerance for old archives #5564

Merged
merged 10 commits into from
Jul 9, 2023
Merged

Cache tolerance for old archives #5564

merged 10 commits into from
Jul 9, 2023

Conversation

arcanis
Copy link
Member

@arcanis arcanis commented Jul 3, 2023

What's the problem this PR addresses?

Each time we make a change to the libzip or zlib implementation, all generated archives typically see the compression output changed, with no notable changes on the actual content of the files. When the local cache is used this is reflected by git marking all cache files as changed, and requiring to push them anew. This is a very heavy process, especially since the compression prevents git from efficiently diffing the files. The situation is a little better with compression=0 (the new default), but not all projects can use it right away.

How did you fix it?

This diff makes Yarn tolerant to archives generated by older versions, as long as they have been generated by at least the cache version N (called the "checkpoint version"). The checksum is still validated.

This logic isn't enabled by default, by fear of adverse interactions with particular setups (in particular --check-cache, which is currently recommended for open-source projects using zero-installs; probably not many, but better to be careful so close from the next major release).

Checklist

  • I have set the packages that need to be released for my changes to be effective.
  • I will check that all automated PR checks pass before the PR gets reviewed.

Copy link
Member

@merceyz merceyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is only enabled for projects using the local cache

This will break use cases where, for example, you use the global cache on the CI and the local cache on your machine.
Locally installs will pass while on the CI it will throw.

@arcanis
Copy link
Member Author

arcanis commented Jul 4, 2023

I thought about this, and the only solution I could come up with was to introduce a new setting, cacheMigrationMode, which selects whether Yarn accepts old files or not. In the case you specify, you'd either set the CI value to required-only or your local value to always, and it should work.

This setting also allows us to let Yarn reuse files from different compression levels, so it has value in multiple use cases.

@arcanis arcanis requested a review from merceyz July 4, 2023 12:54
Copy link
Member

@merceyz merceyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks yarn --check-cache as the new Yarn version with a different cache key can't reproduce the cache entry of the previous Yarn version.

@arcanis
Copy link
Member Author

arcanis commented Jul 9, 2023

After some discussion I agree enabling this by default is too risky so close from the major release, due to potential issues with setups people may already have (in particular, the story with --check-cache isn't clear).

I turned back the default to always in every situation, but projects can still opt-in to required-only depending on their needs.

@arcanis arcanis merged commit 506ded5 into master Jul 9, 2023
15 checks passed
@arcanis arcanis deleted the mael/cache-version-gentle branch July 9, 2023 20:27
@arcanis arcanis mentioned this pull request Sep 12, 2023
3 tasks
arcanis added a commit that referenced this pull request Sep 13, 2023
**What's the problem this PR addresses?**

- CRA broke when [we started required that `--cwd` be put before any
other
argument](#5600 (comment)).
CRA is mostly deprecated / unmaintained by now, but it has a decent
amount of downloads (more than `create-next-app`), so it doesn't cost us
much to special-case a fix for the v4 that we can then drop in v5.

- When a machine had a hot cache for package X in both cache versions A
and B (each variant having its own checksum), when migrating a project
cache version from A to B, Yarn would mistakenly try to validate the
variant B using the checksum from variant A.

- We already have a mechanism to tolerate checksum changes when going
from one cache key to the next, but before
#5564 we used to retrieve the cache
key from the file name, whereas we now retrieve it from the lockfile
instead. A branch of code that relied on the assumption that the cache
key could be checked later became invalid, hence the problem.

**How did you fix it?**

- The `--cwd` flag is now allowed (as a special case) at the end of
`yarn add`.

- I refactored the "is this locator compatible with the current cache
key" function outside of `getLocatorPath`.

**Checklist**
<!--- Don't worry if you miss something, chores are automatically
tested. -->
<!--- This checklist exists to help you remember doing the chores when
you submit a PR. -->
<!--- Put an `x` in all the boxes that apply. -->
- [x] I have read the [Contributing
Guide](https://yarnpkg.com/advanced/contributing).

<!-- See
https://yarnpkg.com/advanced/contributing#preparing-your-pr-to-be-released
for more details. -->
<!-- Check with `yarn version check` and fix with `yarn version check
-i` -->
- [x] I have set the packages that need to be released for my changes to
be effective.

<!-- The "Testing chores" workflow validates that your PR follows our
guidelines. -->
<!-- If it doesn't pass, click on it to see details as to what your PR
might be missing. -->
- [x] I will check that all automated PR checks pass before the PR gets
reviewed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants