use sync.OnceValue for various regular expressions, require go1.21 #15

Draft
wants to merge 7 commits into
base: main

thaJeztah
Member

@thaJeztah thaJeztah commented Jul 15, 2024


use sync.OnceValue for various regular expressions, require go1.21

Using regexp.MustCompile consumes a significant amount of memory when
importing the package, even if those regular expressions are not used.

This changes compiling the regular expressions to use a sync.OnceValue
so that they're only compiled the first time they're used.
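
A minimal sketch of the pattern described above, assuming a hypothetical
pattern and helper names (domainPattern, domainRegexp, isValidName) that are
not taken from the reference package. The compileCount counter exists only to
make the "compiled on first use, and only once" behaviour visible:

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// domainPattern is a hypothetical pattern used only for illustration;
// it is not one of the expressions in the reference package.
const domainPattern = `^[a-z0-9]+(?:[._-][a-z0-9]+)*$`

// compileCount records how often the pattern is compiled, to show that
// sync.OnceValue runs the wrapped function at most once.
var compileCount int

// domainRegexp compiles the pattern on the first call and returns the cached
// *regexp.Regexp on every later call. Nothing is compiled at import time.
var domainRegexp = sync.OnceValue(func() *regexp.Regexp {
	compileCount++
	return regexp.MustCompile(domainPattern)
})

// isValidName reports whether name matches the hypothetical pattern.
func isValidName(name string) bool {
	return domainRegexp().MatchString(name)
}

func main() {
	fmt.Println(compileCount)           // 0: nothing compiled yet
	fmt.Println(isValidName("my-repo")) // true
	fmt.Println(isValidName("My Repo")) // false
	fmt.Println(compileCount)           // 1: compiled exactly once
}
```

The trade-off is that the first caller pays the compilation cost, and that
an invalid pattern panics on first use rather than at import.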

There are various regular expressions remaining that are still compiled
on import, but these are exported, so changing them to a sync.OnceValue
would be a breaking change; we can still decide to do so, but leaving
that for a follow-up.
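
To illustrate why this would be a breaking change, here is a rough sketch
with hypothetical names (ExportedRegexp, lazyRegexp) and a hypothetical
pattern. An exported variable is a *regexp.Regexp that callers use directly,
whereas sync.OnceValue returns a func() *regexp.Regexp, so every call site
would need an extra pair of parentheses:

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// ExportedRegexp mimics an exported, eagerly compiled variable.
// The name and pattern are hypothetical.
var ExportedRegexp = regexp.MustCompile(`^[a-z]+$`)

// lazyRegexp shows the type a lazy replacement would have:
// func() *regexp.Regexp instead of *regexp.Regexp.
var lazyRegexp = sync.OnceValue(func() *regexp.Regexp {
	return regexp.MustCompile(`^[a-z]+$`)
})

func main() {
	// Existing callers use the variable directly.
	fmt.Println(ExportedRegexp.MatchString("alpine"))

	// A lazy value must be called first, so code written against the
	// variable form would no longer compile if the type changed.
	fmt.Println(lazyRegexp().MatchString("alpine"))
}
```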

It's worth noting that sync.OnceValue requires go1.21 or up, so raising
the minimum version accordingly.

Before / After (on the docker CLI (GODEBUG=inittrace=1 ./build/docker)):

init github.com/distribution/reference @11 ms,   0.63 ms clock, 414456 bytes, 3599 allocs
init github.com/distribution/reference  @9.8 ms, 0.44 ms clock, 236680 bytes, 1398 allocs
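
For reference, the relative savings implied by the two lines above can be
worked out as follows. The reduction helper is hypothetical; the input
figures are copied from the trace output:

```go
package main

import "fmt"

// reduction returns the percentage by which after is smaller than before.
func reduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Figures taken from the before/after inittrace lines.
	fmt.Printf("bytes:  %.1f%%\n", reduction(414456, 236680)) // bytes:  42.9%
	fmt.Printf("allocs: %.1f%%\n", reduction(3599, 1398))     // allocs: 61.2%
}
```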

@thaJeztah thaJeztah self-assigned this Jul 15, 2024
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
    Error: fuzz_test.go:11:14: unused-parameter: parameter 't' seems to be unused, consider removing or renaming it as _ (revive)
        f.Fuzz(func(t *testing.T, data string) {
                    ^

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

codecov bot commented Jul 15, 2024

Codecov Report

Attention: Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.81%. Comparing base (ff14faf) to head (4ca1403).

Files Patch % Lines
reference.go 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #15      +/-   ##
==========================================
+ Coverage   83.71%   84.81%   +1.09%     
==========================================
  Files           5        5              
  Lines         393      316      -77     
==========================================
- Hits          329      268      -61     
+ Misses         54       38      -16     
  Partials       10       10              


@thaJeztah
Member Author

/cc @tonistiigi

Member

@milosgajdos milosgajdos left a comment

This is interesting. Out of curiosity, what triggered this change? Who discovered/highlighted the memory pressure issue?

@thaJeztah
Member Author

Yes; regexes being "hungry" on resources is a known issue in general, which is also why Kir made various pull requests some time ago to try to reduce their use (e.g. docker/go-units#40), but this one came through @tonistiigi, who found that docker/buildx forgot to update docker/registry to the latest version, and as a result the distribution/reference package was loaded twice (once from `docker/distribution/reference` and once from the new module).

For a long-lived process / daemon, it's probably less problematic, but in the CLI, this resulted in nearly 2MB being used just by importing those packages; here's an excerpt from a Slack thread on that:

reference package(s) allocate nearly 2MB of memory inside init()

init github.com/containerd/containerd/reference @68 ms, 0.002 ms clock, 776 bytes, 10 allocs
init github.com/distribution/reference @68 ms, 1.5 ms clock, 414456 bytes, 3599 allocs
init github.com/docker/distribution/reference @70 ms, 2.2 ms clock, 1368728 bytes, 11434 allocs

So this is an attempt at reducing such cases, and making the regexes compiled on first use.
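
As a rough check of the "nearly 2MB" figure, the sketch below parses the
three trace lines quoted above and sums their byte and allocation counts.
The initStats type and the parseInitTrace and totalInit helpers are
hypothetical, and the parser assumes the field layout shown in those lines
rather than any documented format guarantee:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// initStats holds the memory figures from one GODEBUG=inittrace=1 line.
type initStats struct {
	Bytes  int64
	Allocs int64
}

// parseInitTrace extracts the byte and allocation counts from a line such as
// "init pkg @68 ms, 1.5 ms clock, 414456 bytes, 3599 allocs".
// It assumes the number directly precedes the "bytes" / "allocs" token.
func parseInitTrace(line string) (initStats, error) {
	var s initStats
	fields := strings.Fields(line)
	for i := 1; i < len(fields); i++ {
		var dst *int64
		switch strings.TrimSuffix(fields[i], ",") {
		case "bytes":
			dst = &s.Bytes
		case "allocs":
			dst = &s.Allocs
		default:
			continue
		}
		n, err := strconv.ParseInt(fields[i-1], 10, 64)
		if err != nil {
			return initStats{}, fmt.Errorf("parsing %q: %w", fields[i-1], err)
		}
		*dst = n
	}
	return s, nil
}

// totalInit sums the figures across several trace lines.
func totalInit(lines []string) (initStats, error) {
	var total initStats
	for _, line := range lines {
		s, err := parseInitTrace(line)
		if err != nil {
			return initStats{}, err
		}
		total.Bytes += s.Bytes
		total.Allocs += s.Allocs
	}
	return total, nil
}

func main() {
	lines := []string{
		"init github.com/containerd/containerd/reference @68 ms, 0.002 ms clock, 776 bytes, 10 allocs",
		"init github.com/distribution/reference @68 ms, 1.5 ms clock, 414456 bytes, 3599 allocs",
		"init github.com/docker/distribution/reference @70 ms, 2.2 ms clock, 1368728 bytes, 11434 allocs",
	}
	total, err := totalInit(lines)
	if err != nil {
		panic(err)
	}
	// Prints: 1783960 bytes (1.78 MB), 15043 allocs
	fmt.Printf("%d bytes (%.2f MB), %d allocs\n",
		total.Bytes, float64(total.Bytes)/1e6, total.Allocs)
}
```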

@milosgajdos
Member

@thaJeztah now that #16 has been merged, wanna open this for review?
