(Big performance change) Do not run lints that cannot emit #125116

Open
wants to merge 14 commits into
base: master

Conversation

blyxyas
Member

@blyxyas blyxyas commented May 14, 2024

Before this PR, adding a lint was costly because every lint carried some overhead: all lints would run regardless of their default level, even if the user had #![allow]ed them. This PR changes that. The change benefits both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!)

So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:

  • Manually #![allow]ed in the whole crate,
  • Allowed in the command line, or
  • Not manually enabled with #[warn] or similar, and its default level is Allow

As some lints need to run, this PR also adds loadbearing lints. On a lint declaration, you can use the [loadbearing: true] marker to label it as loadbearing. A loadbearing lint will never be filtered.
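
A minimal sketch of the filtering rule described above, assuming a toy lint registry. `Level`, `LintSpec`, `LevelOverrides`, `must_run` and `lints_to_run` are hypothetical names for illustration, not rustc's actual types, and giving crate attributes precedence over the command line is a simplification:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Level {
    Allow,
    Warn,
    Deny,
}

/// Hypothetical stand-in for a registered lint.
struct LintSpec {
    name: &'static str,
    default_level: Level,
    /// Mirrors the `[loadbearing: true]` marker: never filtered.
    loadbearing: bool,
}

/// Hypothetical summary of every place a lint level can be set.
#[derive(Default)]
struct LevelOverrides<'a> {
    /// Levels set on the command line (e.g. `-A some_lint`).
    command_line: HashMap<&'a str, Level>,
    /// Levels set crate-wide (e.g. `#![allow(some_lint)]`).
    crate_attrs: HashMap<&'a str, Level>,
    /// Lints enabled on any item with `#[warn]`, `#[deny]` or similar.
    enabled_somewhere: HashSet<&'a str>,
}

/// A lint must run if it is loadbearing, is enabled anywhere in the
/// crate, or its crate-wide level is anything other than `Allow`.
fn must_run(lint: &LintSpec, o: &LevelOverrides<'_>) -> bool {
    if lint.loadbearing || o.enabled_somewhere.contains(lint.name) {
        return true;
    }
    let level = o
        .crate_attrs
        .get(lint.name)
        .or_else(|| o.command_line.get(lint.name))
        .copied()
        .unwrap_or(lint.default_level);
    level != Level::Allow
}

/// Filter the registry once, before any linting happens.
fn lints_to_run<'l>(lints: &'l [LintSpec], o: &LevelOverrides<'_>) -> Vec<&'l LintSpec> {
    lints.iter().filter(|l| must_run(l, o)).collect()
}

/// Example registry with made-up lint names.
fn example_lints_to_run() -> Vec<&'static str> {
    let lints = [
        LintSpec { name: "warn_by_default", default_level: Level::Warn, loadbearing: false },
        LintSpec { name: "allow_by_default", default_level: Level::Allow, loadbearing: false },
        LintSpec { name: "opt_in_lint", default_level: Level::Allow, loadbearing: false },
        LintSpec { name: "crate_allowed", default_level: Level::Deny, loadbearing: false },
        LintSpec { name: "cli_allowed", default_level: Level::Warn, loadbearing: false },
        LintSpec { name: "needed_by_others", default_level: Level::Allow, loadbearing: true },
    ];
    let mut overrides = LevelOverrides::default();
    overrides.crate_attrs.insert("crate_allowed", Level::Allow); // #![allow(crate_allowed)]
    overrides.command_line.insert("cli_allowed", Level::Allow); // -A cli_allowed
    overrides.enabled_somewhere.insert("opt_in_lint"); // #[warn(opt_in_lint)] on an item
    lints_to_run(&lints, &overrides).into_iter().map(|l| l.name).collect()
}
```

In this toy registry only `warn_by_default`, `opt_in_lint` and `needed_by_others` survive the filter; the other three can never emit, so their passes would be skipped.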

Phase 1/2: Not all lints are being filtered yet (I'm still working on it), but this branch already gives us about a 2% improvement, so why not merge it now.

Fixes #106983

@rustbot
Collaborator

rustbot commented May 14, 2024

r? @michaelwoerister

rustbot has assigned @michaelwoerister.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels May 14, 2024
@rustbot
Collaborator

rustbot commented May 14, 2024

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

@blyxyas
Member Author

blyxyas commented May 14, 2024

cc @nnethercote @Kobzol, the perf wizards. Could you please give this PR a look and tell me if there are any obvious performance issues on the filtering?

@blyxyas blyxyas marked this pull request as draft May 14, 2024 11:23
@matthiaskrgr
Member

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 14, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request May 14, 2024
…r=<try>

(Big performance change) Do not run lints that cannot emit

Before this lint, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had `#![allow]`ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!)

So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:
- Manually `#![allow]`ed in the whole crate,
- Allowed in the command line, or
- Not manually enabled with `#[warn]` or similar, and its default level is `Allow`

I'm opening this PR to get some feedback, mainly on performance. We have lots of `Lock`s, `with_lock` calls and similar functions (and also lots of cloning), so the filtering performance is not the best.

In an older iteration, instead of doing this in the parsing phase, we had a visitor that did the same job without so many locks. Would reverting to that approach help? I'm not sure, to be honest.
@bors
Contributor

bors commented May 14, 2024

⌛ Trying commit 7606f89 with merge cc1d40f...

@Kobzol
Contributor

Kobzol commented May 14, 2024

@lqd haven't you tried something like this before? 🤔

@bors
Contributor

bors commented May 14, 2024

☀️ Try build successful - checks-actions
Build commit: cc1d40f (cc1d40f134ee8336cbb7c7561deaed4aa5906e0e)

@lqd
Member

lqd commented May 14, 2024

> @lqd haven't you tried something like this before? 🤔

We've tried a few different things, yes, and so has blyxyas -- it maybe wasn't exactly like this, but I ran into annoyances such as: some slow const-eval loadbearing lint that shouldn't be ignored; lints that would be allowed unexpectedly because cargo allows lints unconditionally on dependencies (arguably the most common usage, and where perf gains would show up AFAICT), even though some of them may trigger FCWs (future-compatibility warnings) or are required to lint on dependencies despite being allowed; et cetera.

Refactoring and fixing all of these was too costly compared to the gains at the time, as rustc's lints were fast enough on dependencies, also a "rarer" use-case. That being said, we've added and uplifted more lints since then, including possibly costly ones like the non-local impls one, and the situation may also be different for clippy itself (but we won't see that in the perf.rlo results, only locally with the clippy dedicated commands IIUC)

@rust-timer
Collaborator

Finished benchmarking commit (cc1d40f): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.5%] | 25 |
| Regressions ❌ (secondary) | 0.4% | [0.1%, 1.6%] | 11 |
| Improvements ✅ (primary) | -0.4% | [-0.4%, -0.4%] | 1 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 1 |
| All ❌✅ (primary) | 0.3% | [-0.4%, 0.5%] | 26 |

Max RSS (memory usage)

Results (primary 2.3%, secondary -0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | 2.3% | [2.3%, 2.3%] | 1 |
| Regressions ❌ (secondary) | 3.6% | [2.0%, 5.3%] | 2 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -4.7% | [-5.2%, -4.1%] | 2 |
| All ❌✅ (primary) | 2.3% | [2.3%, 2.3%] | 1 |

Cycles

Results (secondary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.4% | [1.5%, 3.1%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 676.788s -> 676.098s (-0.10%)
Artifact size: 316.11 MiB -> 316.15 MiB (0.01%)
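
The percentages on the Bootstrap and artifact-size lines look like plain relative changes. A quick check under that assumption (the helper names below are made up for illustration and are not part of rustc-perf):

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Format the change with two decimals, as in the summary lines.
fn format_change(before: f64, after: f64) -> String {
    format!("{:.2}%", percent_change(before, after))
}
```

With the figures above, `format_change(676.788, 676.098)` gives `-0.10%` for bootstrap time and `format_change(316.11, 316.15)` gives `0.01%` for artifact size, matching the report.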

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels May 14, 2024
@Centri3
Member

Centri3 commented May 14, 2024

The benchmark doesn't check clippy, right? As lqd hinted at as well? And without splitting allow-by-default rustc lints it does nothing without clippy, so I think this just shows how much time it takes to filter them (can someone else confirm this :3c)

Thus, basically nothing it seems :3 (So @blyxyas maybe the cloning is ok?)

@blyxyas
Member Author

blyxyas commented May 15, 2024

> The benchmark doesn't check clippy, right?

Yeah, the benchmarks currently don't check Clippy, which is why I'm benchmarking on a different server via SSH (a server that we got specifically to benchmark Clippy). I'll post the results here when they arrive :)

Also, it currently doesn't check builtin lints because I'm having some issues checking that. That's also part of why I decided to open the PR, maybe someone has some idea (I'll see if I can read the previous attempts by lqd, maybe I can learn something from them)

EDIT: Seems like lqd hasn't pushed their attempts, I'll have to keep trying new approaches by myself.

@blyxyas
Member Author

blyxyas commented May 15, 2024

Okis, here are the results (Wall time, Clippy)

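Wall time

| | Range | Mean | Count |
|:-:|:-:|:-:|:-:|
| ❌ | [+0.45%, +190.50%] | +9.60% | 81 (36) |
| ✅ | [-73.80%, -0.44%] | -7.00% | 77 (36) |
| ❌,✅ | [-73.80%, +190.50%] | +1.51% | 158 (44) |

Max RSS

| | Range | Mean | Count |
|:-:|:-:|:-:|:-:|
| ❌ | [+0.40%, +2.80%] | +1.17% | 21 (16) |
| ✅ | [-5.15%, -0.44%] | -1.60% | 39 (24) |
| ❌,✅ | [-5.15%, +2.80%] | -0.63% | 60 (31) |

Instructions

| | Range | Mean | Count |
|:-:|:-:|:-:|:-:|
| ❌ | [+0.42%, +0.62%] | +0.52% | 6 (4) |
| ✅ | [-1.82%, -0.32%] | -0.74% | 13 (7) |
| ❌,✅ | [-1.82%, +0.62%] | -0.34% | 19 (11) |

Cycles

| | Range | Mean | Count |
|:-:|:-:|:-:|:-:|
| ❌ | [+0.42%, +25.48%] | +4.33% | 91 (38) |
| ✅ | [-14.52%, -0.40%] | -3.33% | 59 (32) |
| ❌,✅ | [-14.52%, +25.48%] | +1.31% | 150 (43) |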

@blyxyas
Member Author

blyxyas commented May 15, 2024

Those wall times are proof that this optimization has a lot of potential. The main drawback is that the filtering / parsing code is not fast enough, so in some scenarios (I haven't been able to determine exactly what they have in common) the optimization actually makes things slower.

But an improvement of ~70% in wall time (the best case above) is great, and we should look into it more.

@Kobzol
Contributor

Kobzol commented May 15, 2024

I wouldn't draw too many conclusions from these results, they seem to be quite unstable (there is also a 190% walltime regression). Note that even for PRs that don't have large perf. impacts, we can see ~30% walltime swings even on the stable benchmarking server (https://perf.rust-lang.org/compare.html?start=9e7aff794539aa040362f4424eb29207449ffce0&end=44fa5fd39a1d2af41bd7f43bc246a5e4f6d94696&stat=wall-time&nonRelevant=true).

@blyxyas
Member Author

blyxyas commented May 15, 2024

I've changed the approach: we're back to using visitors. I've benchmarked this new commit; it should show 0 regressions and an improvement of about 0.66%.

@bors try @rust-timer queue
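
A rough illustration of the visitor idea, on a toy item tree. `Attr`, `Item`, `Visitor`, `walk_item` and `EnabledLints` are hypothetical names, not rustc's HIR or its actual `Visitor` trait. A single pass collects every lint that some attribute turns on, with no locks or cloning involved:

```rust
use std::collections::HashSet;

/// Toy lint-level attribute, e.g. `#[warn(some_lint)]`.
enum Attr {
    Allow(&'static str),
    Warn(&'static str),
    Deny(&'static str),
}

/// Toy item tree: each item has attributes and nested items.
struct Item {
    attrs: Vec<Attr>,
    children: Vec<Item>,
}

trait Visitor {
    fn visit_item(&mut self, item: &Item);
}

/// Default traversal: visit every nested item.
fn walk_item<V: Visitor>(visitor: &mut V, item: &Item) {
    for child in &item.children {
        visitor.visit_item(child);
    }
}

/// Collects the names of lints enabled anywhere in the tree.
#[derive(Default)]
struct EnabledLints {
    names: HashSet<&'static str>,
}

impl Visitor for EnabledLints {
    fn visit_item(&mut self, item: &Item) {
        for attr in &item.attrs {
            match attr {
                // `warn` and `deny` both mean the lint may emit here.
                Attr::Warn(name) | Attr::Deny(name) => {
                    self.names.insert(*name);
                }
                // `allow` never forces a lint to run.
                Attr::Allow(_) => {}
            }
        }
        walk_item(self, item);
    }
}

/// Example crate with made-up lint names; returns the sorted result.
fn example_enabled() -> Vec<&'static str> {
    let krate = Item {
        attrs: vec![Attr::Allow("noisy_lint")],
        children: vec![
            Item { attrs: vec![Attr::Warn("opt_in_lint")], children: vec![] },
            Item {
                attrs: vec![],
                children: vec![Item { attrs: vec![Attr::Deny("strict_lint")], children: vec![] }],
            },
        ],
    };
    let mut visitor = EnabledLints::default();
    visitor.visit_item(&krate);
    let mut names: Vec<_> = visitor.names.into_iter().collect();
    names.sort();
    names
}
```

The resulting set is what a pre-linting filter would consult to decide which allow-by-default lints still have to run.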

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 15, 2024
@bors
Contributor

bors commented May 15, 2024

⌛ Trying commit 828cd60 with merge 68a9e31...

@bors
Contributor

bors commented Sep 3, 2024

⌛ Trying commit 2aaf2f2 with merge 50e9dfe...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 3, 2024
…r=<try>

@bors
Contributor

bors commented Sep 3, 2024

☀️ Try build successful - checks-actions
Build commit: 50e9dfe (50e9dfeeffe9b13a86fd7ecabf4846878e6846df)

@lqd
Member

lqd commented Sep 3, 2024

The previous results have been updated

image

(I'm sure the one regression was noise)

@rust-timer
Collaborator

Finished benchmarking commit (50e9dfe): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 1.9% | [1.9%, 1.9%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 3.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.0% | [2.3%, 4.4%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (secondary 3.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|:-:|:-:|:-:|:-:|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.3% | [3.3%, 3.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 750.76s -> 750.368s (-0.05%)
Artifact size: 338.30 MiB -> 338.38 MiB (0.02%)

@rustbot rustbot removed S-waiting-on-perf Status: Waiting on a perf run to be completed. perf-regression Performance regression. labels Sep 3, 2024
@blyxyas
Member Author

blyxyas commented Sep 7, 2024

I've run some benchmarks of the new commit with Clippy on the server.

Summary

| | Range | Mean | Count |
|:-:|:-:|:-:|:-:|
| Regressions | 0.30%, 6.38% | 0.84% | 23 |
| Improvements | -4.89%, -0.30% | -1.36% | 89 |
| All | -4.89%, 6.38% | -0.91% | 112 |

Primary benchmarks

| Benchmark | Profile | Scenario | % Change | Significance Factor |
|:-:|:-:|:-:|:-:|:-:|
| webrender-2022 | clippy | incr-full | -4.66% | 23.28x |
| webrender-2022 | clippy | full | -4.19% | 20.97x |
| ripgrep-13.0.0 | clippy | incr-unchanged | -3.64% | 18.21x |
| ripgrep-13.0.0 | clippy | incr-patched: println | -3.48% | 17.38x |
| regex-1.5.5 | clippy | incr-patched: Job | -2.92% | 14.59x |
| unicode-normalization-0.1.19 | clippy | incr-full | -2.89% | 14.46x |
| syn-1.0.89 | clippy | full | -2.59% | 12.94x |
| clap-3.1.6 | clippy | full | -2.54% | 12.68x |
| syn-1.0.89 | clippy | incr-full | -2.52% | 12.62x |
| unicode-normalization-0.1.19 | clippy | incr-patched: println | -2.44% | 12.18x |
| clap-3.1.6 | clippy | incr-full | -2.40% | 11.98x |
| cargo-0.60.0 | clippy | incr-full | -2.33% | 11.65x |
| unicode-normalization-0.1.19 | clippy | incr-unchanged | -2.24% | 11.18x |
| cargo-0.60.0 | clippy | full | -2.23% | 11.14x |
| unicode-normalization-0.1.19 | clippy | full | -2.03% | 10.13x |
| ripgrep-13.0.0 | clippy | full | -1.97% | 9.83x |
| serde_derive-1.0.136 | clippy | full | -1.94% | 9.69x |
| ripgrep-13.0.0 | clippy | incr-full | -1.93% | 9.63x |
| regex-1.5.5 | clippy | incr-full | -1.91% | 9.54x |
| serde_derive-1.0.136 | clippy | incr-full | -1.91% | 9.53x |
| diesel-1.4.8 | clippy | incr-patched: println | -1.81% | 9.07x |
| cranelift-codegen-0.82.1 | clippy | incr-full | -1.74% | 8.68x |
| cranelift-codegen-0.82.1 | clippy | full | -1.55% | 7.73x |
| diesel-1.4.8 | clippy | incr-unchanged | -1.50% | 7.49x |
| exa-0.10.1 | clippy | full | -1.40% | 6.99x |
| exa-0.10.1 | clippy | incr-full | -1.36% | 6.80x |
| hyper-0.14.18 | clippy | full | -1.25% | 6.24x |
| regex-1.5.5 | clippy | full | -1.23% | 6.14x |
| diesel-1.4.8 | clippy | full | -1.19% | 5.93x |
| diesel-1.4.8 | clippy | incr-full | -1.18% | 5.89x |
| image-0.24.1 | clippy | incr-full | -1.16% | 5.79x |
| html5ever-0.26.0 | clippy | full | -1.13% | 5.65x |
| hyper-0.14.18 | clippy | incr-unchanged | 1.09% | 5.43x |
| bitmaps-3.1.0 | clippy | incr-full | -1.05% | 5.27x |
| cargo-0.60.0 | clippy | incr-patched: println | -1.01% | 5.03x |
| image-0.24.1 | clippy | full | -0.98% | 4.90x |
| cranelift-codegen-0.82.1 | clippy | incr-patched: println | -0.98% | 4.90x |
| regex-1.5.5 | clippy | incr-patched: is valid cap letter | -0.92% | 4.58x |
| bitmaps-3.1.0 | clippy | full | -0.90% | 4.49x |
| exa-0.10.1 | clippy | incr-patched: printlns | -0.88% | 4.39x |
| html5ever-0.26.0 | clippy | incr-full | -0.78% | 3.89x |
| image-0.24.1 | clippy | incr-patched: println | -0.75% | 3.73x |
| stm32f4-0.14.0 | clippy | incr-patched: negate | -0.64% | 3.20x |
| webrender-2022 | clippy | incr-patched: println | -0.62% | 3.08x |
| clap-3.1.6 | clippy | incr-patched: println | -0.61% | 3.07x |
| regex-1.5.5 | clippy | incr-patched: byte frequencies | 0.59% | 2.93x |
| helloworld | clippy | incr-unchanged | 0.55% | 2.74x |
| bitmaps-3.1.0 | clippy | incr-patched: println | 0.53% | 2.64x |
| cargo-0.60.0 | clippy | incr-unchanged | -0.50% | 2.51x |
| helloworld | clippy | incr-patched: println | 0.47% | 2.34x |
| typenum-1.17.0 | clippy | incr-patched: add fn | -0.44% | 2.18x |
| helloworld | clippy | incr-full | 0.41% | 2.03x |
| serde-1.0.136 | clippy | incr-patched: println | -0.39% | 1.96x |
| serde-1.0.136 | clippy | incr-unchanged | -0.39% | 1.95x |
| libc-0.2.124 | clippy | incr-full | -0.36% | 1.78x |
| syn-1.0.89 | clippy | incr-patched: println | -0.35% | 1.76x |
| regex-1.5.5 | clippy | incr-patched: compile one | -0.35% | 1.74x |
| libc-0.2.124 | clippy | incr-patched: clone | -0.34% | 1.68x |
| serde-1.0.136 | clippy | full | -0.32% | 1.61x |
| exa-0.10.1 | clippy | incr-unchanged | -0.30% | 1.52x |
| hyper-0.14.18 | clippy | incr-patched: println | 0.30% | 1.51x |

Secondary benchmarks

| Benchmark | Profile | Scenario | % Change | Significance Factor |
|:-:|:-:|:-:|:-:|:-:|
| deep-vector | clippy | full | 6.38% | 31.92x |
| regression-31157 | clippy | full | -4.89% | 24.47x |
| regression-31157 | clippy | incr-full | -4.75% | 23.74x |
| unused-warnings | clippy | incr-full | -1.94% | 9.72x |
| wg-grammar | clippy | full | -1.76% | 8.79x |
| ripgrep-13.0.0-tiny | clippy | full | -1.74% | 8.70x |
| tuple-stress | clippy | incr-unchanged | -1.60% | 7.99x |
| ucd | clippy | incr-unchanged | -1.51% | 7.54x |
| wg-grammar | clippy | incr-full | -1.47% | 7.33x |
| issue-46449 | clippy | incr-unchanged | 1.29% | 6.43x |
| coercions | clippy | incr-full | -1.28% | 6.38x |
| unused-warnings | clippy | full | -1.23% | 6.13x |
| deeply-nested-multi | clippy | incr-full | -1.17% | 5.86x |
| deeply-nested-multi | clippy | full | -1.17% | 5.84x |
| issue-46449 | clippy | incr-full | 0.97% | 4.87x |
| regression-31157 | clippy | incr-patched: println | -0.94% | 4.72x |
| tt-muncher | clippy | incr-unchanged | 0.89% | 4.45x |
| coercions | clippy | incr-unchanged | -0.86% | 4.30x |
| ctfe-stress-5 | clippy | incr-full | -0.81% | 4.05x |
| coercions | clippy | full | -0.78% | 3.88x |
| unused-warnings | clippy | incr-patched: dummy fn | -0.74% | 3.68x |
| regression-31157 | clippy | incr-unchanged | -0.73% | 3.65x |
| unused-warnings | clippy | incr-unchanged | 0.71% | 3.54x |
| tt-muncher | clippy | full | 0.67% | 3.36x |
| externs | clippy | incr-full | -0.67% | 3.35x |
| wf-projection-stress-65510 | clippy | incr-unchanged | 0.63% | 3.17x |
| match-stress | clippy | incr-unchanged | -0.63% | 3.15x |
| issue-88862 | clippy | full | -0.60% | 2.99x |
| unify-linearly | clippy | incr-patched: dummy fn | -0.59% | 2.94x |
| many-assoc-items | clippy | incr-full | -0.58% | 2.88x |
| ctfe-stress-5 | clippy | full | -0.56% | 2.78x |
| tuple-stress | clippy | incr-full | -0.55% | 2.76x |
| issue-46449 | clippy | incr-patched: empty 3072 | 0.55% | 2.75x |
| ucd | clippy | incr-full | -0.55% | 2.73x |
| deeply-nested-multi | clippy | incr-unchanged | -0.54% | 2.71x |
| tuple-stress | clippy | incr-patched: new row | -0.54% | 2.69x |
| issue-88862 | clippy | incr-full | -0.53% | 2.64x |
| many-assoc-items | clippy | incr-unchanged | -0.51% | 2.56x |
| helloworld-tiny | clippy | full | 0.48% | 2.42x |
| coercions | clippy | incr-patched: add static arr item | -0.47% | 2.37x |
| issue-46449 | clippy | incr-patched: io error 6144 | 0.46% | 2.29x |
| unify-linearly | clippy | incr-unchanged | 0.44% | 2.20x |
| issue-46449 | clippy | incr-patched: u32 3072 | 0.41% | 2.06x |
| many-assoc-items | clippy | full | -0.40% | 1.99x |
| issue-46449 | clippy | incr-patched: u8 3072 | 0.39% | 1.95x |
| ucd | clippy | full | -0.38% | 1.90x |
| externs | clippy | incr-unchanged | 0.37% | 1.85x |
| tt-muncher | clippy | incr-full | 0.35% | 1.77x |
| issue-46449 | clippy | incr-patched: static str 6144 | 0.34% | 1.72x |
| derive | clippy | incr-full | -0.32% | 1.61x |
| tuple-stress | clippy | full | -0.32% | 1.59x |

Polish

marramiau miau miau 🎉

meowmeowmeow

Removing unused files

Meow

format

Make lint level minimums parallel

Fix author lint + stop using symbols

Move from symbols to strings + fix things

meow sync

Remove crate-checking code

Prepare for WIP PR opening

m

Turn back to visitor

Add loadbearing lints + Move filtering code to lint_crate

Move lint filter to check_mod

Also handle lint groups + Should work
@blyxyas
Member Author

blyxyas commented Sep 11, 2024

I've done some more benchmarking, and we have some very good numbers. I think this is ready to start looking into merging.
I'm not sure why the artifact size has changed; only lints should have been impacted.

@michaelwoerister
Member

The code still contains sections of commented out code and has some unaddressed review comments. Would you mind doing a clean up pass over everything?

@jieyouxu jieyouxu removed their assignment Sep 17, 2024
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't run late lints that are disabled in the entire crate