Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format only modified files #105702

Merged
merged 5 commits into from
Dec 29, 2022
Merged

Format only modified files #105702

merged 5 commits into from
Dec 29, 2022

Conversation

albertlarsan68
Copy link
Member

@albertlarsan68 albertlarsan68 commented Dec 14, 2022

As discussed on #105688, this makes x fmt only format modified files.

Fixes #105688

@rustbot
Copy link
Collaborator

rustbot commented Dec 14, 2022

r? @Mark-Simulacrum

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2022
@albertlarsan68
Copy link
Member Author

r? @jyn514

@matthiaskrgr
Copy link
Member

🤔 what if I change something in rustfmt.toml and want to apply this change to the repo, will it still be picked up?

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Dec 14, 2022

x.py fmt takes ~1.6 seconds (2.4 seconds with an empty disk cache) for me. Is it much slower for others?

I'm not sure optimizing a few seconds is worth doing -- most of the time, the expectation is that folks are enabling rustfmt on save in their editor, so running it isn't even necessary (this is the case for me), but even without that it should be quite fast to run it.

In any case, as a separate item, we need to make sure that we're running it against all files on changes to the rustfmt binary (and other such wide-spread changes, like @matthiaskrgr notes).

@matthiaskrgr
Copy link
Member

Also, I think it is common to
git fetch upstream
git checkout workbranch
git rebase upstream/master
x.py fmt

to fix the format differences introduced by the merge/rebase, will rustfmt still recognize files that were changed but committed as modified?

@albertlarsan68
Copy link
Member Author

🤔 what if I change something in rustfmt.toml and want to apply this change to the repo, will it still be picked up?

You can pass any path and the early culling is not applied.


x.py fmt takes ~1.6 seconds (2.4 seconds with an empty disk cache) for me. Is it much slower for others?

It takes me over 30 secs, but with this modification it now only takes a few seconds when no file changed.

I'm not sure optimizing a few seconds is worth doing -- most of the time, the expectation is that folks are enabling rustfmt on save in their editor, so running it isn't even necessary (this is the case for me), but even without that it should be quite fast to run it.

See #105688 for the motivation

In any case, as a separate item, we need to make sure that we're running it against all files on changes to the rustfmt binary (and other such wide-spread changes, like matthiaskrgr notes).

I still need to implement this part


will rustfmt still recognize files that were changed but committed as modified?

This seeks for files modified between the rust-lang/rust's master and the current tree.

@matthiaskrgr
Copy link
Member

For the record x.py fmt (it already runs in parallel) only takes 6 seconds on a 2016 i5-7200U dual core here 🤔 (on the entire repo, but no files actually modified)

@jyn514
Copy link
Member

jyn514 commented Dec 14, 2022

6 seconds sounds about right; for me at least it's too slow to run format-on-save.

If we want to limit the impact, we can make it opt-in and run it only in the pre-push hook, but I would like to eventually turn it on globally or make it opt-out.

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Dec 14, 2022

for me at least it's too slow to run format-on-save.

What do you mean by this? I don't run x.py fmt to format on save, I run rustfmt directly. x.py fmt is too slow on save, but that's fine; you're not supposed to use it for that. (Or should pass it the current file).

If we want to limit the impact, we can make it opt-in and run it only in the pre-push hook, but I would like to eventually turn it on globally or make it opt-out.

I think making it opt-in is also good. Regardless, please do make sure that CI is not using this optimization, because I don't trust it to not cause us problems down the line.

@chenyukang
Copy link
Member

x fmt will take about 8 seconds in my local, it would be nice to optimize to 1 second.

@chenyukang
Copy link
Member

Benchmark 1: ./x fmt
  Time (mean ± σ):     12.246 s ±  1.504 s    [User: 61.289 s, System: 5.698 s]
  Range (min … max):   10.908 s … 13.830 s    4 runs

😒 , I work in WSL, it seems really slow, maybe need to buy a new device for format-on-save!

@albertlarsan68
Copy link
Member Author

@chenyukang Are those benches made with this branch, or master?

@chenyukang
Copy link
Member

@chenyukang Are those benches made with this branch, or master?

it's master.

@jyn514
Copy link
Member

jyn514 commented Dec 15, 2022

🤔 what if I change something in rustfmt.toml and want to apply this change to the repo, will it still be picked up?

You can pass any path and the early culling is not applied.

That's not quite enough - changes in rustfmt.toml will affect all files in the repository. Ideally the contents of that file would be part of the stamp so we know when they change. But I'm ok with needing to opt in with x fmt --all or something like that; seems ok if people are explicitly changing the settings.

@albertlarsan68
Copy link
Member Author

I think currently you can do x fmt . and it will not apply the culling.

@albertlarsan68
Copy link
Member Author

It doesn't correctly detect files right now, will work on that. @rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2022
@the8472
Copy link
Member

the8472 commented Dec 17, 2022

If ./x fmt is slow you might want to check if you have any unneeded generated directories (e.g. node_modules or obj) in the rust project.

I work in WSL, it seems really slow, maybe need to buy a new device for format-on-save!

If you're using wsl2 then make sure you have the rust project checked out in a directory mounted as the native linux filesystem (e.g. ext4) as part of the disk image. That should be much faster than the 9p mounts that are used to access the windows filesystem.

You also should check if you have any generated directories other than build or target in your rust project. Running docker CI sometimes generates obj and node_modules for example which increase the time that x spends in traversing the directory tree.

@jyn514
Copy link
Member

jyn514 commented Dec 17, 2022

@the8472 on a Windows 10 laptop with antivirus scanning excluded for the folder and no untracked files other than build/, x fmt takes 8 seconds to execute for me.

@the8472
Copy link
Member

the8472 commented Dec 17, 2022

"no untracked files" can be misleading because x still dives into gitignored directories when looking for git roots.

@Mark-Simulacrum
Copy link
Member

I wonder if switching from using ignore + adding ignored paths to the file listing, we should (when git is available) use git ls-files. That should much more efficiently skip ignored paths, I would expect. (It runs in ~6ms on a rust checkout for me).

That said, at least for me 80% of x.py fmt (which takes ~3 seconds) is in rustfmt itself, so it's not clear that significant improvements from which files we visit will be had. I wonder what the split there is like on non-Linux(?) platforms, maybe we spend much more time in directory traversal. Presumably rustfmt itself is about as fast anywhere, regardless of platform.

@jyn514
Copy link
Member

jyn514 commented Dec 17, 2022

"no untracked files" can be misleading because x still dives into gitignored directories when looking for git roots.

I used git status --ignored to make sure there weren't ignored files.

I wonder if switching from using ignore + adding ignored paths to the file listing, we should (when git is available) use git ls-files. That should much more efficiently skip ignored paths, I would expect. (It runs in ~6ms on a rust checkout for me).

👍 seems useful and simpler than the current approach

That said, at least for me 80% of x.py fmt (which takes ~3 seconds) is in rustfmt itself, so it's not clear that significant improvements from which files we visit will be had. I wonder what the split there is like on non-Linux(?) platforms, maybe we spend much more time in directory traversal. Presumably rustfmt itself is about as fast anywhere, regardless of platform.

How did you measure this? Happy to report what it's like on Windows if you know how.

@jyn514
Copy link
Member

jyn514 commented Dec 17, 2022

That said, at least for me 80% of x.py fmt (which takes ~3 seconds) is in rustfmt itself, so it's not clear that significant improvements from which files we visit will be had.

Wait hold on, this doesn't make a lot of sense to me. Presumably rustfmt runs in about the same time on files that need formatting and those that don't, so visiting less files in the first place should still be a big improvement even if rustfmt's runtime is the limiting factor.

@Mark-Simulacrum
Copy link
Member

Sorry, I meant "how we visit files", not "which".

@Mark-Simulacrum
Copy link
Member

How did you measure this? Happy to report what it's like on Windows if you know how.

I measured by trying to produce a flamegraph (with perf on Linux). Not sure what is available on windows like that.

@albertlarsan68
Copy link
Member Author

albertlarsan68 commented Dec 27, 2022

@the8472 Here is a proof that hyperfine's user/system time is incorrect in this case:

When directly invoking python, the user/system time is the same tiny amount regardless of the work done.

When indirecting through powershell (which is what x does on windows), the user/system time is bigger than direct python, but the same regardless of the work done.

When subtracting the user/system time from the powershell times, we get roughly the same times as their direct counterparts.
So, the user/system times are measuring the powershell startup time and the search for the correct python executable.

Benchmark 1: python3.10 .\x.py
  Time (mean ± σ):      2.820 s ±  0.178 s    [User: 0.008 s, System: 0.037 s]
  Range (min … max):    2.671 s …  3.205 s    10 runs

Benchmark 2: python3.10 .\x.py fmt
  Time (mean ± σ):     16.022 s ±  1.016 s    [User: 0.013 s, System: 0.037 s]
  Range (min … max):   14.598 s … 17.467 s    10 runs

Benchmark 3: python3.10 .\x.py fmt .
  Time (mean ± σ):     67.320 s ± 15.144 s    [User: 0.014 s, System: 0.050 s]
  Range (min … max):   55.116 s … 100.146 s    10 runs

Benchmark 4: x
  Time (mean ± σ):     19.698 s ±  3.657 s    [User: 8.633 s, System: 5.378 s]
  Range (min … max):   16.174 s … 25.800 s    10 runs

Benchmark 5: x fmt
  Time (mean ± σ):     37.601 s ±  9.312 s    [User: 9.065 s, System: 5.418 s]
  Range (min … max):   28.874 s … 60.072 s    10 runs

Benchmark 6: x fmt .
  Time (mean ± σ):     88.858 s ± 18.543 s    [User: 9.158 s, System: 5.678 s]
  Range (min … max):   72.185 s … 131.717 s    10 runs

@the8472
Copy link
Member

the8472 commented Dec 27, 2022

Which hyperfine version are you using? The user time accounting was changed in 1.15, released in september.

@albertlarsan68
Copy link
Member Author

I do use hyperfine v1.15.0

@jyn514
Copy link
Member

jyn514 commented Dec 27, 2022

I think the discussion about benchmarking is important to have but not necessarily relevant to this PR. Formatting only modified files will benefit everyone, not just @albertlarsan68, and I think the impact of a bug is limited enough it's ok to land given that we don't have specific code we think is buggy.

@albertlarsan68 if you add that line in the changelog I mentioned in #105702 (review), I think we can land this :)

@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 27, 2022
@albertlarsan68
Copy link
Member Author

@rustbot ready
@jyn514 Has the changelog entry a good balance between information quantity and brevity?

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 27, 2022
@jyn514
Copy link
Member

jyn514 commented Dec 28, 2022

Looks great, thanks!

@bors r+ rollup (not tested in CI)

@bors
Copy link
Contributor

bors commented Dec 28, 2022

📌 Commit 0b942a8 has been approved by jyn514

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 28, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 28, 2022
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#104402 (Move `ReentrantMutex` to `std::sync`)
 - rust-lang#104493 (available_parallelism: Gracefully handle zero value cfs_period_us)
 - rust-lang#105359 (Make sentinel value configurable in `library/std/src/sys_common/thread_local_key.rs`)
 - rust-lang#105497 (Clarify `catch_unwind` docs about panic hooks)
 - rust-lang#105570 (Properly calculate best failure in macro matching)
 - rust-lang#105702 (Format only modified files)
 - rust-lang#105998 (adjust message on non-unwinding panic)
 - rust-lang#106161 (Iterator::find: link to Iterator::position in docs for discoverability)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 0630677 into rust-lang:master Dec 29, 2022
@rustbot rustbot added this to the 1.68.0 milestone Dec 29, 2022
@albertlarsan68 albertlarsan68 deleted the x-fmt-opt branch December 29, 2022 08:34
jyn514 added a commit to jyn514/rust that referenced this pull request Dec 31, 2022
Dont use `--merge-base` during bootstrap formatting subcommand

I use a development image with Ubuntu 20.04 LTS, which has git 2.25.

Recently, `./x.py test tidy --bless` regressed in rust-lang#105702 because it uses the `--merge-base` option on `diff-index`, which was only introduced in git 2.30 (git/git@0f5a1d4). Luckily, it can be replicated via two calls to `git merge-base` + `git diff-index`, so let's just use that.
jyn514 added a commit to jyn514/rust that referenced this pull request Dec 31, 2022
Dont use `--merge-base` during bootstrap formatting subcommand

I use a development image with Ubuntu 20.04 LTS, which has git 2.25.

Recently, `./x.py test tidy --bless` regressed in rust-lang#105702 because it uses the `--merge-base` option on `diff-index`, which was only introduced in git 2.30 (git/git@0f5a1d4). Luckily, it can be replicated via two calls to `git merge-base` + `git diff-index`, so let's just use that.
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Dec 31, 2022
Dont use `--merge-base` during bootstrap formatting subcommand

I use a development image with Ubuntu 20.04 LTS, which has git 2.25.

Recently, `./x.py test tidy --bless` regressed in rust-lang#105702 because it uses the `--merge-base` option on `diff-index`, which was only introduced in git 2.30 (git/git@0f5a1d4). Luckily, it can be replicated via two calls to `git merge-base` + `git diff-index`, so let's just use that.
jyn514 added a commit to jyn514/rust that referenced this pull request Mar 5, 2023
Previously, `x fmt` would only format modified files, while `x fmt .`
and `x fmt --check` would still look at all files. After this change, `x
fmt --check` only looks at modified files locally.

I feel pretty confident in this change - other than
rust-lang#106261, no one has reported
bugs in `get_modified_rs_files` since it was added in
rust-lang#105702.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 5, 2023
…rsan68

x fmt: Only check modified files locally

Previously, `x fmt` would only format modified files, while `x fmt .` and `x fmt --check` would still look at all files. After this change, `x fmt --check` only looks at modified files locally.

I feel pretty confident in this change - other than rust-lang#106261, no one has reported bugs in `get_modified_rs_files` since it was added in rust-lang#105702.

Combined with the changes in rust-lang#108772, this brings the time for me to run `x t tidy` with a hot FS cache down from 5 to 2 seconds (and moves the majority of the time spent back to `tidy check`, which means it can be sped up more in the future).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

x fmt should only format modified files by default
8 participants