Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proc_macro/bridge: stop using a remote object handle for proc_macro Punct and Group #98188

Merged
merged 3 commits into from
Jun 28, 2022

Conversation

mystor
Copy link
Contributor

@mystor mystor commented Jun 17, 2022

This is the third part of #86822, split off as requested in #86822 (review). This patch transforms the Punct and Group types into structs serialized over IPC rather than handles, making them more efficient to create and manipulate from within proc-macros.

@rust-highfive

This comment was marked as off-topic.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 17, 2022
@rust-highfive
Copy link
Collaborator

r? @petrochenkov

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 17, 2022
@mystor
Copy link
Contributor Author

mystor commented Jun 17, 2022

r? @eddyb

@bors
Copy link
Contributor

bors commented Jun 18, 2022

☔ The latest upstream changes (presumably #98186) made this pull request unmergeable. Please resolve the merge conflicts.

@mystor mystor changed the title Stop using a remote object handle for proc_macro Punct and Group proc_macro/bridge: stop using a remote object handle for proc_macro Punct and Group Jun 18, 2022
@mystor mystor force-pushed the fast_group_punct branch 2 times, most recently from 2dd6d7d to 51b2749 Compare June 26, 2022 17:07
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 26, 2022
proc_macro/bridge: cache static spans in proc_macro's client thread-local state

This is the second part of rust-lang#86822, split off as requested in rust-lang#86822 (review). This patch removes the RPC calls required for the very common operations of `Span::call_site()`, `Span::def_site()` and `Span::mixed_site()`.

Some notes:

This part is one of the ones I don't love as a final solution from a design standpoint, because I don't like how the spans are serialized immediately at macro invocation. I think a more elegant solution might've been to reserve special IDs for `call_site`, `def_site`, and `mixed_site` at compile time (either starting at 1 or from `u32::MAX`) and making reading a Span handle automatically map these IDs to the relevant values, rather than doing extra serialization.

This would also have an advantage for potential future work to allow `proc_macro` to operate more independently from the compiler (e.g. to reduce the necessity of `proc-macro2`), as methods like `Span::call_site()` could be made to function without access to the compiler backend.

That was unfortunately tricky to do at the time, as this was the first part I wrote of the patches. After the later part (rust-lang#98188, rust-lang#98189), the other uses of `InternedStore` are removed meaning that a custom serialization strategy for `Span` is easier to implement.

If we want to go that path, we'll still need the majority of the work to split the bridge object and introduce the `Context` trait for free methods, and it will be easier to do after `Span` is the only user of `InternedStore` (after rust-lang#98189).
This greatly reduces round-trips to fetch relevant extra information about the
token in proc macro code, and avoids RPC messages to create Punct tokens.
This greatly reduces round-trips to fetch relevant extra information about the
token in proc macro code, and avoids RPC messages to create Group tokens.
@mystor mystor marked this pull request as ready for review June 27, 2022 02:27
@eddyb
Copy link
Member

eddyb commented Jun 27, 2022

Before I forget (please don't force-push while this is ongoing):

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 27, 2022
@bors
Copy link
Contributor

bors commented Jun 27, 2022

⌛ Trying commit f28dfdf with merge 0b6ba9689c4b846a039c5ba15f8fcde0251749ed...

Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, mostly nits as usual (reminder not to force push until perf run is finished, I think bors gets confused easily).

library/proc_macro/src/bridge/mod.rs Outdated Show resolved Hide resolved
Comment on lines +961 to +967
const LEGAL_CHARS: &[char] = &[
'=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';',
':', '#', '$', '?', '\'',
];
if !LEGAL_CHARS.contains(&ch) {
panic!("unsupported character `{:?}`", ch);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't something like this be a match? That way, LLVM can optimize it to some arithmetic and bit twiddling?

(Oops, just checked and this is what it takes: https://godbolt.org/z/vP13eKjsW - I guess that's pretty nasty)

If you are going to use an array search, can you do an ASCII check first and make it a bytestring literal? So that it at least can hit some kind of memchr specialization or w/e.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's an easy change. The current code is literally verbatim cut-pasted from the code which used to be in proc_macro_server.rs:

const LEGAL_CHARS: &[char] = &[
'=', '<', '>', '!', '~', '+', '-', '*', '/', '%', '^', '&', '|', '@', '.', ',', ';',
':', '#', '$', '?', '\'',
];
if !LEGAL_CHARS.contains(&ch) {
panic!("unsupported character `{:?}`", ch)
}
.

FWIW the current code appears to actually compile to a somewhat efficient jump table (https://godbolt.org/z/P9d5366YT), and sloppily switching it to use &[u8] instead does appear to make it use memchr (https://godbolt.org/z/EfvaaMbbh)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing codegen seems like the nicest of the options (I wouldn't be too surprised if it performs better than the memchr variant), so I'm inclined to stick with it.

compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
library/proc_macro/src/bridge/mod.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
Interpolated(nt) => {
let stream = TokenStream::from_nonterminal_ast(&nt);
if crate::base::nt_pretty_printing_compatibility_hack(&nt, rustc.sess()) {
trees.extend(Self::from_internal((stream, rustc)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this Self::from_internal didn't go through a weird method trait, I would definitely suggest passing &mut Vec here (would help with any future experiments like using SmallVec<[_; 32]> instead of Vec, as there would be no expensive moves).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I'm not sure this codepath is worth optimizing given it's only for that pretty-printing compatibility hack, meaning it only impacts enums with the name ProceduralMasqueradeDummyType.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, though in the SmallVec<[_; 32]> case, I'm worried the rarely-used codepath would still cause worse cache behavior for the whole function.

@bors
Copy link
Contributor

bors commented Jun 27, 2022

☀️ Try build successful - checks-actions
Build commit: 0b6ba9689c4b846a039c5ba15f8fcde0251749ed (0b6ba9689c4b846a039c5ba15f8fcde0251749ed)

@rust-timer
Copy link
Collaborator

Queued 0b6ba9689c4b846a039c5ba15f8fcde0251749ed with parent bd2e51a, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (0b6ba9689c4b846a039c5ba15f8fcde0251749ed): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-0.5% -1.4% 17
Improvements 🎉
(secondary)
-1.5% -5.1% 17
All 😿🎉 (primary) -0.5% -1.4% 17

Max RSS (memory usage)

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 😿 relevant regression found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
2.3% 2.3% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) N/A N/A 0

Cycles

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: mixed results
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
3.0% 3.0% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-3.3% -3.5% 3
All 😿🎉 (primary) N/A N/A 0

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 27, 2022
Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with at least the comment fixed (and the PR description updated)

compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
compiler/rustc_expand/src/proc_macro_server.rs Outdated Show resolved Hide resolved
longer names for RPC generics and reduced dependency on macros in the server.
@rustbot

This comment was marked as off-topic.

@eddyb
Copy link
Member

eddyb commented Jun 28, 2022

@bors r+

@bors
Copy link
Contributor

bors commented Jun 28, 2022

📌 Commit 64a7d57 has been approved by eddyb

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 28, 2022
@bors
Copy link
Contributor

bors commented Jun 28, 2022

⌛ Testing commit 64a7d57 with merge 94e9374...

@bors
Copy link
Contributor

bors commented Jun 28, 2022

☀️ Test successful - checks-actions
Approved by: eddyb
Pushing 94e9374 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 28, 2022
@bors bors merged commit 94e9374 into rust-lang:master Jun 28, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jun 28, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (94e9374): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-0.5% -1.4% 15
Improvements 🎉
(secondary)
-1.6% -5.0% 16
All 😿🎉 (primary) -0.5% -1.4% 15

Max RSS (memory usage)

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: mixed results
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
1.0% 1.0% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-3.2% -3.4% 2
All 😿🎉 (primary) N/A N/A 0

Cycles

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
2.7% 2.7% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-3.4% -4.6% 6
All 😿🎉 (primary) N/A N/A 0

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 29, 2022
Unbreak stage1 tests via ignore-stage1 in `proc-macro/invalid-punct-ident-1.rs`.

rust-lang#98188 broke `./x.py test --stage 1` (which I thought we ran in PR CI, cc `@rust-lang/infra)` i.e. the default `./x.py test` in dev checkouts, as the panic in `src/test/ui/proc-macro/invalid-punct-ident-1.rs` moved from the server (`rustc`) to the client (proc macro), and that means it's now affected by rust-lang#59998.

I made the test look like `src/test/ui-fulldeps/issue-76270-panic-in-libproc-macro.rs` tho I'm a bit confused why that one is in `src/test/ui-fulldeps`, it should still work in `src/test/ui`, no? (cc `@Aaron1011)`
mystor added a commit to mystor/rust that referenced this pull request Jul 29, 2022
…idge

This is done by having the crossbeam dependency inserted into the
proc_macro server code from the server side, to avoid adding a
dependency to proc_macro.

In addition, this introduces a -Z command-line option which will switch
rustc to run proc-macros using this cross-thread executor. With the
changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the
performance of the executor should be much closer to same-thread
execution.

In local testing, the crossbeam executor was substantially more
performant than either of the two existing CrossThread strategies, so
they have been removed to keep things simple.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 30, 2022
proc_macro: use crossbeam channels for the proc_macro cross-thread bridge

This is done by having the crossbeam dependency inserted into the `proc_macro` server code from the server side, to avoid adding a dependency to `proc_macro`.

In addition, this introduces a -Z command-line option which will switch rustc to run proc-macros using this cross-thread executor. With the changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the performance of the executor should be much closer to same-thread execution.

In local testing, the crossbeam executor was substantially more performant than either of the two existing `CrossThread` strategies, so they have been removed to keep things simple.

r? `@eddyb`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants