Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use once_cell crate instead of custom data structure #72256

Merged
merged 4 commits into from
May 23, 2020

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented May 16, 2020

Internally, we use the Once type for shared data that is initialized exactly once and only read from afterwards. Once uses a parking_lot::Mutex when the parallel compiler is enabled and a RefCell when it is not. This PR switches to the once_cell crate, which also uses a parking_lot::Mutex for its sync version (because we enable the parking_lot feature) but has zero overhead for its unsync one.

This PR adds once_cell to the list of whitelisted dependencies. I think this is acceptable because it is already used in rustc_driver, is owned by a well-known community member (cc @matklad), and has a stable release. cc @rust-lang/compiler

once_cell has a slightly more minimal API than Once, which allows for initialization to be either optimistic (evaluate the initializer and then synchronize) or pessimistic (synchronize and then evaluate the initializer). once_cell's get_or_init is always pessimistic. The optimistic version is only used once in the current master.

r? @Mark-Simulacrum

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 16, 2020
@@ -371,13 +371,13 @@ impl<'sess> OnDiskCache<'sess> {
let pos = index.get(&dep_node_index).cloned()?;

// Initialize `cnum_map` using the value from the thread that finishes the closure first.
self.cnum_map.init_nonlocking_same(|| Self::compute_cnum_map(tcx, &self.prev_cnums[..]));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only use of optimistic initialization strategy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be replaceable with something like this:

if self.cnum_map.get().is_none() {
    self.cnum_map.set(Self::compute_cnum_map(...)));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed! Do you see a reason why we don't just lock here though? I believe @Zoxc authored this originally, but they are on hiatus.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this is called for every single query result we deserialize from disk -- that means likely hundreds of thousands of calls, if not more, and in quite hot code -- I imagine locking may be quite expensive as such. I'm not sure why we're unable to initially the crate number map earlier, though; I would expect there to be a point in between query deserialization starting and having the prev_cnums available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I'll switch to your suggestion and leave a note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, OnceCell already has a fast path that doesn't take the mutex when the value is already set. The only difference in behavior would be that only a single thread executes compute_cnum_map while the rest get de-scheduled. This seems like it might actually be an improvement over the status quo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh in that case yeah this is fine to leave as is, I wasn't sure if that optimization was built-in yet. I can't imagine it matters whether we wait on a mutex or not ~N times when N is so low (on powerful hardware today, probably only a hundred or so at most).

@matklad
Copy link
Member

matklad commented May 16, 2020

The plan is to move OnceCell into std(rust-lang/rfcs#2788), so another option, instead of adding it to the list of approved crates, is to sneaky copy-paste it into std :D

}

#[inline]
pub fn crate_types_opt(&self) -> Option<&[CrateType]> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave this to a future PR but it seems quite suspect if we're using this API, as that presumes we don't really care about the answer?

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only used here:

/// PIE is potentially more effective than PIC, but can only be used in executables.
/// If all our outputs are executables, then we can relax PIC to PIE when producing object code.
/// If the list of crate types is not yet known we conservatively return `false`.
pub fn all_outputs_are_pic_executables(sess: &Session) -> bool {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at that but wasn't able to quickly tell what all is using it and at what stage of compilation -- it does seem like crate types should be populated by then, but I could be wrong of course :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. I see I forgot to respond to this. I did initially try unwrapping crate_types here, but it wasn't populated at this point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. Glad we at least tried it - it sounds like there may be something that's not quite working somewhere as a result, maybe good to file a follow-up issue. Feel free to assign to me. Certainly not urgent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

src/librustc_middle/mir/predecessors.rs Outdated Show resolved Hide resolved
src/librustc_middle/mir/predecessors.rs Show resolved Hide resolved
@@ -371,13 +371,13 @@ impl<'sess> OnDiskCache<'sess> {
let pos = index.get(&dep_node_index).cloned()?;

// Initialize `cnum_map` using the value from the thread that finishes the closure first.
self.cnum_map.init_nonlocking_same(|| Self::compute_cnum_map(tcx, &self.prev_cnums[..]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be replaceable with something like this:

if self.cnum_map.get().is_none() {
    self.cnum_map.set(Self::compute_cnum_map(...)));
}

src/librustc_session/session.rs Outdated Show resolved Hide resolved
self.features.set(features);
match self.features.set(features) {
Ok(()) => {}
Err(_) => panic!("`features` was initialized twice"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use expect here like in init_crate_types.

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Features doesn't implement Debug, and I suspect a derived one would be quite large. Still want me to do this? I could also just unwrap here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, I didn't realize that was the reason. Seems fine to leave as-is then

@Mark-Simulacrum
Copy link
Member

once_cell should be fine to add to the whitelist, I'm surprised it isn't in there already to be honest.

@ecstatic-morse
Copy link
Contributor Author

Perf queue is empty at the moment, so I'm going to schedule a run with #[inline] removed and get_or_init used in on_disk_cache. If there's no change, as I suspect will be the case, we can roll this up.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented May 16, 2020

⌛ Trying commit 20e661519d0cfe798345878ca78c140c781b65da with merge 8ad681519630f9fc92d9a689c38a3e33e26fdccc...

@bors
Copy link
Contributor

bors commented May 16, 2020

☀️ Try build successful - checks-azure
Build commit: 8ad681519630f9fc92d9a689c38a3e33e26fdccc (8ad681519630f9fc92d9a689c38a3e33e26fdccc)

@rust-timer
Copy link
Collaborator

Queued 8ad681519630f9fc92d9a689c38a3e33e26fdccc with parent 6163394, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 8ad681519630f9fc92d9a689c38a3e33e26fdccc, comparison URL.

@ecstatic-morse
Copy link
Contributor Author

Queue is still empty, so I'm trying another perf run with the Session getters inlined to see if it makes a difference for the keccak/inflate benchmarks, which are slightly slower.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented May 17, 2020

⌛ Trying commit 683a4af6975ac6ca7ed757596b707372a067ba04 with merge 5c9544c6a827baf278ab73b4d724c96f1941467e...

@bors
Copy link
Contributor

bors commented May 17, 2020

☀️ Try build successful - checks-azure
Build commit: 5c9544c6a827baf278ab73b4d724c96f1941467e (5c9544c6a827baf278ab73b4d724c96f1941467e)

@rust-timer
Copy link
Collaborator

Queued 5c9544c6a827baf278ab73b4d724c96f1941467e with parent 34cce58, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 5c9544c6a827baf278ab73b4d724c96f1941467e, comparison URL.

@nikomatsakis
Copy link
Contributor

I don't see a problem with adding a dep on once_cell, particularly if there is a plan in motion to add it to std. In addition to the criteria listed by @ecstatic-morse, the licensing is also compatible. =)

@ecstatic-morse
Copy link
Contributor Author

Interestingly, making the "getter" methods eligible for inlining seems to have made instruction counts slightly worse. I was expecting OnceCell to be slightly faster for the single-threaded compiler since it doesn't use RefCell. It's worth noting that rustc_data_structure::try_get is #[inline(always)], whereas OnceCell::get is not annotated but still eligible for inlining since OnceCell is generic. I'll need to look at the generated assembly for rustc to really see what's going on.

Are we comfortable with the perf implications of switching to OnceCell?

@Mark-Simulacrum
Copy link
Member

For sure. Those perf implications are barely a blip AFAICT - I'm not even sure they go beyond noise margins.

@Mark-Simulacrum
Copy link
Member

r=me but would be nice to squash commits down a bit (not necessarily into one, up to you)

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented May 18, 2020

I'm not even sure they go beyond noise margins.

I disagree with this part of your statement. The noise floor for instruction counts of "check" builds is around a tenth of a percent (filter by "-check"). This PR makes full and incr-full check builds of keccak and inflate execute 0.8% and 0.5% more instructions respectively. Of course, instruction counts may not correspond to an increase in execution time, but I think it's at least statistically significant.

That said, I think the switch is still worth it, since once_cell is probably a bit better tested and more widely used.

@Mark-Simulacrum
Copy link
Member

Hm, yeah, I agree that there's at least plausibly non-noise there but it may well be due to external effects -- e.g. different inlining or code layout -- I doubt it's really because of this change in some sense.

It would be great if we had e.g. perf diff results comparing the two but that's largely an unsolved problem to my knowledge at this point.

@ecstatic-morse
Copy link
Contributor Author

Squashed and rebased.

@bors r=Mark-Simulacrum

@bors
Copy link
Contributor

bors commented May 22, 2020

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout once-cell (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
  5. git push self once-cell --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
Auto-merging src/librustc_typeck/check/autoderef.rs
Auto-merging src/librustc_trait_selection/traits/select.rs
Auto-merging src/librustc_trait_selection/traits/project.rs
Auto-merging src/librustc_trait_selection/traits/error_reporting/suggestions.rs
Auto-merging src/librustc_session/session.rs
Auto-merging src/librustc_save_analysis/lib.rs
Auto-merging src/librustc_mir/monomorphize/collector.rs
Auto-merging src/librustc_middle/ty/context.rs
Auto-merging src/librustc_metadata/rmeta/encoder.rs
Auto-merging src/librustc_metadata/rmeta/decoder.rs
Auto-merging src/librustc_interface/util.rs
Auto-merging src/librustc_codegen_ssa/base.rs
Auto-merging src/librustc_codegen_ssa/back/write.rs
Auto-merging src/librustc_codegen_ssa/back/linker.rs
Auto-merging src/librustc_codegen_ssa/back/link.rs
Auto-merging src/librustc_codegen_llvm/context.rs
CONFLICT (content): Merge conflict in src/librustc_codegen_llvm/context.rs
Auto-merging Cargo.lock
Automatic merge failed; fix conflicts and then commit the result.

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 22, 2020
@bors
Copy link
Contributor

bors commented May 22, 2020

☔ The latest upstream changes (presumably #72464) made this pull request unmergeable. Please resolve the merge conflicts.

@ecstatic-morse
Copy link
Contributor Author

@bors r+

@bors
Copy link
Contributor

bors commented May 22, 2020

📌 Commit 307153e has been approved by ecstatic-morse

@bors
Copy link
Contributor

bors commented May 22, 2020

🌲 The tree is currently closed for pull requests below priority 1000, this pull request will be tested once the tree is reopened

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 22, 2020
@ecstatic-morse
Copy link
Contributor Author

@bors r-
@bors r=Mark-Simulacrum

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 22, 2020
@bors
Copy link
Contributor

bors commented May 22, 2020

📌 Commit 307153e has been approved by Mark-Simulacrum

@bors
Copy link
Contributor

bors commented May 22, 2020

🌲 The tree is currently closed for pull requests below priority 1000, this pull request will be tested once the tree is reopened

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 22, 2020
@bors
Copy link
Contributor

bors commented May 23, 2020

⌛ Testing commit 307153e with merge 7f940ef...

@bors
Copy link
Contributor

bors commented May 23, 2020

☀️ Test successful - checks-azure
Approved by: Mark-Simulacrum
Pushing 7f940ef to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 23, 2020
@bors bors merged commit 7f940ef into rust-lang:master May 23, 2020
@ecstatic-morse ecstatic-morse deleted the once-cell branch October 6, 2020 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants