
Re-evaluate Hash{Set,Map} vs FxHash{Set,Map} once #69152 lands #69153

Closed
nnethercote opened this issue Feb 14, 2020 · 13 comments
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-compiler-performance Working group: Compiler Performance

Comments

@nnethercote (Contributor)

rustc uses FxHash{Set,Map} everywhere rather than Hash{Set,Map}, because the DefaultHasher used by Hash{Set,Map} is slow.

But once #69152 lands, DefaultHasher will be a lot faster when hashing integers, which is a common case; in one microbenchmark I saw a ~2.5x speed-up. Combine that with the fact that FxHasher is a lower-quality hasher and so tends to result in more collisions, and the default hash tables might be faster. (On a different microbenchmark I saw that HashSet<u32> was a little bit faster than FxHashSet<u32>.)
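To make the speed/quality trade-off concrete: FxHasher's entire mixing step is a rotate, an xor, and one multiply per word. The sketch below is a simplified byte-at-a-time version of that step (the real rustc-hash crate processes usize-sized chunks; the seed constant matches its 64-bit value, but this is an illustration, not the crate's implementation):

```rust
use std::hash::Hasher;

// Multiplier used by rustc-hash on 64-bit targets.
const SEED: u64 = 0x517c_c1b7_2722_0a95;

struct MiniFxHasher {
    hash: u64,
}

impl Hasher for MiniFxHasher {
    fn finish(&self) -> u64 {
        self.hash
    }

    // Simplified: the real FxHasher consumes usize-sized words, not bytes.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            // One rotate, one xor, one multiply: very fast, but the bit
            // mixing is much weaker than SipHash's, hence more collisions
            // on adversarial or unlucky key distributions.
            self.hash = (self.hash.rotate_left(5) ^ u64::from(b)).wrapping_mul(SEED);
        }
    }
}

fn main() {
    let mut h = MiniFxHasher { hash: 0 };
    h.write_u32(42);
    assert_ne!(h.finish(), 0);
}
```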

We should evaluate this, probably by replacing every FxHash{Set,Map} with Hash{Set,Map}. (It keeps things simpler if we exclusively use one or the other, rather than a mix.)

I briefly tried to do this, but we have a lint that produces this message if you try to use Hash{Set,Map}: "error: Prefer FxHashSet over HashSet, it has better performance". I couldn't work out how to disable it.

cc @rust-lang/wg-compiler-performance
cc @cbreeden
cc @Amanieu

@Zoxc (Contributor) commented Feb 14, 2020

You could fork the rustc-hash crate, make it use DefaultHasher and then use [replace] in the workspace Cargo.toml to try it out.
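In the workspace Cargo.toml, that override would look something like this (a sketch only; the version number and fork path are illustrative, not taken from the thread):

```toml
# Redirect every use of rustc-hash in the workspace to a local fork
# whose FxHasher delegates to std's DefaultHasher.
[replace]
"rustc-hash:1.0.1" = { path = "tools/rustc-hash-fork" }
```

Cargo's `[replace]` section swaps a crate out workspace-wide without touching any individual `Cargo.toml`, which makes it convenient for a throwaway experiment like this.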

@CryZe (Contributor) commented Feb 14, 2020

It may also make sense to evaluate ahash, the hash function that hashbrown switched to due to fxhash's poor hash quality.

@jonas-schievink jonas-schievink added I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 14, 2020
@eddyb (Member) commented Feb 14, 2020

Didn't we also use a custom hasher to bypass randomness?
If we rely on hashbrown we must make sure its compile-time randomness is disabled.

@nnethercote (Contributor, Author) commented Feb 17, 2020

Note to self: x.py build --warnings=warn makes the warning ignorable. Thanks to @Mark-Simulacrum for the tip.

@nnethercote (Contributor, Author)

I measured this change locally, on top of the DefaultHasher improvements in #69152. (In the end I only had to change three lines in librustc_data_structures/fx.rs.) The results were terrible. Here are just the check results, which are representative.

deeply-nested-check
        avg: 63.6%      min: 24.8%      max: 84.7%
wf-projection-stress-65510-che...
        avg: 47.8%      min: 30.0%      max: 57.8%
clap-rs-check
        avg: 43.7%      min: 30.5%      max: 55.7%
wg-grammar-check
        avg: 46.0%      min: 30.8%      max: 54.3%
packed-simd-check
        avg: 50.2%      min: 46.9%      max: 52.2%
serde-check
        avg: 44.1%      min: 32.4%      max: 51.3%
regression-31157-check
        avg: 40.9%      min: 31.4%      max: 49.5%
futures-check
        avg: 38.1%      min: 30.1%      max: 47.3%
unify-linearly-check
        avg: 34.0%      min: 25.1%      max: 43.6%
syn-check
        avg: 38.7%      min: 36.4%      max: 42.8%
piston-image-check
        avg: 34.5%      min: 29.2%      max: 42.0%
tokio-webpush-simple-check
        avg: 35.3%      min: 24.6%      max: 41.0%
regex-check
        avg: 34.0%      min: 30.4%      max: 40.9%
ctfe-stress-4-check
        avg: 36.7%?     min: 29.2%?     max: 40.8%?
ripgrep-check
        avg: 33.3%      min: 27.9%      max: 40.0%
unused-warnings-check
        avg: 33.7%      min: 29.3%      max: 39.7%
webrender-check
        avg: 33.6%      min: 28.9%      max: 38.8%
hyper-2-check
        avg: 32.8%      min: 24.2%      max: 38.6%
webrender-wrench-check
        avg: 30.8%      min: 24.2%      max: 37.9%
cargo-check
        avg: 27.6%      min: 18.3%      max: 37.4%
issue-46449-check
        avg: 34.6%      min: 29.0%      max: 37.2%
deep-vector-check
        avg: 30.7%      min: 11.6%      max: 37.0%
coercions-check
        avg: 35.7%?     min: 33.7%?     max: 36.9%?
encoding-check
        avg: 28.8%      min: 23.5%      max: 36.2%
cranelift-codegen-check
        avg: 31.2%      min: 26.5%      max: 36.0%
trait-stress-check
        avg: 32.3%      min: 27.8%      max: 34.8%
serde-serde_derive-check
        avg: 28.5%      min: 23.0%      max: 34.4%
html5ever-check
        avg: 27.5%      min: 21.4%      max: 34.1%
ucd-check
        avg: 28.1%      min: 20.8%      max: 33.5%
keccak-check
        avg: 19.9%      min: 11.8%      max: 31.8%
tuple-stress-check
        avg: 28.4%      min: 21.1%      max: 31.6%
inflate-check
        avg: 17.9%      min: 9.7%       max: 30.7%
await-call-tree-check
        avg: 26.6%      min: 23.2%      max: 30.2%
helloworld-check
        avg: 23.3%      min: 16.9%      max: 26.0%
unicode_normalization-check
        avg: 18.9%      min: 16.9%      max: 20.1%
token-stream-stress-check
        avg: 5.9%       min: 3.9%       max: 6.9%

Every single run of every single benchmark regressed, by 3.9% to 84.7%!

I had a microbenchmark where HashMap (with the improvements from #69152) was slightly better than FxHashMap, presumably because the benefit of fewer collisions outweighed the cost of the slower hasher. But that microbenchmark was using integers as keys, which is the best case for DefaultHasher. In the compiler I imagine a lot of the hash tables have more complicated keys containing multiple integers and/or strings.

Oh well, at least it confirms that FxHasher is helping a lot.
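The thread doesn't show the actual three-line diff to librustc_data_structures/fx.rs, but a change of that shape would presumably just swap the hasher behind the type aliases, along these lines (a hypothetical sketch, not the real diff; note that `BuildHasherDefault<DefaultHasher>` keeps the hasher unseeded and therefore deterministic):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::BuildHasherDefault;

// Keep the Fx* names so no call sites change; only the hasher behind
// the aliases is swapped from FxHasher to std's DefaultHasher.
pub type FxHashMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;
pub type FxHashSet<V> = HashSet<V, BuildHasherDefault<DefaultHasher>>;

fn main() {
    let mut m: FxHashMap<u32, &str> = FxHashMap::default();
    m.insert(1, "one");
    assert_eq!(m.get(&1), Some(&"one"));
}
```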

@eddyb (Member) commented Feb 20, 2020

@nnethercote Just to confirm, when you're talking about DefaultHasher, it's still SipHash, right?

I had seen #69153 (comment) and got myself confused into thinking your results were for aHash.

I'm not surprised SipHash is still a slowdown, given its intended use case (DoS protection). I don't think we can get away with a hash function "that good" unless we can rely heavily on hardware acceleration (which, ironically, might make actual cryptographic hashes faster; the acceleration hardware might be reusable for something weaker than full-blown AES, but I have no idea).

@nagisa (Member) commented Feb 20, 2020 via email

@Amanieu (Member) commented Feb 20, 2020

AHash actually has a fallback hash which is almost as fast as FxHash but avoids the issues with poor distribution. This is the main reason why I chose it as the default hash function for hashbrown.

@nnethercote (Contributor, Author)

> @nnethercote Just to confirm, when you're talking about DefaultHasher, it's still SipHash, right?

It's SipHasher13 (as normal) with the improvements from #69152 applied. I haven't looked at AHash at all.

@nnethercote (Contributor, Author) commented Feb 21, 2020

> It may also make sense to evaluate ahash, the hash function that hashbrown switched to due to fxhash's poor hash quality.

I just tried ahash. For some reason I got the fallback hash rather than the AES hardware one, even though I'm on a reasonably new x86-64 box that has AES capability (according to the info from /proc/cpuinfo).

Fallback ahash is slightly worse than fxhash in both instruction counts (mostly 1-2%) and cycles (mostly 1-4%).

ahash is also not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad.

@eddyb (Member) commented Feb 21, 2020

> ahash is also not deterministic across different builds, because it uses const_random! when initializing hasher state. This could cause extra noise in perf runs, which would be bad.

I brought that up earlier, and it's worse than that because it breaks deterministic builds. But you can turn it off (look at the feature list in aHash's Cargo.toml).

However, being slightly worse than fxhash probably means it's not worth spending more time on it (unless there is some inefficiency in the fallback path).
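The opt-out eddyb mentions would look something like this in the experimenting crate's Cargo.toml (hypothetical sketch; the relevant default feature was called compile-time-rng around aHash 0.3, but check the crate's own Cargo.toml for the current name):

```toml
# Disable aHash's default features so the const_random!-based seeding
# is compiled out, restoring deterministic builds.
[dependencies]
ahash = { version = "0.3", default-features = false }
```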

@nnethercote (Contributor, Author) commented Feb 21, 2020

> But you can turn it off (look at the feature list in aHash's Cargo.toml).

I tried that, but it caused compile errors because you don't get a Default impl for AHasher without the compile-time-rng feature.

@nagisa (Member) commented Feb 21, 2020

aHash not having reproducible output across machines makes it an unsuitable choice for rustc anyway, at least if we want any semblance of support for distributed builds.

7 participants