Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make deterministic_hash host-architecture-independent #38357

Merged
merged 1 commit into from
Dec 16, 2016

Conversation

arielb1
Copy link
Contributor

@arielb1 arielb1 commented Dec 13, 2016

DefPath::deterministic_hash used to call std::hash::Hash, which depends on the current architecture in several ways, which would prevent metadata written on one host architecture from being successfully read on another one.

Use a hasher we control instead.

Fixes #38177.

r? @michaelwoerister

@michaelwoerister
Copy link
Member

I'll take a more detailed look at this tomorrow, but in the meanwhile this might be interesting:

/// When hashing a type this ends up affecting properties like symbol names. We

@michaelwoerister
Copy link
Member

Thanks for the fix, @arielb1!

I have few comments though:

  • Adding the StableHash trait seems a bit ad hoc. We already have something similar here and it overlaps in functionality with the ArchIndependentHasher
  • You are right about your concerns regarding hashing discriminants, but hashing them explicitly doesn't really scale. We have similar issues with the TypeId hash and also HIR. On the other hand, at least at the moment, the value returned by discriminant_value seems to be platform independent. And I don't see any reason why we could not start guaranteeing that it stays that way.

Ideally, we would just use the IchHasher, which is already built for this purpose (i.e. stable hashing). The module would have to be moved out of rustc_incremental though (to rustc_datastructures maybe). You can find a 64 bit usage example here: https://github.com/rust-lang/rust/blob/master/src/librustc_trans/partitioning.rs#L214

In the long run we might really want to have a general purpose StableHash trait that is supported by a custom derive (with more guarantees for discriminants) and that can be used for type-id hashes, incremental compilation hashes, debuginfo type hashes, and other things, if appropriate.

Copy link
Member

@michaelwoerister michaelwoerister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates!

I'm not completely happy with the fact that StableHasher is bound to the BLAKE2 algorithm, which is rather heavyweight. But for the time being it makes no difference.
If you address the few nits I'm happy to merge.

pub fn write_unsigned_leb128(out: &mut Vec<u8>, start_position: usize, mut value: u64) -> usize {
let mut position = start_position;
#[inline]
pub fn write_unsigned_leb128_to<W>(mut value: u64, mut write: W) -> usize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add documentation on what the requirements for write are? Something like "write is a function that writes a single byte to the output destination at the given position. It is used to write the leb128 value to positions [0..N), where N is the number of bytes the value takes up".

@@ -9,6 +9,7 @@
// except according to those terms.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not move Fingerprint to rustc_datastructures? It's an incremental compilation specific type.


#[inline]
fn write_u8(&mut self, i: u8) {
self.state.write_u8(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytes_hashed is not incremented here.


#[inline]
fn write_i8(&mut self, i: i8) {
self.state.write_i8(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

/// and `i64`.
///
/// The same goes for endianess: We always convert multi-byte integers to little
/// endian before hashing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation was appropriate for ArchIndependentHasher but the implementation below uses the leb128 approach to solve the problem.

@alexcrichton
Copy link
Member

Nominating for beta to ensure we don't forget about this if it misses the beta branch. Note that this should only go into 1.15.0 beta, not the current (as of this writing) 1.14.0 beta.

@alexcrichton alexcrichton added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Dec 15, 2016
The standard implementations of Hasher have architecture-dependent
results when hashing integers. This causes problems when the hashes are
stored within metadata - metadata written by one host architecture can't
be read by another.

To fix that, implement an architecture-independent StableHasher and use
it in all places an architecture-independent hasher is needed.

Fixes rust-lang#38177.
@michaelwoerister
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Dec 15, 2016

📌 Commit e1d4b8f has been approved by michaelwoerister

@bors
Copy link
Contributor

bors commented Dec 16, 2016

⌛ Testing commit e1d4b8f with merge c6d8ab0...

bors added a commit that referenced this pull request Dec 16, 2016
make deterministic_hash host-architecture-independent

`DefPath::deterministic_hash` used to call `std::hash::Hash`, which depends on the current architecture in several ways, which would prevent metadata written on one host architecture from being successfully read on another one.

Use a hasher we control instead.

Fixes #38177.

r? @michaelwoerister
@alexcrichton
Copy link
Member

This made it to beta, de-nominating

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants