Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hashable should take type index into account #131

Open
daniel-j-h opened this issue Nov 29, 2016 · 4 comments
Open

Hashable should take type index into account #131

daniel-j-h opened this issue Nov 29, 2016 · 4 comments

Comments

@daniel-j-h
Copy link
Contributor

Hashing a variant should take the type index / .which() into account in addition to the underlying hash

  • // hashable iff underlying types are hashable
    namespace std {
    template <typename... Types>
    struct hash< ::mapbox::util::variant<Types...>> {
    std::size_t operator()(const ::mapbox::util::variant<Types...>& v) const noexcept
    {
    return ::mapbox::util::apply_visitor(::mapbox::util::detail::hasher{}, v);
    }
    };
    }
  • // hashing visitor
    struct hasher
    {
    template <typename T>
    std::size_t operator()(const T& hashable) const
    {
    return std::hash<T>{}(hashable);
    }
    };

Why? Because of the edge case where the underlying hash is the same but the type index is not.

@tomilov
Copy link

tomilov commented Nov 30, 2016

Why is it the problem? There are hash collisions anyways. After matching the hashes of two values any algorithm which uses hashes should to compare for equality. Variants with different active alternatives are not equal. Runtime overhead imposed by hash-combining of the value's hash and of the the (excessive) hash of its index is permanent otherwise. It is hardly desirable.

@tomilov
Copy link

tomilov commented Nov 30, 2016

I sure the probability of hash(A) == hash(B) is the same as probability of combine(hash(A), hash(index(A))) == combine(hash(B), hash(index(B))). Say decltype(A) == int and decltype(B) == char, hash is identity for integral types and hash combiner is xor. Then for variant< A, B >'s comparison variant{A{2}} vs variant{B{1}}: 2 ^ 1 == 1 ^ 2. And it is the truth for all the combinations of successive even and odd underlying values. As you can see the probability is equally the same (w/o loss of generality for hashes with avalanche effect and non-linear combiners).

@daniel-j-h
Copy link
Contributor Author

I open this ticket mainly because I read the updates on

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0088r3.html

hash<variant> can now return different values than hash (and it should - presumably it should take the index() into account).

and

http://en.cppreference.com/w/cpp/utility/variant/hash

Unlike std::hashstd::optional, hash of a variant does not typically equal the hash of the contained value; this makes it possible to distinguish std::variant<int, int> holding the same value as different alternatives.

the Boost.Variant impl. does the same, too.

@tomilov
Copy link

tomilov commented Nov 30, 2016

Anyways hash combining every time for any alternative type is permanent runtime overhead. But std::variant< T, T > is hardly useful. Why there should be different hash values for first and second occurence of a T in the alternative type list? What is the rationale behind this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants