Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify documentation on HashMap::capacity() #24591

Closed
tomjakubowski opened this issue Apr 19, 2015 · 8 comments
Closed

Clarify documentation on HashMap::capacity() #24591

tomjakubowski opened this issue Apr 19, 2015 · 8 comments

Comments

@tomjakubowski
Copy link
Contributor

The documentation for HashMap::capacity() says:

Returns the number of elements the map can hold without reallocating.

However, a HashMap will sometimes store more elements than its capacity indicates. See this example (Playpen link):

use std::collections::HashMap;

fn main() {
    let mut map = HashMap::new();
    for i in 0..1000 {
        map.insert(i, i);
        let (cap, len) = (map.capacity(), map.len());
        assert!(cap >= len, "cap {} len {}", cap, len);
    }
}
@sfackler
Copy link
Member

Seems to me like this is more of a bug in capacity rather than a docs issue. cc @gankro

@Gankra
Copy link
Contributor

Gankra commented Apr 19, 2015

This is intended behaviour. It's a lower-bound.

@tomjakubowski
Copy link
Contributor Author

@gankro maybe the wrong venue for this question, but as a user of HashMap is there a way to know whether an operation resizes a HashMap + moves its allocation, either in advance or after the operation has completed? More precisely, I guess I'd like to know whether any value in the HashMap has been relocated.

If values are only moved when a resize happens I suppose I could just take a pointer to some value before the operation, and then locate it again and compare its address after the operation.

@pnkfelix
Copy link
Member

@tomjakubowski I think you cannot predict whether a value will be moved, regardless of whether the hashmap is resized.

In particular, we use robin-hood hashing which can move elements around (essentially as an attempt to move the most frequently accessed elements to an earlier point in the chain of hash-colliding elements).

@Gankra
Copy link
Contributor

Gankra commented Apr 19, 2015

robin-hood doesn't do "most frequent" but rather "furthest from its ideal". But yeah all the elements are in general slowly drifting to higher addresses. You'd need a chaining hashmap to try to guarantee position properties. Even then an optimized chaining map wouldn't try to preserve that property.

If you want to know if a reallocation happened just check if the capacity has increased. It's a lower-bound but it's fairly accurate.

@tomjakubowski
Copy link
Contributor Author

Ah, I see. Thanks for the clarification @gankro and @pnkfelix

@steveklabnik
Copy link
Member

So, this isn't a bug then? Or should we clarify that it's a lower bound?

@Gankra
Copy link
Contributor

Gankra commented Apr 20, 2015

Hmm... I thought I specifically worded the docs to make this more clear (or at least, more debateable). Yes clarifying that for HashMap it is a lower bound seems appropriate. Although it might not even be that if we decide to allow reallocs if degenerate collisions are detected...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants