Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACP: key_mut method for BTreeMap entries #356

Open
clarfonthey opened this issue Mar 18, 2024 · 2 comments
Open

ACP: key_mut method for BTreeMap entries #356

clarfonthey opened this issue Mar 18, 2024 · 2 comments

Comments

@clarfonthey
Copy link

Proposal

Problem statement

This is an ACP for a version of rust-lang/rust#112896 that supports both safe (checked) and unsafe (unchecked) mutations in one API.

Motivating examples or use cases

Essentially: we would like to be able to have both a safe and an unsafe version, but would like to make it easy for a user to directly mutate a key rather than having to pass in a closure.

This would be similar to the BinaryHeap::peek_mut wrapper which fulfills a similar task.

Solution sketch

For now, unless I can see another alternative, we'll have to have three types for each entry: one VacantKeyMut, one OccupiedKeyMut, and one KeyMut that is an enum of the two. I haven't investigated the internals of how BTreeMap works well enough to be able to determine if we actually can get away with a single KeyMut API, so, I think the ACP should assume that there would be three of them.

Similar to BinaryHeap::peek_mut, we can obtain the handle using a mutable borrow:

impl<...> Entry<...> {
    pub fn key_mut(&mut self) -> KeyMut<'_, ...> { ... }
}

And, also similarly to BinaryHeap::peek_mut, we would have to perform a bit of trickery to ensure that the handle works correctly when leaked. So, this is effectively how it would go:

  1. When DerefMut is called, part of the map or entry would be removed and stored temporarily in the handle. If the handle is leaked before this happens, no part of the map is leaked, but if it's leaked after this happens, some of the map will be leaked. The result is that regardless of what happens, the state of the map is still consistent. Additionally, any future operations on the entry will likely be part of the leaked map and not the actual map.
  2. When Drop is called, if the handle contains part of the map, the key is verified and the map is restored into the correct state based upon the new value of the key. This means that a vacant entry would have to recompute the new location that the key would be inserted into, and an occupied entry would have to move the existing node to the appropriate position in the map.

However, there's an important third step which is unique to this handle, where this entire validation can be skipped with a method that consumes the handle and restores the map without performing any extra checks:

impl<...> KeyMut<...> {
    pub fn skip_key_validation(self) { ... }
}

Currently, all existing APIs for key_mut use unsafe, but after reading more closely, I don't think that this method should be unsafe, since it's already stated that interior mutability can cause the ordering to break, and this isn't unsafe, just a logic error. We could make it unsafe to indicate that you shouldn't do this, but at least for now, I'd like to propose making it safe and then we can talk about it.


A final point of discussion here is whether this is acceptable from the unchecked point of view, since there are additional operations added compared to the plain key_mut method. I believe that these can easily be optimised out, since the order of operations would be roughly:

  1. spot_in_entry = Some(mem::take(&mut spot_in_map))
  2. *key = new_key
  3. if let Some(thing) = spot_in_entry { spot_in_map = mem::take(spot_in_entry) }

Which could reorder to:

  1. *key = new_key
  2. spot_in_entry = Some(mem::take(&mut spot_in_map))
  3. if let Some(thing) = spot_in_entry { spot_in_map = mem::take(&mut spot_in_entry) }

Remove the branch:

  1. *key = new_key
  2. spot_in_entry = Some(mem::take(&mut spot_in_map))
  3. spot_in_map = mem::take(&mut spot_in_entry) }

Notice the double-swap, and then remove it. Of course, we'd have to be careful about how the code is written to help facilitate this, but I think that at worst, we'd probably end up with a constant number of extra writes, which is still substantially faster than O(log(N)) comparisons.

Alternatives

A few here:

  1. We don't add any key mutation APIs. I disagree with this approach, since these APIs were one of the key motivating factors for the cursors API in BTreeMap cursors #141.
  2. We make the skip_key_validation method unsafe. Again, I disagree because this can't actually cause UB, but this is a valid alternative.
  3. We go with the previous plan, which is to add a separate key_mut_unchecked method offering &mut K and a key_mut method which accepts a closure. I'm not a fan of the closure-based API, and BinaryHeap::peek_mut does provide precedent here, but this is an alternative.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@programmerjake
Copy link
Member

I know it was mentioned somewhere else, but one possible reason that unsafe would still be desired is that allows unsafe code to depend on BTreeMap<TrustedType, T> being sorted, duplicate free, etc. if it can unsafely trust TrustedType (e.g. if TrustedType is really i32)

@clarfonthey
Copy link
Author

clarfonthey commented May 15, 2024

I think that's a reasonable argument, but it feels like an unprecedented way to define unsafety in the standard library. The weird reliance on this particular method being unsafe to uphold a safety contract when other related methods that also must uphold that contract are not unsafe (particularly, the Ord implementation), feels just wrong.

Additionally, it denies the opportunity to use the method in code bases that wants to avoid unsafe code, since it cannot actually trigger unsafety on its own and would otherwise be fine to use in these code bases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants