Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instruct LLVM that binary_search returns a valid index #81354

Merged
merged 2 commits into from
Mar 28, 2021

Conversation

SkiFire13
Copy link
Contributor

This allows removing bound checks when the return value of binary_search is used to index into the slice it was call on. I also added a codegen test for this, not sure if it's the right thing to do (I didn't find anything on the dev guide), but it felt so.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 24, 2021
@Aaron1011
Copy link
Member

This seems like the wrong way of fixing the issue. get_unchecked requires that the index be in bounds - if LLVM can't remove the bounds check, then get_unchecked itself should be fixed instead of trying to work around the issue at a particular callsite.

@Mark-Simulacrum
Copy link
Member

I assume that the get unchecked doesn't emit a bounds check, but for some reason isn't propagating the information to outside the function, whereas the assume does - note that the test case checks for the lack of problems indexing with the returned usize.

I do agree that it may be worth considering the assume being inlined into get unchecked instead.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 24, 2021
@SkiFire13
Copy link
Contributor Author

This seems like the wrong way of fixing the issue. get_unchecked requires that the index be in bounds - if LLVM can't remove the bounds check, then get_unchecked itself should be fixed instead of trying to work around the issue at a particular callsite.

get_unchecked requires the index to be in bounds, however that is a check the programmer has to do, otherwise the result is UB. This check was already done in the implementation of binary_search, in fact there aren't bound checks in its implementation. However if that, index after being returned, is used to safely index the slice on which binary_search was called, you'll get a bound check that isn't optimized away. The problem is that this bound check is useless because the index has been proved by us to be valid, but LLVM has no way to know that. That assume tells LLVM just that and allows that bound check to be optimized.

I do agree that it may be worth considering the assume being inlined into get unchecked instead.

I didn't thought of that, I based this on the implementation of Iterator::position for slice::Iter which however doesn't use get_unchecked. Still, it looks like a good idea to me.

@Mark-Simulacrum
Copy link
Member

Well, in theory, LLVM could see that the access through get unchecked guarantees that the index is valid, but I imagine it would be quite difficult to demonstrate that.

Cc @nagisa @cuviper @nikic - IIRC, in the past assume in common methods like get_unchecked has actually been rather unhelpful, but I don't recall if we've specifically tried it with get_unchecked.

@nagisa
Copy link
Member

nagisa commented Jan 24, 2021

in the past assume in common methods like get_unchecked has actually been rather unhelpful

That's correct – in the past any attempts to add assume would only help with very handpicked samples and not others. Furthermore assume often has a nasty side-effect of preventing other optimisations from occurring, so we have refrained from adding assumes to the standard library implementation where the intent is, like done here, to improve certain very specific examples.

@SkiFire13
Copy link
Contributor Author

Well, in theory, LLVM could see that the access through get unchecked guarantees that the index is valid, but I imagine it would be quite difficult to demonstrate that.

Aren't bound checks basically ifs? I don't think they be elided just by knowing a certain memory access is valid.

That's correct – in the past any attempts to add assume would only help with very handpicked samples and not others. Furthermore assume often has a nasty side-effect of preventing other optimisations from occurring, so we have refrained from adding assumes to the standard library implementation where the intent is, like done here, to improve certain very specific examples.

Is this about adding assume to general purpose methods, like get_unchecked, or to specific ones like position or binary_search? I guess with general methods it's pretty rare that that condition is later used, however with more specific ones I would expect it to be used more often.

@the8472
Copy link
Member

the8472 commented Jan 25, 2021

Could the search be implemented by narrowing a slice and then doing a ptr::offset_from to the original slice and returning the offset? Perhaps that'll make it easier for llvm to see that it is derived from the original reference.

@SkiFire13
Copy link
Contributor Author

I don't think that LLVM really cares if it's derived from the original reference. The bound check tests for the index being less than the length, which is just another number for LLVM.

@the8472
Copy link
Member

the8472 commented Jan 25, 2021

This works as an alternative to assume.

        if base < ary.len() {
            if cmp == Equal { Ok(base) } else { Err(base + (cmp == Less) as usize) }
        } else {
            unsafe { std::hint::unreachable_unchecked() }
        }

https://rust.godbolt.org/z/9o9oq8

Edit: nevermind, this also boils down to assume in the IR.

@ojeda
Copy link
Contributor

ojeda commented Jan 31, 2021

This seems related to what I see when using the newly introduced unwrap_unchecked() at #81383.

I have filled https://bugs.llvm.org/show_bug.cgi?id=48975 with the IR reduction and an analysis of why manually integrating the function (e.g. via a macro) helps sometimes. If solved, it may solve also this one as well as others like:

// NOTE: for `I: FusedIterator`, we assume that the iterator is always `Some`.
// Implementing this as a directly-expanded macro helps codegen performance.
macro_rules! unchecked {
($self:ident) => {
match $self {
Fuse { iter: Some(iter) } => iter,
// SAFETY: the specialized iterator never sets `None`
Fuse { iter: None } => unsafe { intrinsics::unreachable() },
}
};
}

@SkiFire13
Copy link
Contributor Author

Oops, I pushed a commit to the wrong branch

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 22, 2021
@bors
Copy link
Contributor

bors commented Mar 5, 2021

☔ The latest upstream changes (presumably #74024) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon
Copy link
Member

Triage:
@SkiFire13 - thanks for resolving, setting this to waiting on review.

@rustbot label: +S-waiting-on-review -S-waiting-on-author

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 22, 2021
@Mark-Simulacrum
Copy link
Member

Going to re-assign to @nagisa who can better judge whether we should do this, but past experience hints to me that this is likely to hurt more than help in the common case.

@nagisa
Copy link
Member

nagisa commented Mar 28, 2021

#80836 comes to mind as a counterexample to usefulness of this kind of optimisation. Basically, it really matters In this particular case what the implementation of the len method is. Given this is a slice the len method is going to be fairly trivial (a field read). Thus an assume like this one does have a fair chance of improving other code that does not directly rely on the len() method, but otherwise interacts with the metadata portion of the slice's fat pointer.

@bors r+ with that in mind, but happy to revert if anybody pinpoints this to cause any kind of regression in their code.

@bors
Copy link
Contributor

bors commented Mar 28, 2021

📌 Commit c9d04c2 has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 28, 2021
@Mark-Simulacrum
Copy link
Member

@bors rollup=never

@bors
Copy link
Contributor

bors commented Mar 28, 2021

⌛ Testing commit c9d04c2 with merge 1df2056...

@bors
Copy link
Contributor

bors commented Mar 28, 2021

☀️ Test successful - checks-actions
Approved by: nagisa
Pushing 1df2056 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 28, 2021
@bors bors merged commit 1df2056 into rust-lang:master Mar 28, 2021
@rustbot rustbot added this to the 1.53.0 milestone Mar 28, 2021
@SkiFire13 SkiFire13 deleted the binary-search-assume branch March 28, 2021 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants