ptr::add/sub: do not claim equivalence with offset(c as isize)
#130229
base: master
Some changes occurred to the CTFE / Miri engine
cc @rust-lang/miri
The Miri subtree was changed
cc @rust-lang/miri
Force-pushed: ecb451b to d7c4504
library/core/src/ptr/const_ptr.rs
@@ -355,7 +355,8 @@ impl<T: ?Sized> *const T {
 ///
 /// If any of the following conditions are violated, the result is Undefined Behavior:
 ///
-/// * The computed offset, `count * size_of::<T>()` bytes, must not overflow `isize`.
+/// * The computed offset, `count * size_of::<T>()` bytes (using unbounded arithmetic),
+/// must fit in an `isize`.
What is the most clear and concise way to say "convert `count` and `size_of::<T>()` from whatever their types are into unbounded mathematical integers, multiply those, and check if the result fits in the value range of `isize`"?
Maybe flip it around? Could say the same thing as: if `sizeof(T) > 0`, then `count <= isize::MAX / sizeof(T)`.
Alternatively, could say it as something like: `usize::saturating_mul(count, size_of::<T>)` fits in an `isize`.
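The two phrasings suggested in this comment can be compared directly. The following is only a sketch: `fits_by_division` and `fits_by_saturating_mul` are hypothetical helper names, not functions from the PR or the standard library, and the check is restricted to an unsigned `count` as used by `add`/`sub`.

```rust
use std::mem::size_of;

/// Hypothetical helper: the division-based phrasing.
/// A zero-sized `T` always yields a byte offset of 0, which trivially fits.
fn fits_by_division<T>(count: usize) -> bool {
    let size = size_of::<T>();
    size == 0 || count <= isize::MAX as usize / size
}

/// Hypothetical helper: the saturating-multiplication phrasing.
/// On overflow the product saturates to `usize::MAX`, which is above `isize::MAX`.
fn fits_by_saturating_mul<T>(count: usize) -> bool {
    usize::saturating_mul(count, size_of::<T>()) <= isize::MAX as usize
}

fn main() {
    let max = isize::MAX as usize;
    // Sample counts around the boundary for a 4-byte element type.
    for count in [0, 1, max / 4, max / 4 + 1, max, usize::MAX] {
        assert_eq!(
            fits_by_division::<u32>(count),
            fits_by_saturating_mul::<u32>(count)
        );
        // Zero-sized types: every count is acceptable under both phrasings.
        assert!(fits_by_division::<()>(count));
        assert!(fits_by_saturating_mul::<()>(count));
    }
    assert!(fits_by_division::<u32>(max / 4));
    assert!(!fits_by_division::<u32>(max / 4 + 1));
}
```

Both helpers agree on the sampled values; which phrasing reads better in documentation is a separate question from whether they accept the same counts.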
FWIW, the docs on `offset_from` say
> the absolute distance between the pointers, in bytes, computed on mathematical integers (without "wrapping around"), cannot overflow an `isize`
so maybe something like this could work
> The offset in bytes (`count * size_of::<T>()`), computed on mathematical integers (without "wrapping around"), must fit in an `isize`.
I think "mathematical multiplication" is the most intuitive version here -- good catch with the `offset_from` docs, I will follow that.
To me this feels like we're changing the docs here to introduce new UB where there was none before. FWIW I'm in favor of this change and see how the new definition is much more useful, but I still feel that we should at least do a crater run (with an assert for the overflow) or something to see how widespread this misuse of |
The `byte_add`/`byte_sub` methods also need to be updated, especially since `byte_add` also cannot go "backwards".
This is skirting the edge of clarifying docs vs introducing new UB. "must not overflow I'm not sure such a crater run would be meaningful though, it would also catch all the cases that already were UB before... |
Force-pushed: d7c4504 to 0e627de
I have moved the Miri changes into a separate PR (#130239) so it is not held up by discussions around how to deal with the "kind of a breaking change" aspect of this. I also incorporated the feedback. The |
Thanks, the new wording is clearer.
Force-pushed: 0e627de to 916eb13
… r=compiler-errors miri: fix overflow detection for unsigned pointer offset This is the Miri part of rust-lang#130229. This is already UB in codegen so we better make Miri detect it; updating the docs may take time if we have to follow some approval process, but let's make Miri match reality ASAP. r? `@scottmcm`
It is definitely worth adding. I'm working on it.
Rollup merge of rust-lang#130239 - RalfJung:miri-ptr-offset-unsigned, r=compiler-errors miri: fix overflow detection for unsigned pointer offset This is the Miri part of rust-lang#130229. This is already UB in codegen so we better make Miri detect it; updating the docs may take time if we have to follow some approval process, but let's make Miri match reality ASAP. r? ``@scottmcm``
I think this is a good idea. Looking at the docs now I think this "convenience for ... as isize" only adds confusion.
All right so what's the process here? This is a library API so by default I would assume t-libs-api is responsible. This is fairly directly exposing a language primitive though, which is why I pinged t-lang above. But still, let's nominate t-libs-api -- @rust-lang/libs-api are you okay with this change to the docs for our inbounds pointer arithmetic methods? This can be seen as a breaking change since we previously documented e.g. OTOH, the docs do say:
I would say if It seems unlikely that someone would rely on |
FWIW, I've always interpreted the "convenience" statement as a path-dependent "here's why you'd use this".
However, if Of course that's a very unlikely thing to happen... |
Both were added before we realized that we could not actually offset something by more than |
@rfcbot merge
Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Looking at #130211 again it seems to me that actually we don't yet have codegen that makes things like |
@RalfJung Yes, that's correct. The backend doesn't make use of this right now, but intends to do so in the future. We'll likely start with LLVM 20, as |
Can we add an ub_check for this?
Seems like saethlin is on it :)
The draft PR is #130251
🔔 This is now entering its final comment period, as per the review above. 🔔
In #110837, the `offset` intrinsic got changed to also allow a `usize` offset parameter. The intention is that this will do an unsigned multiplication with the size, and we have UB if that overflows -- and we also have UB if the result is larger than `isize::MAX`, i.e., if a subsequent cast to `isize` would wrap. The LLVM backend sets some attributes accordingly.

This updates the docs for `add`/`sub` to match that intent, in preparation for adjusting codegen to exploit this UB. We use this opportunity to clarify what the exact requirements are: we compute the offset using mathematical multiplication (so it's no problem to have an `isize * usize` multiplication, we just multiply integers), and the result must fit in an `isize`.

Cc @rust-lang/opsem @nikic
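One way to read "mathematical multiplication" is to widen both operands into a type large enough that the product cannot wrap, and only then test the range. The sketch below does this with `i128`; `byte_offset_fits_isize` is a hypothetical name used for illustration, and the code assumes a target whose pointers are at most 64 bits wide so that both `isize` and `usize` counts convert to `i128` without loss.

```rust
use std::mem::size_of;

/// Hypothetical helper: multiply `count` by `size_of::<T>()` as wide integers
/// and check that the product lies in the value range of `isize`.
fn byte_offset_fits_isize<T>(count: i128) -> bool {
    let size = size_of::<T>() as i128;
    match count.checked_mul(size) {
        Some(bytes) => isize::try_from(bytes).is_ok(),
        // The product does not even fit in i128, so it cannot fit in isize.
        None => false,
    }
}

fn main() {
    // Signed counts (as for `offset`) and unsigned counts (as for `add`/`sub`)
    // are handled by the same check once widened.
    assert!(byte_offset_fits_isize::<u16>(-4));
    assert!(byte_offset_fits_isize::<u16>(1000));
    assert!(byte_offset_fits_isize::<u8>(isize::MIN as i128));

    // A count whose wrapping product looks harmless: 2^(BITS-1) * 2 wraps to 0.
    let huge: usize = 1 << (usize::BITS - 1);
    assert_eq!(huge.wrapping_mul(size_of::<u16>()), 0);
    // Computed without wrapping, the same product is far outside `isize`.
    assert!(!byte_offset_fits_isize::<u16>(huge as i128));
}
```

The last two assertions show why the wording matters: a wrapping multiplication can produce a small result even though the true product is out of range.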
#130239 updates Miri to detect this UB.
`sub` still has some cases of UB not reflected in the underlying intrinsic semantics (and Miri does not catch): when we subtract `usize::MAX`, then after casting to `isize` that's just `-1` so we end up adding one unit without noticing any UB, but actually the offset we gave does not fit in an `isize`. Miri will currently still not complain for such cases.

However, the LLVM IR we generate here also is UB-free. This is "just" library UB but not language UB.
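The arithmetic behind this `sub` corner case can be reproduced with plain integers, without touching any pointer. The sketch below assumes, as the description above implies, that `sub(count)` behaves like an offset by the negated `count as isize`; `effective_sub_offset` and `count_fits_isize` are hypothetical names, and an element size of one byte is assumed.

```rust
/// Hypothetical model of the signed offset that `sub(count)` ends up applying:
/// cast the count to `isize` (which may wrap), then negate.
fn effective_sub_offset(count: usize) -> isize {
    (count as isize).wrapping_neg()
}

/// Hypothetical check for the documented requirement (element size 1 assumed):
/// the offset itself must fit in an `isize`.
fn count_fits_isize(count: usize) -> bool {
    isize::try_from(count).is_ok()
}

fn main() {
    let count = usize::MAX;

    // The cast wraps around to -1 ...
    assert_eq!(count as isize, -1);
    // ... so the modelled `sub` moves forward by one unit.
    assert_eq!(effective_sub_offset(count), 1);
    // Yet the requested offset violates the documented requirement.
    assert!(!count_fits_isize(count));

    // An ordinary count behaves as expected under the same model.
    assert_eq!(effective_sub_offset(3), -3);
    assert!(count_fits_isize(3));
}
```

Under this model the resulting one-unit offset is unremarkable on its own, which matches the observation that the violation exists only at the level of the documented library contract.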
Cc @saethlin; might be worth adding precondition checks against overflow on `offset`/`add`/`sub`?

Fixes #130211