From 69a6f705a26f7599f92834df735809806f75989f Mon Sep 17 00:00:00 2001
From: Isaac van Bakel
Date: Mon, 24 Jul 2023 14:57:15 +0200
Subject: [PATCH 1/3] Clarify UB around immutability & mutation

I personally found this description of UB confusing, since the use of
"reached" suggests that UB only happens for read bytes, and the
definition of immutability is not given, allowing for multiple
interpretations: does the "data" have to be immutable from the first
read? From the creation of the reference? Between reads from the
immutable accessor, but not otherwise? etc.

This clarifies the actual UB conditions, based on this Zulip interaction:
https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/What.20exactly.20are.20.22immutable.22.20and.20.22reached.22.20in.20shared.20ref.20UB.3F
and this reference discussion:
https://github.com/rust-lang/reference/issues/1227
in two ways:

  * The definition of "data" is clarified to be stated in terms of bytes,
    in a way that should avoid ambiguity about which bytes are considered.
    Based on the GH issue, this clarification should also allow for use of
    a `*mut` pointer through a shared reference, which is not in itself UB.
    Based on the Zulip issue, the definition includes padding bytes, which
    may be surprising.
  * The definition of immutability & mutation for a set of bytes is
    clarified to mean forbidding *all* non-0-byte writes.
---
 src/behavior-considered-undefined.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/behavior-considered-undefined.md b/src/behavior-considered-undefined.md
index 28cde7a94..020205cb1 100644
--- a/src/behavior-considered-undefined.md
+++ b/src/behavior-considered-undefined.md
@@ -42,9 +42,12 @@ code.
   All this also applies when values of these types are passed in a (nested)
   field of a compound type, but not behind pointer indirections.
 
-* Mutating immutable data. All data inside a [`const`] item is immutable. Moreover, all
-  data reached through a shared reference or data owned by an immutable binding
-  is immutable, unless that data is contained within an [`UnsafeCell`].
+* Mutating immutable data. All bytes inside a [`const`] item are immutable.
+  Moreover, the bytes of a value pointed to by a shared reference, or bytes owned by an immutable binding are immutable, unless those bytes are part of an [`UnsafeCell`].
+  Immutability also affects bytes which are not reachable from safe code, such as padding; it also affects uninitialized bytes.
+
+  A mutation is any write of more than 0 bytes which overlaps with any of the relevant bytes.
+  Writes which do not modify the byte contents (i.e. writes of a byte's value to that byte) are still mutations.
 * Invoking undefined behavior via compiler intrinsics.
 * Executing code compiled with platform features that the current platform
   does not support (see [`target_feature`]), *except* if the platform explicitly documents this to be safe.

From 70886e3c49e03aa559ae678791eeebcf4480a2db Mon Sep 17 00:00:00 2001
From: Isaac van Bakel
Date: Mon, 24 Jul 2023 14:57:17 +0200
Subject: [PATCH 2/3] Define immutability UB in terms of bytes

This is part of the feedback on rust-lang/reference#1385.

Ralf made the point that the immutability definition could be restated
solely in terms of bytes, which has the added benefit of no longer
requiring the note on padding (since it's a natural consequence of the
byte version.)

The new wording for shared references also clarifies the case of mutable
references behind shared ones, and reintroduces some of the transitivity
property that I removed in my previous commit. The wording is separate
from that for immutable bindings, since those don't have transitive
immutability.

This also bumps the definition of bytes pointed to by references and
pointers into its own subsection, so that it can be linked to by the UB
definition, to avoid duplication.

Co-authored-by: Ralf Jung
---
 src/behavior-considered-undefined.md | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/src/behavior-considered-undefined.md b/src/behavior-considered-undefined.md
index 020205cb1..4ffa703c1 100644
--- a/src/behavior-considered-undefined.md
+++ b/src/behavior-considered-undefined.md
@@ -42,12 +42,14 @@ code.
   All this also applies when values of these types are passed in a (nested)
   field of a compound type, but not behind pointer indirections.
 
-* Mutating immutable data. All bytes inside a [`const`] item are immutable.
-  Moreover, the bytes of a value pointed to by a shared reference, or bytes owned by an immutable binding are immutable, unless those bytes are part of an [`UnsafeCell`].
-  Immutability also affects bytes which are not reachable from safe code, such as padding; it also affects uninitialized bytes.
+* Mutating immutable bytes. All bytes inside a [`const`] item are immutable.
+  The bytes owned by an immutable binding are immutable, unless those bytes are part of an [`UnsafeCell`].
+
+  Moreover, the bytes [pointed to] by a shared reference, including transitively through other references (both shared and mutable) and `Box`es, are immutable: transitivity includes those references stored in fields of compound types.
 
   A mutation is any write of more than 0 bytes which overlaps with any of the relevant bytes.
-  Writes which do not modify the byte contents (i.e. writes of a byte's value to that byte) are still mutations.
+
+  > **Note**: Writes which do not modify the byte contents (i.e. writes of a byte's value to that byte) are still mutations.
 * Invoking undefined behavior via compiler intrinsics.
 * Executing code compiled with platform features that the current platform
   does not support (see [`target_feature`]), *except* if the platform explicitly documents this to be safe.
@@ -94,13 +96,16 @@ reading uninitialized memory is permitted are inside `union`s and in "padding"
 > vice versa, undefined behavior in Rust can cause adverse affects on code
 > executed by any FFI calls to other languages.
 
+### Pointed-to bytes
+
+The span of bytes a pointer or reference "points to" is determined by the pointer value and the size of the pointee type (using `size_of_val`).
+
 ### Dangling pointers
 [dangling]: #dangling-pointers
 
 A reference/pointer is "dangling" if it is null or not all of the bytes it
-points to are part of the same live allocation (so in particular they all have to be
-part of *some* allocation). The span of bytes it points to is determined by the
-pointer value and the size of the pointee type (using `size_of_val`).
+[points to] are part of the same live allocation (so in particular they all have to be
+part of *some* allocation).
 
 If the size is 0, then the pointer must either point inside of a live allocation
 (including pointing just after the last byte of the allocation), or it must be
@@ -124,3 +129,5 @@ must never exceed `isize::MAX`.
 [dereference expression]: expressions/operator-expr.md#the-dereference-operator
 [place expression context]: expressions.md#place-expressions-and-value-expressions
 [rules]: inline-assembly.md#rules-for-inline-assembly
+[points to]: #pointed-to-bytes
+[pointed to]: #pointed-to-bytes

From f12eaec52214b20287687924e7ef469f5d82c2a8 Mon Sep 17 00:00:00 2001
From: Isaac van Bakel
Date: Tue, 25 Jul 2023 17:26:46 +0200
Subject: [PATCH 3/3] Style fixups in immutability UB

These changes should preserve the meaning of the contents.

Co-authored-by: Ralf Jung
---
 src/behavior-considered-undefined.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/behavior-considered-undefined.md b/src/behavior-considered-undefined.md
index 4ffa703c1..9d1732d07 100644
--- a/src/behavior-considered-undefined.md
+++ b/src/behavior-considered-undefined.md
@@ -45,11 +45,9 @@ code.
 * Mutating immutable bytes. All bytes inside a [`const`] item are immutable.
   The bytes owned by an immutable binding are immutable, unless those bytes are part of an [`UnsafeCell`].
 
-  Moreover, the bytes [pointed to] by a shared reference, including transitively through other references (both shared and mutable) and `Box`es, are immutable: transitivity includes those references stored in fields of compound types.
+  Moreover, the bytes [pointed to] by a shared reference, including transitively through other references (both shared and mutable) and `Box`es, are immutable; transitivity includes those references stored in fields of compound types.
 
-  A mutation is any write of more than 0 bytes which overlaps with any of the relevant bytes.
-
-  > **Note**: Writes which do not modify the byte contents (i.e. writes of a byte's value to that byte) are still mutations.
+  A mutation is any write of more than 0 bytes which overlaps with any of the relevant bytes (even if that write does not change the memory contents).
 * Invoking undefined behavior via compiler intrinsics.
 * Executing code compiled with platform features that the current platform
   does not support (see [`target_feature`]), *except* if the platform explicitly documents this to be safe.
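
Illustrative note (not part of the patch): the wording in this series can be read against a small Rust sketch. It is only one possible illustration, written under assumptions: the names `Padded`, `bump`, `pointed_to_len` and `demo` are hypothetical and do not appear in the reference. The sketch shows the one kind of write through a shared reference that the new text permits (bytes inside an `UnsafeCell`) and uses `size_of_val` to measure the span of pointed-to bytes, which includes padding.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of_val;

/// Hypothetical type for illustration. With `repr(C)`, `flag` sits at
/// offset 0, three padding bytes follow, and `count` sits at offset 4.
#[repr(C)]
struct Padded {
    flag: u8,
    count: UnsafeCell<u32>,
}

/// Increments `count` through a shared reference and returns the new value.
/// Only the four bytes inside the `UnsafeCell` are written.
fn bump(shared: &Padded) -> u32 {
    // SAFETY: nothing else reads or writes `count` while this function
    // runs, and bytes inside an `UnsafeCell` are exempt from the
    // immutability of bytes behind a shared reference.
    unsafe {
        *shared.count.get() += 1;
        *shared.count.get()
    }
}

/// Length in bytes of the span a reference points to, per `size_of_val`.
fn pointed_to_len<T: ?Sized>(r: &T) -> usize {
    size_of_val(r)
}

/// Returns (new count, pointed-to length of the struct, of a slice).
fn demo() -> (u32, usize, usize) {
    // Immutable binding: every byte it owns is immutable except those
    // that are part of the `UnsafeCell`.
    let p = Padded { flag: 1, count: UnsafeCell::new(41) };
    let count = bump(&p);
    let _flag = p.flag; // reading immutable bytes is always fine

    // For a slice, the span is element size times length.
    let words: &[u16] = &[1, 2, 3];
    (count, pointed_to_len(&p), pointed_to_len(words))
}
```

Under the wording above, writing to `flag` or to the padding bytes through a pointer derived from `&p` would be a mutation of immutable bytes, and therefore undefined behavior, even if the write stored the value already present. For that reason the sketch deliberately contains no such write.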