Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Other proc macros can break the soundness of our custom derives #388

Open
12 tasks
joshlf opened this issue Sep 17, 2023 · 6 comments
Open
12 tasks

Other proc macros can break the soundness of our custom derives #388

joshlf opened this issue Sep 17, 2023 · 6 comments
Labels
bug Something isn't working compatibility-breaking Changes that are (likely to be) breaking

Comments

@joshlf
Copy link
Member

joshlf commented Sep 17, 2023

This issue tracks soundness holes in our custom derives introduced by other proc macros. Tasks:

  • Enumerate all soundness holes
  • Confirm that the list is complete (prove there are no other soundness holes)
  • Enumerate things which feel like they could be soundness holes, but aren't
    • Write tests that confirm we're robust against these
  • Fix the following known soundness holes:
    • Proc macro attributes which run after our custom derives can modify the type definition "out from under us" (proof of concept). Possible solutions:
      • Update the reference to guarantee the order of evaluation of attributes, and then fail compilation if our custom derives are not in the position which is evaluated last. (Currently, order of evaluation is lexical, but it's not guaranteed to remain this way.)
      • If attributes are stripped out after evaluation so that any subsequently-executed custom derives don't see them in their input token stream, we could simply fail compilation if we see any unrecognized attributes. This would not require the order of evaluation to be guaranteed in order to guarantee soundness, but it would mean that future changes to the order of evaluation could break zerocopy, creating a stability hazard.
  • Confirm that the following hypothesized soundness holes are real:
    • #[cfg_attr(foo, repr(...)] might cause our derive to either fail to notice a repr or notice a repr which is removed in some compilations
    • A future language addition could introduce a built-in attribute which we don't recognize, and which changes the layout or bit validity of a type (one possible solution to this would just be to fail if we see unrecognized attributes).
    • A future language addition could allow custom derives to modify the types they annotate, making them equivalent to proc macro attributes in terms of ability to mess with our custom derives.
  • Write tests to confirm that the following aren't soundness holes:
    • A user could import an attribute macro named repr, and trick our custom derives into thinking it's the "real" repr attribute. This seems not to be a hole for two reasons:
      • When you write #[repr(...)], rustc currently complains that the name is ambiguous, and fails compilation. We need to confirm that this behavior is guaranteed.
      • If we are able to require that our derive is evaluated after all attributes are evaluated and we are guaranteed that all already-executed attributes will not be present in the token stream which is fed to our custom derive, then a repr attribute which is actually a proc macro will be evaluated first, and the token stream passed to our custom derive will not include it - we'll only see "real" repr attributes.

Misc Notes

  • It might be the case that attribute evaluation order has to be guaranteed (whether that was intended in the past or not) because it's observable. In particular, the widely-used technique of a custom derive defining its own attributes (e.g., #[serde(...)]) seems to depend on attribute evaluation order.
@djkoloski
Copy link
Member

I verified that attribute macros can change the definition of an item after derives on it have already run:

use transform::insert_field;

trait A {}
trait B {}

#[derive(Debug)]
#[insert_field(baz = "u32")]
struct Foo {
    bar: i32,
}

fn main() {
    println!("{:?}", Foo {
        bar: 10,
        baz: 100,
    });
}

prints:

Foo { bar: 10 }

Which demonstrates that even though Foo had an extra baz field inserted via attribute proc-macro, the item gets passed to the Debug derive before that modification takes place. Expanding the source with cargo expand also demonstrates the issue. For completeness, the source of the insert_field proc-macro is:

use proc_macro2::{Ident, Span};
use quote::quote;
use syn::{parse_macro_input, DeriveInput, Meta, Lit, Expr, Visibility, token::Colon, Data, Fields, Field, parse_quote};

#[proc_macro_attribute]
pub fn insert_field(
    attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    let meta = parse_macro_input!(attr as Meta);
    let Meta::NameValue(name_value) = meta else { panic!("expected name-value") };
    let name = name_value.path.get_ident().expect("name to be an identifier");
    let Expr::Lit(lit) = &name_value.value else { panic!("expected literal") };
    let Lit::Str(ty_name) = &lit.lit else { panic!("expected string") };
    let ident = Ident::new(&ty_name.value(), Span::call_site());

    let mut item = parse_macro_input!(item as DeriveInput);
    let Data::Struct(struct_) = &mut item.data else { panic!("expected struct") };
    let Fields::Named(fields) = &mut struct_.fields else { panic!("expected fields") };
    fields.named.push(Field {
        attrs: Vec::new(),
        vis: Visibility::Inherited,
        mutability: syn::FieldMutability::None,
        ident: Some(name.clone()),
        colon_token: Some(Colon { spans: [Span::call_site()]}),
        ty: parse_quote!(#ident),
    });

    let output = quote! {
        #item
    };

    output.into()
}

@jswrenn
Copy link
Collaborator

jswrenn commented Sep 27, 2023

The status quo is that proc macros evaluate from the outside in. We should confirm this is specified, and do what we can to mitigate it.

We could defend against @djkoloski's example by also emitting code that destructures the annotated type, thus ensuring that there would be a compile error if the definition changed.

However, imagine a proc-macro attribute that only removed (or tampered) with #[repr(C)] from annotated definitions, but left fields unchanged. For this, the only mitigation I can see is forbidding the presence of unknown attributes.

@joshlf
Copy link
Member Author

joshlf commented Sep 27, 2023

IIUC, guaranteeing evaluation order should be enough to mitigate the "unknown attribute" problem: We just ensure that we're placed in a location that evaluates after any attribute macros.

That still leaves open the question of shadowing attributes by name - e.g., introducing a proc macro attribute called repr that our custom derives mistakenly think is the built-in repr attribute.

Another thought: Does the token stream emitted by a proc macro attribute include the proc macro attribute annotation? If not, we should expect that any proc macro attributes which execute before us will no longer be present in the token stream that we see. This should mean that a proc macro attribute which shadows repr would be removed by the time it gets to us, and we'd only see "real" repr attributes.

@jswrenn
Copy link
Collaborator

jswrenn commented Sep 27, 2023

I've confirmed that it's not possible to shadow repr attributes. Doing so produces an error:

`repr` is ambiguous
ambiguous because of a name conflict with a builtin attribute
use `crate::repr` to refer to this attribute macro unambiguously

@joshlf
Copy link
Member Author

joshlf commented Sep 27, 2023

Phew

@joshlf joshlf changed the title Could attributes cause unsoundness in our derives? Other proc macros can break the soundness of our custom derives Sep 27, 2023
@joshlf
Copy link
Member Author

joshlf commented Jan 31, 2024

cc @reinerp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working compatibility-breaking Changes that are (likely to be) breaking
Projects
None yet
Development

No branches or pull requests

3 participants