Short Macro Invocation Syntax: m!123 and m!"abc" #3267

Closed
wants to merge 3 commits into from
103 changes: 103 additions & 0 deletions text/0000-macro-shorthand.md
@@ -0,0 +1,103 @@
- Feature Name: `macro_shorthand`
- Start Date: 2022-05-18
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This is a proposal for `m!literal` macro invocation syntax,
for macros that feel like literals:

```rust
let num = bignum!12345678901234567890123456789012345678901234567890;
let msg = f!"n = {num}";
let file = w!"C:\\Thing";
```

# Motivation
[motivation]: #motivation

In the Rust 2021 edition we reserved all prefixes for literals, so we can give them a meaning in the future.
However, many ideas for specific literal prefixes (e.g. for wide strings or bignums) are domain- or crate-specific,
and should arguably not be a builtin part of the language itself.

By making `m!literal` a way to invoke a macro, we get a syntax that's just as convenient and light-weight as built-in prefixes,
but through a mechanism that allows them to be user-defined, without any extra language features necessary to define them.

For example:

- Windows crates could provide wide strings using `w!"C:\\"`
- An arbitrary precision number crate could provide `bignum!12345678901234567890123456789012345678901234567890`.
- Those who want "f-strings" can then simply do `use std::format as f;` and then use `f!"{a} {b}"`.

The difference with `f!("{a} {b}")`, `w!("C:\\")` and `bignum!(123...890)` is small, but significant.
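
For comparison, the bracketed form already works on stable Rust today. The following is a minimal sketch, assuming Rust 1.58 or later for inline format arguments; the `greet` helper is hypothetical and exists only for illustration:

```rust
// Alias the standard `format!` macro to a short name, as suggested above.
use std::format as f;

// Hypothetical helper: builds a message using today's bracketed call syntax.
fn greet(a: i32, b: &str) -> String {
    // Under this proposal the same call could be written as f!"{a} {b}".
    f!("{a} {b}")
}

fn main() {
    println!("{}", greet(1, "two"));
}
```

Only the call site would change under the proposal; the alias and the macro definition stay the same.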

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Macros can be invoked using `m!(..)`, `m![..]`, `m!{..}` or `m!..` syntax.
In the last case, the argument must be a single literal, such as `m!123`, `m!2.1`, `m!"abc"`, or `m!'x'`.
From the perspective of a macro definition, these are all identical, and a macro cannot differentiate between the different call syntaxes.
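
The three existing delimiters already behave this way, which can be checked on stable Rust. The sketch below uses a hypothetical `double!` macro (not part of this RFC) to show that the choice of delimiter is invisible to the macro definition; the proposed `double!21` form is not shown because it does not compile today:

```rust
// Hypothetical macro for illustration: it sees only the tokens between
// the delimiters, never the delimiters themselves.
macro_rules! double {
    ($x:expr) => {
        $x * 2
    };
}

fn main() {
    let a = double!(21);
    let b = double![21];
    let c = double! { 21 };
    // All three invocations expand to the same expression.
    assert_eq!(a, b);
    assert_eq!(b, c);
    println!("{a} {b} {c}");
}
```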

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The macro invocation syntax is changed from

```
MacroInvocation :
   SimplePath ! DelimTokenTree

MacroInvocationSemi :
     SimplePath ! ( TokenTree* ) ;
   | SimplePath ! [ TokenTree* ] ;
   | SimplePath ! { TokenTree* }
```

to

```
MacroInvocation :
     SimplePath ! Literal
Member

@eddyb eddyb May 18, 2022

Is this "Literal" unambiguous enough to distinguish between these two situations?

  • "literal" token, aka proc_macro::Literal (string/character/numeric literals)
  • "literal" (e.g. expression) grammar, aka rustc_ast::Literal aka $lit:literal macro inputs
    • besides what literal tokens support, this also includes false and true
    • also it does more validation (suffixes, presumably string escapes, forcing integers into u128, etc.)

I would assume the former (esp. given the mention of m!identifier later in the RFC, where arguably m!false/m!true would fit), but $lit:literal being the latter muddies the waters a bit, sadly.

This is confusing enough that the reference wrongly mentions false/true on stable under "tokens" (since fixed by rust-lang/reference#1189).

(Thanks to @solson for bringing up the potential ambiguity wrt bool literals - I would've naively assumed it was a non-concern at first)

Member Author

@m-ou-se m-ou-se May 18, 2022

The first one.

$lit:literal doesn't accept very large integers, which is important for use cases like bignum!123. (Playground.)

Allowing json!true and json!false might be reasonable, but then we also need to have a discussion about json!null, js!NaN, py!True, and so on, which would all need the identifier or path grammar. So I'd like to leave that discussion for a future RFC.

Member

I agree, I think we could leave false/true to the identifier case, at most I'd add a note to the RFC that Literal in this case refers to something different from "literal expressions"/$lit:literal syntax.

Can probably link to other parts of the reference, but I'm not sure what exactly is the most relevant (I got confused just now trying to follow it, though a lot of that was looking at the pre-rust-lang/reference#1189 version).

Member

@joshtriplett joshtriplett May 18, 2022

> $lit:literal doesn't accept very large integers, which is important for use cases like bignum!123

I wonder if we could fix that?

Not a blocker for this RFC, but we could allow arbitrary-length integer literals to get fed into macros, and only check if they fit when the resulting token stream from the macro gets parsed.

Member Author

$x:tt accepts arbitrary length integers just fine. (See the playground link above.) It's just that non-tt things like $x:literal no longer represent the raw tokens but instead a portion of already processed/parsed AST.

   | SimplePath ! DelimTokenTree

MacroInvocationSemi :
     SimplePath ! Literal ;
   | SimplePath ! ( TokenTree* ) ;
   | SimplePath ! [ TokenTree* ] ;
   | SimplePath ! { TokenTree* }
```
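
The review discussion on `Literal` notes that `$x:tt` accepts arbitrarily long integer tokens, because a `tt` fragment carries the raw token rather than a parsed literal expression. A minimal sketch of that behaviour with today's bracketed syntax follows; `bignum_digits!` is a hypothetical stand-in that only captures the digits as text, not a real bignum implementation:

```rust
// Hypothetical macro: captures a single token tree and returns its source text.
// The integer is never parsed as a Rust literal expression, so its size
// is not checked against any integer type.
macro_rules! bignum_digits {
    ($x:tt) => {
        stringify!($x)
    };
}

fn main() {
    // 50 digits, far beyond the range of u128.
    let digits = bignum_digits!(12345678901234567890123456789012345678901234567890);
    assert_eq!(digits.len(), 50);
    println!("{digits}");
}
```

A real bignum macro would most likely be a procedural macro that parses the token's text, but the matching rule is the same: the input arrives as a single literal token.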

# Drawbacks
[drawbacks]: #drawbacks

- It allows for confusing syntax like `vec!1` for `vec![1]`.
  - Counter-argument: we already allow `vec!(1)`, `println! { "" }` and `thread_local![]`, which also don't cause any problems.
    (Rustfmt even corrects the first one to use square brackets instead.)

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- Expect those macros to be used with `m!(..)` syntax.
  - That's already possible today, but plenty of people are asking for things
    like `f""` or `w""`, which shows that `f!("")` does not suffice.
- Have a separate mechanism for defining custom prefixes or suffixes.
  - E.g. `10.4_cm`, which is possible in C++ through `operator""`.
  - This requires a separate mechanism, which complicates the language significantly.
- Require macros to declare when they can be called using this syntax.

m-ou-se marked this conversation as resolved.
# Unresolved questions
[unresolved-questions]: #unresolved-questions

- Should we allow `m!b"abc"` and `m!b'x'`? (I think yes.)
- Should we allow `m!r"..."`? (I think yes.)
- Should we allow `m!123i64`? (I think yes.)
- Should we allow `m!-123`? (I'm unsure. Technically the `-` is a separate token. Could be a future addition.)
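
The token boundaries behind these questions can be observed on stable Rust. The sketch below uses a hypothetical `count_tts!` helper (not part of this RFC) that counts the token trees it receives: prefixed and suffixed literals arrive as one token, while `-123` arrives as two.

```rust
// Hypothetical helper: recursively counts the token trees passed to it.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // Byte strings, raw strings and suffixed integers are single literal tokens.
    assert_eq!(count_tts!(b"abc"), 1);
    assert_eq!(count_tts!(b'x'), 1);
    assert_eq!(count_tts!(r"a\b"), 1);
    assert_eq!(count_tts!(123i64), 1);
    // A negative number is a `-` token followed by a literal token.
    assert_eq!(count_tts!(-123), 2);
    println!("token counts match");
}
```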

# Future possibilities
[future-possibilities]: #future-possibilities

In the future, we could consider extending this syntax in a backwards compatible way by allowing
slightly more kinds of arguments to be used without brackets, such as `m!-123` or `m!identifier`, or even `m!|| { .. }` or `m!struct X {}`.
Member

@joshtriplett joshtriplett May 18, 2022

m!identifier seems unambiguous and easy to add, though also less well-motivated.

m!-123 could, for now, at least have a rustfix-applicable suggestion telling the user to use -m!123 instead.

Member Author

@m-ou-se m-ou-se May 18, 2022

m!identifier would possibly make it hard or impossible to allow m!p::a::t::h or m!thing.member, so I figured that might be good to leave for a later discussion. (Not saying that we should allow either of that. Just saying that allowing m!identifier might block other future possibilities.)

Contributor

Honestly, I would put this exact explanation in the RFC text itself, since I had the exact same question and you pretty thoroughly convinced me why we shouldn't right now in a single sentence.

> telling the user to use -m!123 instead.

More likely m!(-123), since it isn't necessarily guaranteed that the macro is commutative with -
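
A small sketch of why the two rewrites can differ, using a hypothetical `sq!` macro (an assumption for illustration, not something from the RFC) that squares its argument:

```rust
// Hypothetical macro: squares its argument. The `expr` fragment is kept
// as a single expression, so `-3` is squared as a whole.
macro_rules! sq {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    let inside: i32 = sq!(-3); // (-3) * (-3)
    let outside: i32 = -sq!(3); // -(3 * 3)
    assert_eq!(inside, 9);
    assert_eq!(outside, -9);
    println!("{inside} {outside}");
}
```

For such a macro, a suggestion to rewrite `m!-123` as `-m!123` would silently change the result, whereas `m!(-123)` keeps the sign inside the macro input.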

(That might be a bad idea, which is why I'm not proposing it as part of this RFC.)