RFC 3349 precursors #120329
Commits on Jan 25, 2024
Avoid useless checking in `from_token_lit`.
The parser already does a check-only unescaping pass that catches all errors, so the checking done in `from_token_lit` never hits. Literals that cause warnings can still reach `from_token_lit`, though. This commit therefore changes `str-escape.rs` to use byte string literals and C string literals as well, to give better coverage and ensure the new assertions in `from_token_lit` are correct.
Commit 314dbc7

The `CString` handling code is erroneously identical to the `ByteString` handling code.
Commit 4b4bdb5

Use `from` instead of `into` in unescaping code.
The `T` type in these functions took me some time to understand, and I find the explicit `T` in the use of `from` makes the code easier to read, as does the `u8` annotation in `scan_escape`.
Commit ef1e222

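The readability argument for `from` over `into` can be illustrated with a small generic function. This is a minimal sketch, not the compiler's code: the helper `units` and its signature are hypothetical and exist only to show how `T::from` names the target type where the conversion happens.

```rust
/// Hypothetical helper (not from rustc): converts each `char` of `src`
/// into a caller-chosen unit type `T`.
fn units<T: From<char>>(src: &str) -> Vec<T> {
    // `T::from(c)` states the target type at the conversion site.
    // `c.into()` would also compile, but the reader would have to
    // infer the target type from the return type instead.
    src.chars().map(|c| T::from(c)).collect()
}
```

Calling `units::<u32>("ab")` yields the code points of the two characters, and `units::<String>("ab")` yields one single-character string per input character.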
- Rename it as `MixedUnit`, because it will soon be used in more than just C string literals.
- Change the `Byte` variant to `HighByte` and use it only for `\x80`..`\xff` cases. This fixes the old inexactness where ASCII chars could be encoded with either `Byte` or `Char`.
- Add useful comments.
- Remove `is_ascii`, in favour of `u8::is_ascii`.
Commit a1c0721

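The shape described in this commit message might look roughly like the following. This is a sketch reconstructed from the message alone; the derives, the doc comments, and the two `From` impls are assumptions for illustration and may differ from the definition in the compiler.

```rust
/// Sketch of one unit of a mixed-UTF-8 literal (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MixedUnit {
    /// Any Unicode scalar value, including every ASCII character.
    Char(char),
    /// A byte in the range 0x80..=0xff, as produced by `\x80`..`\xff`.
    HighByte(u8),
}

impl From<char> for MixedUnit {
    fn from(c: char) -> Self {
        MixedUnit::Char(c)
    }
}

impl From<u8> for MixedUnit {
    fn from(b: u8) -> Self {
        // ASCII bytes always become `Char`, so each value has exactly
        // one encoding; only non-ASCII bytes use `HighByte`.
        if b.is_ascii() {
            MixedUnit::Char(b as char)
        } else {
            MixedUnit::HighByte(b)
        }
    }
}
```

With this split, `MixedUnit::from(b'a')` is `Char('a')` and `MixedUnit::from(0x80u8)` is `HighByte(0x80)`, so the same ASCII character can no longer be represented in two ways.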
Rename and invert sense of `Mode` predicates.
I find it easier if they describe what's allowed, rather than what's forbidden. Also, consistent naming makes them easier to understand.
Commit 5e5aa6d

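As an illustration of "describe what's allowed", predicates on a literal mode could be phrased as below. This is a sketch under stated assumptions: the three-variant `Mode` is a simplification, and the predicate names `allow_unicode_escapes` and `allow_high_bytes` are hypothetical choices for this example rather than names confirmed by the commit message. The rules themselves follow the language: `\u{...}` escapes are rejected in byte strings, and `\x80`..`\xff` escapes are rejected in ordinary strings.

```rust
/// Simplified, hypothetical set of literal modes (illustrative only).
#[derive(Clone, Copy, Debug)]
enum Mode {
    Str,
    ByteStr,
    CStr,
}

impl Mode {
    /// Are `\u{...}` escapes allowed in this kind of literal?
    fn allow_unicode_escapes(self) -> bool {
        match self {
            Mode::Str | Mode::CStr => true,
            Mode::ByteStr => false,
        }
    }

    /// Are `\x80`..`\xff` escapes allowed in this kind of literal?
    fn allow_high_bytes(self) -> bool {
        match self {
            Mode::ByteStr | Mode::CStr => true,
            Mode::Str => false,
        }
    }
}
```

Both predicates share the `allow_` prefix and answer a positive question, so a call site such as `if mode.allow_high_bytes() { ... }` reads without a mental negation.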
Rename the unescaping functions.
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string` becomes `unescape_mixed`, because RFC 3349 will mean that C string literals are no longer the only mixed-UTF-8 literals.
Commit 86f371e

Use `unescape_unicode` for raw C string literals.
They can't contain `\x` escapes, which means they can't contain high bytes, which means we can use `unescape_unicode` instead of `unescape_mixed` to unescape them. This avoids unnecessary use of `MixedUnit`.
Commit 6be2e56

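The reasoning about raw literals can be checked with a small example. This sketch deliberately uses a raw `str` literal and a non-raw byte string rather than `cr"..."`, on the assumption that raw C string literals follow the same no-escape rule as other raw literals; the function name `raw_vs_escaped` is hypothetical.

```rust
/// Illustrative only: returns the bytes of a raw literal and of a
/// non-raw literal that both spell `\xff` in the source text.
fn raw_vs_escaped() -> (&'static [u8], &'static [u8]) {
    // Raw literal: no escape processing, so this is the four
    // ASCII characters `\`, `x`, `f`, `f`.
    let raw = r"\xff".as_bytes();
    // Non-raw byte string: the escape is processed into one high byte.
    let escaped = b"\xff" as &[u8];
    (raw, escaped)
}
```

The raw form has length 4 and is pure ASCII, while the escaped form is the single byte 0xff, which is why only non-raw literals need a unit type that can carry high bytes.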