Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce unescape functions to operate over UTF-8 and fix serde bugs #421

Merged
merged 8 commits into from
Jul 15, 2022

Conversation

Mingun
Copy link
Collaborator

@Mingun Mingun commented Jul 15, 2022

They are always operate over UTF-8 input, but that was not reflected in types.

This PR also fixes a bug in the serde deserializer: the order of unescape and decode operations was swapped. Although we are still going to abandon decoding on demand, it is still need to mention this fix in the changelog.

Also, because of new encoding policy, deserializing of bytes content using serde now disallowed. This is anyway works incorrectly, because XML is not able to store binary data, they are always should be encoded in some way. Now this is enforced by deserializer and allows us to decode all input at once ans work with UTF-8 representation internally.

…oding" feature is not enabled

When feature is not enabled, content is always UTF-8 encoded and it is safe to call that functions
All binary data in XML stored in string encoding, using HEX, Base64 and similar encodings

Test `deserialize_bytes` removed as it is already covered by trivial::struct_::text::byte_buf test
Only deserialization of externally tagged enums is affected, because only there
`escaped` is set to `true`.

I do not add tests for that, because enum deserialization will be reviewed
in the near future where tests will be added
@Mingun Mingun added bug enhancement serde Issues related to mapping from Rust types to XML encoding Issues related to support of various encodings of the XML documents labels Jul 15, 2022
@Mingun Mingun requested a review from dralley July 15, 2022 18:54
@codecov-commenter
Copy link

Codecov Report

Merging #421 (7578119) into master (72e11d1) will decrease coverage by 0.15%.
The diff coverage is 63.08%.

@@            Coverage Diff             @@
##           master     #421      +/-   ##
==========================================
- Coverage   49.65%   49.49%   -0.16%     
==========================================
  Files          22       22              
  Lines       13854    13856       +2     
==========================================
- Hits         6879     6858      -21     
- Misses       6975     6998      +23     
Flag Coverage Δ
unittests 49.49% <63.08%> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
benches/macrobenches.rs 0.00% <0.00%> (ø)
benches/microbenches.rs 0.00% <0.00%> (ø)
examples/custom_entities.rs 0.00% <0.00%> (ø)
src/events/attributes.rs 94.09% <0.00%> (-0.33%) ⬇️
src/lib.rs 12.33% <0.00%> (+0.06%) ⬆️
src/reader.rs 60.13% <0.00%> (ø)
src/de/escape.rs 21.05% <33.33%> (-2.44%) ⬇️
src/events/mod.rs 68.66% <61.53%> (-3.63%) ⬇️
src/de/mod.rs 77.77% <80.00%> (+0.47%) ⬆️
src/de/map.rs 72.55% <100.00%> (-0.30%) ⬇️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 72e11d1...7578119. Read the comment docs.

@dralley dralley merged commit d34b779 into tafia:master Jul 15, 2022
@Mingun Mingun deleted the unescape branch July 16, 2022 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug encoding Issues related to support of various encodings of the XML documents enhancement serde Issues related to mapping from Rust types to XML
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants