Add ElementParser and probably slightly increase performance #754

Merged
merged 9 commits into from
Jun 9, 2024

Conversation

Mingun
Collaborator

@Mingun Mingun commented Jun 8, 2024

This is a follow-up PR to #753. It polishes the API of the PiParser introduced there (which could be used to parse DTDs) and adds a similar parser for elements (which could also be used to parse <!ELEMENT> and <!ATTLIST> DTD tags). This PR also reduces the amount of duplicated code and probably slightly improves performance, as shown by the benchmarks (master is at 385a1f8). However, some tests deviate by >= 5% even between consecutive runs on master, so the speedup is most likely insignificant. The escape_text benchmarks are also quite sensitive to noise, which reaches 15% there.

Full diff:

> critcmp master element-parser
group                                                                  element-parser                         master
-----                                                                  --------------                         ------
NsReader::read_resolved_event_into/trim_text = false                   1.00    398.9±6.30µs        ? ?/sec    1.05    419.6±7.94µs        ? ?/sec
NsReader::read_resolved_event_into/trim_text = true                    1.00    382.1±7.06µs        ? ?/sec    1.06    404.0±7.44µs        ? ?/sec
One event/CData                                                        1.00     56.3±0.97ns        ? ?/sec    1.21     68.1±1.35ns        ? ?/sec
One event/Comment                                                      1.00    141.2±2.52ns        ? ?/sec    1.14    161.4±2.79ns        ? ?/sec
One event/Start                                                        1.00    206.4±3.41ns        ? ?/sec    1.01    209.3±4.15ns        ? ?/sec
attributes/try_get_attribute                                           1.01     94.8±1.64µs        ? ?/sec    1.00     93.9±1.76µs        ? ?/sec
attributes/with_checks = false                                         1.02     53.0±1.02µs        ? ?/sec    1.00     52.2±0.91µs        ? ?/sec
attributes/with_checks = true                                          1.03     88.0±1.66µs        ? ?/sec    1.00     85.8±1.67µs        ? ?/sec
decode_and_parse_document/document.xml                                 1.02     93.2±1.80µs   118.0 MB/sec    1.00     91.7±1.67µs   119.9 MB/sec
decode_and_parse_document/libreoffice_document.fodt                    1.00    345.8±6.24µs   157.9 MB/sec    1.00    345.0±6.41µs   158.3 MB/sec
decode_and_parse_document/linescore.xml                                1.00     22.2±0.43µs   159.1 MB/sec    1.01     22.4±0.42µs   157.8 MB/sec
decode_and_parse_document/players.xml                                  1.00    128.2±2.44µs   113.1 MB/sec    1.01    129.2±2.03µs   112.2 MB/sec
decode_and_parse_document/rpm_filelists.xml                            1.01     69.3±1.21µs   158.4 MB/sec    1.00     68.7±1.28µs   160.0 MB/sec
decode_and_parse_document/rpm_other.xml                                1.03    112.9±2.12µs   196.2 MB/sec    1.00    109.7±2.12µs   201.7 MB/sec
decode_and_parse_document/rpm_primary.xml                              1.01    150.8±2.72µs   134.4 MB/sec    1.00    149.1±3.14µs   136.0 MB/sec
decode_and_parse_document/rpm_primary2.xml                             1.01     49.3±0.99µs   145.5 MB/sec    1.00     48.8±0.90µs   147.1 MB/sec
decode_and_parse_document/sample_1.xml                                 1.01      8.7±0.16µs   126.3 MB/sec    1.00      8.6±0.15µs   127.6 MB/sec
decode_and_parse_document/sample_ns.xml                                1.00      6.4±0.12µs   113.8 MB/sec    1.00      6.4±0.11µs   113.6 MB/sec
decode_and_parse_document/sample_rss.xml                               1.02   610.0±10.55µs   309.2 MB/sec    1.00   599.2±10.85µs   314.7 MB/sec
decode_and_parse_document/test_writer_ident.xml                        1.00     20.9±0.38µs   203.5 MB/sec    1.00     20.8±0.40µs   203.7 MB/sec
decode_and_parse_document_with_namespaces/document.xml                 1.00    140.0±2.47µs    78.5 MB/sec    1.03    144.0±2.80µs    76.3 MB/sec
decode_and_parse_document_with_namespaces/libreoffice_document.fodt    1.00   537.4±10.61µs   101.6 MB/sec    1.03   553.9±10.50µs    98.6 MB/sec
decode_and_parse_document_with_namespaces/linescore.xml                1.00     28.3±0.53µs   124.6 MB/sec    1.02     29.0±0.54µs   121.9 MB/sec
decode_and_parse_document_with_namespaces/players.xml                  1.00    161.9±2.86µs    89.5 MB/sec    1.00    162.2±2.70µs    89.3 MB/sec
decode_and_parse_document_with_namespaces/rpm_filelists.xml            1.00     95.1±1.45µs   115.5 MB/sec    1.07    102.2±1.65µs   107.5 MB/sec
decode_and_parse_document_with_namespaces/rpm_other.xml                1.00    145.7±2.69µs   151.9 MB/sec    1.03    150.3±3.00µs   147.3 MB/sec
decode_and_parse_document_with_namespaces/rpm_primary.xml              1.00    204.7±3.66µs    99.0 MB/sec    1.05    214.1±3.97µs    94.7 MB/sec
decode_and_parse_document_with_namespaces/rpm_primary2.xml             1.00     66.9±1.58µs   107.2 MB/sec    1.05     70.1±1.20µs   102.3 MB/sec
decode_and_parse_document_with_namespaces/sample_1.xml                 1.00     11.5±0.22µs    95.3 MB/sec    1.01     11.7±0.22µs    94.2 MB/sec
decode_and_parse_document_with_namespaces/sample_ns.xml                1.00      9.3±0.16µs    78.0 MB/sec    1.04      9.6±0.18µs    75.3 MB/sec
decode_and_parse_document_with_namespaces/sample_rss.xml               1.00   854.7±13.16µs   220.7 MB/sec    1.03   882.0±14.76µs   213.8 MB/sec
decode_and_parse_document_with_namespaces/test_writer_ident.xml        1.00     31.1±0.59µs   136.4 MB/sec    1.02     31.7±0.55µs   133.7 MB/sec
escape_text/escaped_chars_long                                         1.42  1806.4±34.20ns        ? ?/sec    1.00  1275.0±23.98ns        ? ?/sec
escape_text/escaped_chars_short                                        1.00    491.5±8.35ns        ? ?/sec    1.07   526.6±10.80ns        ? ?/sec
escape_text/no_chars_to_escape_long                                    2.06  1831.1±36.31ns        ? ?/sec    1.00   887.1±17.00ns        ? ?/sec
escape_text/no_chars_to_escape_short                                   1.02     16.7±0.30ns        ? ?/sec    1.00     16.4±0.31ns        ? ?/sec
parse_document_nocopy/document.xml                                     1.00     87.4±1.66µs   125.8 MB/sec    1.01     87.8±1.76µs   125.1 MB/sec
parse_document_nocopy/libreoffice_document.fodt                        1.00    325.5±5.91µs   167.7 MB/sec    1.01    329.7±5.93µs   165.6 MB/sec
parse_document_nocopy/linescore.xml                                    1.00     21.6±0.34µs   163.6 MB/sec    1.00     21.5±0.38µs   163.9 MB/sec
parse_document_nocopy/players.xml                                      1.00    127.0±2.00µs   114.1 MB/sec    1.00    126.5±2.28µs   114.6 MB/sec
parse_document_nocopy/rpm_filelists.xml                                1.00     63.7±1.14µs   172.5 MB/sec    1.03     65.6±1.14µs   167.5 MB/sec
parse_document_nocopy/rpm_other.xml                                    1.00    106.5±2.01µs   207.9 MB/sec    1.00    107.0±1.91µs   206.9 MB/sec
parse_document_nocopy/rpm_primary.xml                                  1.00    140.0±2.44µs   144.8 MB/sec    1.03    144.1±2.66µs   140.6 MB/sec
parse_document_nocopy/rpm_primary2.xml                                 1.00     45.5±0.89µs   157.6 MB/sec    1.01     46.1±0.80µs   155.5 MB/sec
parse_document_nocopy/sample_1.xml                                     1.00      7.8±0.13µs   140.6 MB/sec    1.03      8.0±0.13µs   137.1 MB/sec
parse_document_nocopy/sample_ns.xml                                    1.00      5.6±0.10µs   129.0 MB/sec    1.01      5.7±0.10µs   127.6 MB/sec
parse_document_nocopy/sample_rss.xml                                   1.00   555.7±10.46µs   339.4 MB/sec    1.03   573.1±10.46µs   329.1 MB/sec
parse_document_nocopy/test_writer_ident.xml                            1.00     19.4±0.32µs   218.8 MB/sec    1.00     19.4±0.35µs   218.7 MB/sec
parse_document_nocopy_with_namespaces/document.xml                     1.00    133.0±2.54µs    82.7 MB/sec    1.05    139.4±2.32µs    78.8 MB/sec
parse_document_nocopy_with_namespaces/libreoffice_document.fodt        1.00    507.2±8.56µs   107.6 MB/sec    1.08   546.2±10.20µs   100.0 MB/sec
parse_document_nocopy_with_namespaces/linescore.xml                    1.00     27.6±0.45µs   127.8 MB/sec    1.03     28.3±0.60µs   124.6 MB/sec
parse_document_nocopy_with_namespaces/players.xml                      1.00    158.0±2.95µs    91.8 MB/sec    1.02    160.6±2.72µs    90.3 MB/sec
parse_document_nocopy_with_namespaces/rpm_filelists.xml                1.00     87.2±1.64µs   126.0 MB/sec    1.14     99.2±1.74µs   110.7 MB/sec
parse_document_nocopy_with_namespaces/rpm_other.xml                    1.00    139.6±2.83µs   158.5 MB/sec    1.07    148.7±2.71µs   148.9 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary.xml                  1.00    190.5±3.43µs   106.4 MB/sec    1.09    207.9±3.79µs    97.5 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary2.xml                 1.00     61.7±1.10µs   116.2 MB/sec    1.09     67.5±1.28µs   106.2 MB/sec
parse_document_nocopy_with_namespaces/sample_1.xml                     1.00     10.5±0.20µs   105.0 MB/sec    1.06     11.1±0.21µs    99.3 MB/sec
parse_document_nocopy_with_namespaces/sample_ns.xml                    1.00      8.4±0.16µs    86.5 MB/sec    1.08      9.0±0.18µs    80.0 MB/sec
parse_document_nocopy_with_namespaces/sample_rss.xml                   1.00   786.4±13.46µs   239.8 MB/sec    1.09   859.9±12.82µs   219.3 MB/sec
parse_document_nocopy_with_namespaces/test_writer_ident.xml            1.00     29.0±0.55µs   146.4 MB/sec    1.06     30.8±0.55µs   138.0 MB/sec
read_event/trim_text = false                                           1.00    199.3±3.59µs        ? ?/sec    1.10    218.5±3.98µs        ? ?/sec
read_event/trim_text = true                                            1.00    190.4±3.76µs        ? ?/sec    1.11    211.7±4.11µs        ? ?/sec
unescape_text/char_reference                                           1.01    270.4±5.27ns        ? ?/sec    1.00    268.7±5.46ns        ? ?/sec
unescape_text/entity_reference                                         1.00    399.6±7.31ns        ? ?/sec    1.04    417.4±7.35ns        ? ?/sec
unescape_text/mixed                                                    1.00    364.4±7.30ns        ? ?/sec    1.01    367.5±7.24ns        ? ?/sec
unescape_text/no_chars_to_unescape_long                                1.00     71.1±1.33ns        ? ?/sec    1.00     70.9±1.37ns        ? ?/sec
unescape_text/no_chars_to_unescape_short                               1.00     11.8±0.21ns        ? ?/sec    1.06     12.4±0.23ns        ? ?/sec

Only >=5% diff:

> critcmp master element-parser -t 5
group                                                              element-parser                         master
-----                                                              --------------                         ------
NsReader::read_resolved_event_into/trim_text = false               1.00    398.9±6.30µs        ? ?/sec    1.05    419.6±7.94µs        ? ?/sec
NsReader::read_resolved_event_into/trim_text = true                1.00    382.1±7.06µs        ? ?/sec    1.06    404.0±7.44µs        ? ?/sec
One event/CData                                                    1.00     56.3±0.97ns        ? ?/sec    1.21     68.1±1.35ns        ? ?/sec
One event/Comment                                                  1.00    141.2±2.52ns        ? ?/sec    1.14    161.4±2.79ns        ? ?/sec
decode_and_parse_document_with_namespaces/rpm_filelists.xml        1.00     95.1±1.45µs   115.5 MB/sec    1.07    102.2±1.65µs   107.5 MB/sec
escape_text/escaped_chars_long                                     1.42  1806.4±34.20ns        ? ?/sec    1.00  1275.0±23.98ns        ? ?/sec
escape_text/escaped_chars_short                                    1.00    491.5±8.35ns        ? ?/sec    1.07   526.6±10.80ns        ? ?/sec
escape_text/no_chars_to_escape_long                                2.06  1831.1±36.31ns        ? ?/sec    1.00   887.1±17.00ns        ? ?/sec
parse_document_nocopy_with_namespaces/libreoffice_document.fodt    1.00    507.2±8.56µs   107.6 MB/sec    1.08   546.2±10.20µs   100.0 MB/sec
parse_document_nocopy_with_namespaces/rpm_filelists.xml            1.00     87.2±1.64µs   126.0 MB/sec    1.14     99.2±1.74µs   110.7 MB/sec
parse_document_nocopy_with_namespaces/rpm_other.xml                1.00    139.6±2.83µs   158.5 MB/sec    1.07    148.7±2.71µs   148.9 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary.xml              1.00    190.5±3.43µs   106.4 MB/sec    1.09    207.9±3.79µs    97.5 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary2.xml             1.00     61.7±1.10µs   116.2 MB/sec    1.09     67.5±1.28µs   106.2 MB/sec
parse_document_nocopy_with_namespaces/sample_1.xml                 1.00     10.5±0.20µs   105.0 MB/sec    1.06     11.1±0.21µs    99.3 MB/sec
parse_document_nocopy_with_namespaces/sample_ns.xml                1.00      8.4±0.16µs    86.5 MB/sec    1.08      9.0±0.18µs    80.0 MB/sec
parse_document_nocopy_with_namespaces/sample_rss.xml               1.00   786.4±13.46µs   239.8 MB/sec    1.09   859.9±12.82µs   219.3 MB/sec
parse_document_nocopy_with_namespaces/test_writer_ident.xml        1.00     29.0±0.55µs   146.4 MB/sec    1.06     30.8±0.55µs   138.0 MB/sec
read_event/trim_text = false                                       1.00    199.3±3.59µs        ? ?/sec    1.10    218.5±3.98µs        ? ?/sec
read_event/trim_text = true                                        1.00    190.4±3.76µs        ? ?/sec    1.11    211.7±4.11µs        ? ?/sec
unescape_text/no_chars_to_unescape_short                           1.00     11.8±0.21ns        ? ?/sec    1.06     12.4±0.23ns        ? ?/sec

escape_text results fluctuate even on master.

This is the last PR before release 0.32.0.

@Mingun Mingun assigned dralley and unassigned dralley Jun 8, 2024
@Mingun Mingun requested a review from dralley June 8, 2024 11:31
@codecov-commenter

codecov-commenter commented Jun 8, 2024

Codecov Report

Attention: Patch coverage is 96.33028% with 4 lines in your changes missing coverage. Please review.

Project coverage is 61.83%. Comparing base (5d76174) to head (86bc3cd).
Report is 27 commits behind head on master.

Files Patch % Lines
src/reader/buffered_reader.rs 86.20% 4 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #754      +/-   ##
==========================================
+ Coverage   61.24%   61.83%   +0.58%     
==========================================
  Files          39       41       +2     
  Lines       16277    16720     +443     
==========================================
+ Hits         9969    10338     +369     
- Misses       6308     6382      +74     
Flag Coverage Δ
unittests 61.83% <96.33%> (+0.58%) ⬆️

@Mingun Mingun force-pushed the element-parser branch 2 times, most recently from 0b00592 to 5a03d13 Compare June 8, 2024 12:21
src/reader/pi.rs (outdated)
/// Returns a slice of data read up to end of processing instruction (`>`),
/// which does not include into result (`?` at the end included).
/// Returns a slice of data read up to end of a chunk, which does not include
/// into result.
Collaborator

replace with

Returns a slice of data read up to the end of a chunk, which is not included in the result

But it also now sounds a bit overgeneralized. It's not clear what constitutes "the end of a chunk" - that depends on the type of parser obviously, but it could be clearer.

Maybe instead of "the end of a chunk" say "the end of the {object / article / a synonym that isn't element or entity since they already mean something in XML} being parsed"

I'm just giving suggestions, feel free to adjust if you see fit

///
/// If input (`Self`) is exhausted and nothing was read, returns `None`.
/// If input (`Self`) is exhausted and nothing was read, returns `SyntaxError`.
Collaborator

"nothing was read" is likewise not very specific without the context that was removed. Whichever thing this parser was trying to read, was not read completely.

@@ -275,8 +275,8 @@ impl<'a> XmlSource<'a, ()> for &'a [u8] {
}
}

fn read_pi(&mut self, _buf: (), position: &mut usize) -> Result<&'a [u8]> {
let mut parser = PiParser::default();
fn read<P: Parser>(&mut self, _buf: (), position: &mut usize) -> Result<&'a [u8]> {
Collaborator

@dralley dralley Jun 8, 2024

read<P: Parser>() is a very general name, perhaps overly so. How will this type of pattern work with TextEvents? Do you plan on a TextParser? If it only applies to a few specific circumstances, then read() is an awkward name for it.

Collaborator Author

I do not know yet whether there will be a TextParser or not. I use "parser" structs to store the state required to correctly determine where the corresponding event ends. Right now, parsing text needs no state: the first < starts markup. This will probably change, because entity references (&something;) can insert XML nodes into the text (that is why the specification calls them "parsed entities": the result of their resolution is text that is parsed into XML nodes), and it seems that for correct parsing we should resolve them immediately when they are encountered.

For now this is my experiment in modularizing the parser so that parts of it can be reused in DTD parsing. So the selected name could be a bit random, although I think I can name it read_with, which would read even better grammatically.
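
To illustrate the kind of state such a "parser" struct might carry, here is a minimal sketch for finding the end of a start tag. It is not the code from this PR: the name `TagEndFinder` and its variants are hypothetical, and the only assumption taken from XML itself is that a `>` inside a quoted attribute value does not close the tag. The `feed` method follows the convention discussed in this thread of returning the position of the found byte, or `None` when more data is needed.

```rust
/// Hypothetical sketch (not quick-xml's actual type): the state needed to
/// find the `>` that closes a tag when input arrives in chunks.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
enum TagEndFinder {
    /// Not inside an attribute value; the next `>` ends the tag.
    #[default]
    Outside,
    /// Inside a `'...'` attribute value; `>` is ordinary data here.
    SingleQuoted,
    /// Inside a `"..."` attribute value; `>` is ordinary data here.
    DoubleQuoted,
}

impl TagEndFinder {
    /// Scans `bytes` and returns the index of the closing `>`, or `None` if
    /// the chunk ended first. The quote state is kept in `self`, so the
    /// search can continue with the next chunk.
    fn feed(&mut self, bytes: &[u8]) -> Option<usize> {
        for (i, &b) in bytes.iter().enumerate() {
            *self = match (*self, b) {
                (Self::Outside, b'>') => return Some(i),
                (Self::Outside, b'\'') => Self::SingleQuoted,
                (Self::Outside, b'"') => Self::DoubleQuoted,
                // A matching quote closes the attribute value.
                (Self::SingleQuoted, b'\'') | (Self::DoubleQuoted, b'"') => Self::Outside,
                // Any other byte leaves the state unchanged.
                (state, _) => state,
            };
        }
        None
    }
}
```

Text needs no such struct today because, as noted above, the first `<` always ends it; a tag does need one because the chunk boundary can fall inside a quoted value.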

/// After successful search the parser will return [`Some`] with position of
/// found symbol. If search is unsuccessful, a [`None`] will be returned. You
/// typically would expect positive result of search, so that you should feed
/// new data until yo'll get it.
Collaborator

Suggested change
/// new data until yo'll get it.
/// new data until you get it.

I quite like the idea of separating parsing from IO to the greatest extent possible, perhaps it will allow simplifying the duplication (and macros) caused by the async / sync split.

Collaborator Author

Yes, those were my thoughts too. I have also implemented CommentParser and CDataParser, which look for --> and ]]> respectively (you can see them in #690), and I am now trying to integrate them into the parsing loop without a performance regression if possible (I found this hard to do).
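
As a rough illustration of separating the search from I/O, the sketch below pairs a stateful end-finder with a driver that feeds it chunk by chunk. All names here (`PiEndFinder`, `read_pi_chunks`, the string error) are hypothetical and chosen for the example; the real types in this PR may differ. It assumes only that a processing instruction ends at `?>` and that running out of input before that point is an error.

```rust
/// Hypothetical sketch of a processing-instruction end finder.
/// The flag records whether the previously seen byte was `?`, which matters
/// when `?` and `>` land in different chunks.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct PiEndFinder {
    last_was_question: bool,
}

impl PiEndFinder {
    /// Returns the index of the `>` that completes `?>`, or `None` if the
    /// chunk ended before it was found.
    fn feed(&mut self, bytes: &[u8]) -> Option<usize> {
        for (i, &b) in bytes.iter().enumerate() {
            if b == b'>' && self.last_was_question {
                return Some(i);
            }
            self.last_was_question = b == b'?';
        }
        None
    }
}

/// Drives the finder over chunks, as an I/O layer might do. Returns the bytes
/// before the closing `>` (the trailing `?` is included), or an error if the
/// input is exhausted first.
fn read_pi_chunks<'a, I>(chunks: I) -> Result<Vec<u8>, &'static str>
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut finder = PiEndFinder::default();
    let mut content = Vec::new();
    for chunk in chunks {
        match finder.feed(chunk) {
            Some(i) => {
                content.extend_from_slice(&chunk[..i]);
                return Ok(content);
            }
            // Not found yet: keep the whole chunk and ask for more data.
            None => content.extend_from_slice(chunk),
        }
    }
    Err("unclosed processing instruction")
}
```

Because the finder never touches a reader, the same `feed` logic could in principle sit behind both a sync and an async driver, which is the deduplication hoped for above.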

…ctions

The parser searches for the end of the processing instruction, and this is its last byte
Return an error from the read_pi function instead of returning a flag and later converting it to an error
(Review in whitespace changes ignored mode)
(Review in whitespace changes ignored mode)
@Mingun
Collaborator Author

Mingun commented Jun 9, 2024

@dralley, I hope I addressed all your comments, please look again. I also realized that requiring Parser: Default is an unnecessary restriction and removed it.

@Mingun Mingun requested a review from dralley June 9, 2024 06:12
They are identical except for the type of parser used.
Related: tafia#678

All methods are called only once or twice, and inlining them in most cases increases performance
of our benchmarks:

> critcmp master element-parser -t 5
group                                                              element-parser                         master
-----                                                              --------------                         ------
NsReader::read_resolved_event_into/trim_text = false               1.00    398.9±6.30µs        ? ?/sec    1.05    419.6±7.94µs        ? ?/sec
NsReader::read_resolved_event_into/trim_text = true                1.00    382.1±7.06µs        ? ?/sec    1.06    404.0±7.44µs        ? ?/sec
One event/CData                                                    1.00     56.3±0.97ns        ? ?/sec    1.21     68.1±1.35ns        ? ?/sec
One event/Comment                                                  1.00    141.2±2.52ns        ? ?/sec    1.14    161.4±2.79ns        ? ?/sec
decode_and_parse_document_with_namespaces/rpm_filelists.xml        1.00     95.1±1.45µs   115.5 MB/sec    1.07    102.2±1.65µs   107.5 MB/sec
escape_text/escaped_chars_long                                     1.42  1806.4±34.20ns        ? ?/sec    1.00  1275.0±23.98ns        ? ?/sec
escape_text/escaped_chars_short                                    1.00    491.5±8.35ns        ? ?/sec    1.07   526.6±10.80ns        ? ?/sec
escape_text/no_chars_to_escape_long                                2.06  1831.1±36.31ns        ? ?/sec    1.00   887.1±17.00ns        ? ?/sec
parse_document_nocopy_with_namespaces/libreoffice_document.fodt    1.00    507.2±8.56µs   107.6 MB/sec    1.08   546.2±10.20µs   100.0 MB/sec
parse_document_nocopy_with_namespaces/rpm_filelists.xml            1.00     87.2±1.64µs   126.0 MB/sec    1.14     99.2±1.74µs   110.7 MB/sec
parse_document_nocopy_with_namespaces/rpm_other.xml                1.00    139.6±2.83µs   158.5 MB/sec    1.07    148.7±2.71µs   148.9 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary.xml              1.00    190.5±3.43µs   106.4 MB/sec    1.09    207.9±3.79µs    97.5 MB/sec
parse_document_nocopy_with_namespaces/rpm_primary2.xml             1.00     61.7±1.10µs   116.2 MB/sec    1.09     67.5±1.28µs   106.2 MB/sec
parse_document_nocopy_with_namespaces/sample_1.xml                 1.00     10.5±0.20µs   105.0 MB/sec    1.06     11.1±0.21µs    99.3 MB/sec
parse_document_nocopy_with_namespaces/sample_ns.xml                1.00      8.4±0.16µs    86.5 MB/sec    1.08      9.0±0.18µs    80.0 MB/sec
parse_document_nocopy_with_namespaces/sample_rss.xml               1.00   786.4±13.46µs   239.8 MB/sec    1.09   859.9±12.82µs   219.3 MB/sec
parse_document_nocopy_with_namespaces/test_writer_ident.xml        1.00     29.0±0.55µs   146.4 MB/sec    1.06     30.8±0.55µs   138.0 MB/sec
read_event/trim_text = false                                       1.00    199.3±3.59µs        ? ?/sec    1.10    218.5±3.98µs        ? ?/sec
read_event/trim_text = true                                        1.00    190.4±3.76µs        ? ?/sec    1.11    211.7±4.11µs        ? ?/sec
unescape_text/no_chars_to_unescape_short                           1.00     11.8±0.21ns        ? ?/sec    1.06     12.4±0.23ns        ? ?/sec
@dralley dralley merged commit e6f7be4 into tafia:master Jun 9, 2024
6 checks passed
@Mingun Mingun deleted the element-parser branch June 9, 2024 18:28