Fix wrong disregarding of not closed markup, such as lone < #679

Merged on Nov 15, 2023 (6 commits)
3 changes: 3 additions & 0 deletions Changelog.md
@@ -23,6 +23,8 @@ configuration is serializable.

### Bug Fixes

+- [#622]: Fix wrong disregarding of not closed markup, such as lone `<`.
+
### Misc Changes

- [#675]: Minimum supported version of serde raised to 1.0.139
@@ -36,6 +38,7 @@ configuration is serializable.
- `Error::UnexpectedToken` replaced by `IllFormedError::DoubleHyphenInComment`

[#513]: https://github.com/tafia/quick-xml/issues/513
+[#622]: https://github.com/tafia/quick-xml/issues/622
[#675]: https://github.com/tafia/quick-xml/pull/675
[#677]: https://github.com/tafia/quick-xml/pull/677

2 changes: 1 addition & 1 deletion src/reader/async_tokio.rs
@@ -4,13 +4,13 @@

use tokio::io::{self, AsyncBufRead, AsyncBufReadExt};

+use crate::errors::{Error, Result, SyntaxError};
use crate::events::Event;
use crate::name::{QName, ResolveResult};
use crate::reader::buffered_reader::impl_buffered_source;
use crate::reader::{
is_whitespace, BangType, NsReader, ParseState, ReadElementState, Reader, Span,
};
-use crate::{Error, Result};

/// A struct for read XML asynchronously from an [`AsyncBufRead`].
///
32 changes: 10 additions & 22 deletions src/reader/buffered_reader.rs
@@ -7,7 +7,7 @@ use std::path::Path;

use memchr;

-use crate::errors::{Error, Result};
+use crate::errors::{Error, Result, SyntaxError};
use crate::events::Event;
use crate::name::QName;
use crate::reader::{is_whitespace, BangType, ReadElementState, Reader, Span, XmlSource};
@@ -54,7 +54,7 @@ macro_rules! impl_buffered_source {
byte: u8,
buf: &'b mut Vec<u8>,
position: &mut usize,
-) -> Result<Option<&'b [u8]>> {
+) -> Result<(&'b [u8], bool)> {
// search byte must be within the ascii range
debug_assert!(byte.is_ascii());

@@ -90,18 +90,14 @@ macro_rules! impl_buffered_source {
}
*position += read;

-if read == 0 {
-Ok(None)
-} else {
-Ok(Some(&buf[start..]))
-}
+Ok((&buf[start..], done))

Review comment (Collaborator):

`done` ought to be renamed to `found`.

Reply (Collaborator, Author):

Actually, all of that code will be removed soon because I am rewriting it, so there is no need to polish it. These changes were made only because I plan to write tests first, to confirm that the rewrite does not break anything.

}
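
The new `(&[u8], bool)` return shape of `read_bytes_until` can be illustrated outside the macro. The following is only a minimal sketch, not the crate's implementation: it works on an in-memory slice instead of a buffered reader, the `input` parameter is hypothetical, and the flag uses the name `found` suggested in the review comment. Consuming the delimiter without returning it is an assumption of this sketch.

```rust
/// Hypothetical, slice-based stand-in for `read_bytes_until`.
///
/// Returns the bytes that precede `byte` (the delimiter itself is excluded)
/// together with a flag telling whether the delimiter was found before the
/// input ended.
fn read_bytes_until<'a>(input: &'a [u8], byte: u8, position: &mut usize) -> (&'a [u8], bool) {
    // The search byte must be within the ASCII range, as in the diff.
    debug_assert!(byte.is_ascii());

    match input.iter().position(|&b| b == byte) {
        Some(i) => {
            // Assumption of this sketch: the delimiter is consumed,
            // so the position advances past it.
            *position += i + 1;
            (&input[..i], true)
        }
        None => {
            // End of input reached without seeing the delimiter:
            // everything is returned and the caller learns that via `false`.
            *position += input.len();
            (input, false)
        }
    }
}
```

With this shape the caller can tell "text followed by `<`" apart from "text followed by end of input", which the former `Option` return could not express.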

$($async)? fn read_bang_element $(<$lf>)? (
&mut self,
buf: &'b mut Vec<u8>,
position: &mut usize,
-) -> Result<Option<(BangType, &'b [u8])>> {
+) -> Result<(BangType, &'b [u8])> {
// Peeked one bang ('!') before being called, so it's guaranteed to
// start with it.
let start = buf.len();
@@ -115,7 +111,7 @@ macro_rules! impl_buffered_source {
match self $(.$reader)? .fill_buf() $(.$await)? {
// Note: Do not update position, so the error points to
// somewhere sane rather than at the EOF
-Ok(n) if n.is_empty() => return Err(bang_type.to_err()),
+Ok(n) if n.is_empty() => break,
Ok(available) => {
// We only parse from start because we don't want to consider
// whatever is in the buffer before the bang element
@@ -126,7 +122,7 @@ macro_rules! impl_buffered_source {
read += used;

*position += read;
-break;
+return Ok((bang_type, &buf[start..]));
} else {
buf.extend_from_slice(available);

@@ -143,19 +139,15 @@ macro_rules! impl_buffered_source {
}
}

-if read == 0 {
-Ok(None)
-} else {
-Ok(Some((bang_type, &buf[start..])))
-}
+Err(bang_type.to_err())
}

#[inline]
$($async)? fn read_element $(<$lf>)? (
&mut self,
buf: &'b mut Vec<u8>,
position: &mut usize,
-) -> Result<Option<&'b [u8]>> {
+) -> Result<&'b [u8]> {
let mut state = ReadElementState::Elem;
let mut read = 0;

@@ -172,7 +164,7 @@ macro_rules! impl_buffered_source {

// Position now just after the `>` symbol
*position += read;
-break;
+return Ok(&buf[start..]);
} else {
// The `>` symbol not yet found, continue reading
buf.extend_from_slice(available);
@@ -190,11 +182,7 @@ macro_rules! impl_buffered_source {
};
}

-if read == 0 {
-Ok(None)
-} else {
-Ok(Some(&buf[start..]))
-}
+Err(Error::Syntax(SyntaxError::UnclosedTag))
}
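
The control-flow change in `read_element` (return from inside the loop on success, fall through to an error when the input ends) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from the macro: it scans an in-memory slice, the `SyntaxError` enum here is a local stand-in with a single variant, and quote tracking replaces the real `ReadElementState` machine. Leaving `position` untouched on error is a choice of this sketch.

```rust
/// Local stand-in for the crate's error type; only the variant used in the
/// diff is modeled here.
#[derive(Debug, PartialEq)]
enum SyntaxError {
    UnclosedTag,
}

/// Hypothetical, slice-based stand-in for `read_element`, called after the
/// opening `<` has already been consumed.
///
/// Returns the tag content up to (but excluding) the closing `>`, or
/// `UnclosedTag` if the input ends first. A `>` inside a quoted attribute
/// value does not close the tag.
fn read_element<'a>(input: &'a [u8], position: &mut usize) -> Result<&'a [u8], SyntaxError> {
    // `Some(q)` while inside an attribute value delimited by quote byte `q`.
    let mut quote: Option<u8> = None;

    for (i, &b) in input.iter().enumerate() {
        match (quote, b) {
            (None, b'>') => {
                // Position now just after the `>` symbol.
                *position += i + 1;
                return Ok(&input[..i]);
            }
            (None, b'"') | (None, b'\'') => quote = Some(b),
            (Some(q), _) if q == b => quote = None,
            _ => {}
        }
    }

    // The input ended without a closing `>`: report it instead of silently
    // treating the markup as absent.
    Err(SyntaxError::UnclosedTag)
}
```

Called on `a href='>'>rest`, the sketch returns `a href='>'` and advances the position by 11; called on an empty slice, which is what remains after a lone `<`, it returns `Err(SyntaxError::UnclosedTag)` rather than reporting that nothing was read.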

$($async)? fn skip_whitespace(&mut self, position: &mut usize) -> Result<()> {