Parsing full html documents #202

Draft
wants to merge 1 commit into
base: master

@dev-ardi dev-ardi commented Jul 19, 2024

Fixes #183

html5ever supports parsing full documents, so let's expose that option.

What are the acceptable tags and attributes that we should support by default?

@notriddle
Member

Instead of implicitly adding a bunch of new tags when switching mode, perhaps add a new constructor method that creates a builder with the flag turned on? Like this?

/// Create a new parser in "document mode",
/// instead of the default fragment mode.
///
/// In addition to the normal set of allowed tags,
/// this also enables `html`, `head`, `title`,
/// and `body`.
pub fn new_as_document() -> Self {
    let mut result = Builder::new();
    result.is_document = true;
    result.add_tags(["html", "head", "title", "body"]);
    result
}

@dev-ardi
Author

dev-ardi commented Jul 19, 2024

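Adding the allowed tags/attributes in the same operation that sets `is_document` may not be a good idea, because `Builder::empty` exists: someone who starts from an empty allowlist and turns on document mode probably doesn't expect extra tags to be allowed implicitly.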
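
To make the trade-off concrete, here is a minimal, std-only sketch of the two shapes being discussed. `SketchBuilder`, its `document` setter, and the three-tag default list are hypothetical stand-ins for illustration only, not ammonia's real `Builder` or its defaults. The idea is that the mode switch touches only the flag, while a separate convenience constructor opts in to the extra tags.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a sanitizer builder (NOT ammonia's `Builder`).
#[derive(Debug, Default)]
struct SketchBuilder {
    tags: HashSet<&'static str>,
    is_document: bool,
}

impl SketchBuilder {
    /// Builder with nothing allowed, in fragment mode.
    fn empty() -> Self {
        Self::default()
    }

    /// Fragment-mode builder with a small, made-up default allowlist.
    fn new() -> Self {
        let mut b = Self::empty();
        b.add_tags(["a", "b", "p"]);
        b
    }

    /// Add tags to the allowlist.
    fn add_tags<I: IntoIterator<Item = &'static str>>(&mut self, tags: I) -> &mut Self {
        self.tags.extend(tags);
        self
    }

    /// Mode switch only: sets the flag and leaves the allowlist untouched,
    /// so it composes predictably with `empty()`.
    fn document(&mut self, value: bool) -> &mut Self {
        self.is_document = value;
        self
    }

    /// Convenience constructor: defaults plus the document-level tags.
    fn new_as_document() -> Self {
        let mut b = Self::new();
        b.document(true).add_tags(["html", "head", "title", "body"]);
        b
    }
}

fn main() {
    // Starting from `empty()`, document mode adds no tags behind the caller's back.
    let mut strict = SketchBuilder::empty();
    strict.document(true);
    assert!(strict.is_document && strict.tags.is_empty());

    // The opt-in constructor allows the document tags on top of the defaults.
    let doc = SketchBuilder::new_as_document();
    assert!(doc.is_document && doc.tags.contains("body"));
}
```

With this split, `empty()` followed by `document(true)` stays empty, and callers who want the document tags get them by asking for `new_as_document()` explicitly.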

Development

Successfully merging this pull request may close these issues.

'html', 'head' and 'body' tags are stripped out even if these are included in the whitelisted tags
2 participants