Skip to content

digitalfondue/jfiveparse

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

jfiveparse: a java html5 parser

Maven Central Build Status

jfiveparse pass all the non scripted tests for the tokenizer and tree construction from the html5lib-tests suite.

It provide both fragment and full document parsing. It can parse directly from a String or by streaming through a Reader (note: the encoding must be known, currently the parser does not implement an autodetect feature).

Requires java 11.

Javadoc@javadoc.io.

Why?

As far as I know, there is no pure java html5 parser that currently pass the html5lib-tests suite (well, the more relevant tests :D, note: this project was published in october 2015).

Additionally, I wanted a library with a reduced footprint (and no dependencies). Currently the jar weight around ~150kb. The target is to keep it under 200kb.

Performance should be competitive with other java parsers.

License

jfiveparse is licensed under the Apache License Version 2.0.

Download

maven:

<dependency>
    <groupId>ch.digitalfondue.jfiveparse</groupId>
    <artifactId>jfiveparse</artifactId>
    <version>1.1.1</version>
</dependency>

gradle:

compile 'ch.digitalfondue.jfiveparse:jfiveparse:1.1.1'

Use:

If you use it as a module, remember to add requires ch.digitalfondue.jfiveparse; in your mo