Tokenizer additional fixes and span method #36

Open
wants to merge 319 commits into
base: master
from

kmike
Member

@kmike kmike commented Dec 23, 2016

Rebased version of #20.

tpeng and others added 30 commits October 16, 2013 15:03
also add default tag and features for open hours parser
 e.g. replace the header to more general tag.
 the idea is to have make websturct works on a "generalized" html.
add general inside_tag feature function
 checking by `grep -r -n --color -E '<br>.+</br>' *`
Annotate ie address pages and add nb to train ie address parser
kmike and others added 27 commits November 16, 2016 17:40
(backwards incompatible) switch to sklearn-crfsuite
recent Firefox comments out <base> tags when page is saved
@codecov-io
codecov-io commented Dec 23, 2016

Current coverage is 82.08% (diff: 100%)

Merging #36 into master will increase coverage by 0.06%

@@             master        #36   diff @@
==========================================
  Files            32         32          
  Lines          1607       1613     +6   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           1318       1324     +6   
  Misses          289        289          
  Partials          0          0          

Powered by Codecov. Last update 628c8c2...2be9325

8 participants