Add support for order=1 to TrieModel #26

Open: wants to merge 1 commit into base: master

changq commented Nov 20, 2014

This change adds support for ngram-order=1 (tested with TrieModel):

  1. Don't throw when order=1.
  2. Run some of the initialization only when order > 1, to avoid a segfault.
  3. Default RecordReader::remains_ to false, so that a later conditional throw on ngram-order is not triggered.
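
The changes above follow a common guard pattern: layers that only exist for higher orders are set up only when order > 1. A minimal Python sketch of that idea follows. `ToyTrieModel` and its fields are hypothetical names chosen for illustration; this is not KenLM's C++ code and does not reproduce the actual patch.

```python
class ToyTrieModel:
    """Hypothetical toy model; illustrates the order > 1 guard only."""

    def __init__(self, counts):
        # counts[i] is the number of (i+1)-grams, as listed in an ARPA header.
        if not counts:
            raise ValueError("model must have at least one n-gram order")
        self.order = len(counts)
        self.unigrams = [0.0] * counts[0]
        # Higher-order layers are created only when they exist. Without this
        # guard, treating counts[-1] as the "longest" layer would alias the
        # unigram layer when order == 1.
        self.middle = []
        self.longest = None
        if self.order > 1:
            self.middle = [[0.0] * c for c in counts[1:-1]]
            self.longest = [0.0] * counts[-1]


uni = ToyTrieModel([5])        # order 1: no middle or longest layers
tri = ToyTrieModel([5, 4, 3])  # order 3: one middle layer plus the longest
```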

kpu (Owner) commented Nov 22, 2014

Not the most efficient way (the unigrams will have useless trie pointers and backoff). It should really be its own implementation, but Holger Schwenk wants something too... so sure.
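
For comparison, a dedicated unigram model needs only one probability per word, with no trie pointers or backoff weights. The sketch below is an assumption about what such a standalone structure could look like. `UnigramTable` and the `<unk>` token name are hypothetical, and none of this is part of KenLM.

```python
import math
from collections import Counter


class UnigramTable:
    """Hypothetical standalone unigram model: word -> log10 probability."""

    def __init__(self, tokens, unk="<unk>"):
        counts = Counter(tokens)
        counts[unk] += 1  # reserve one count of mass for unseen words
        total = sum(counts.values())
        self.unk = unk
        self.logprob = {w: math.log10(c / total) for w, c in counts.items()}

    def score(self, word):
        # Unseen words fall back to the unk entry; no backoff chain is needed.
        return self.logprob.get(word, self.logprob[self.unk])


table = UnigramTable("a b a c a".split())
```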

changq (Author) commented Nov 26, 2014

Yes, I understand that a separate implementation would be more appropriate in terms of efficiency. For the classifier I am building, I need n-grams from unigrams to 5-grams as features, and it would be wonderful to use KenLM as a unified solution for managing them. So I made the code change and submitted this pull request for your consideration.
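
To make the use case concrete, here is a rough sketch of collecting every n-gram from order 1 to 5 as features. It uses plain counts rather than KenLM scores, and the function name `ngram_features` is hypothetical.

```python
from collections import Counter


def ngram_features(tokens, max_order=5):
    """Count every n-gram of order 1..max_order in a token sequence."""
    feats = Counter()
    for n in range(1, max_order + 1):
        # A sequence of length L contains L - n + 1 n-grams of order n.
        for i in range(len(tokens) - n + 1):
            feats[tuple(tokens[i:i + n])] += 1
    return feats


feats = ngram_features("the cat sat on the mat".split())
```

In a setup like the one described, each order would presumably be scored by a model of that order instead of being counted directly.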

lpcauch commented Aug 2, 2019

Hey @changq,

How do you create an lm.binary after you create your unigram model?
I managed to create the unigram model but can't create the lm.binary (it throws the error "Was expecting n-gram header \1-grams: but got \end\ instead Byte: 209").
I just downloaded your fork and made no modifications, so I don't know what to do.
Any help is welcome!
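
The error message says the reader reached `\end\` where it expected the `\1-grams:` section header of the ARPA file. One possible cause, which this thread does not confirm, is that the header declares an order whose section is missing from the file. The sketch below checks for that condition; the helper name `missing_arpa_sections` is hypothetical and the check is not part of KenLM.

```python
import re


def missing_arpa_sections(lines):
    """Return declared n-gram orders that have no matching section header."""
    declared, found = [], set()
    for raw in lines:
        line = raw.strip()
        m = re.fullmatch(r"ngram (\d+)=(\d+)", line)  # header entry, e.g. "ngram 1=2"
        if m:
            declared.append(int(m.group(1)))
            continue
        m = re.fullmatch(r"\\(\d+)-grams:", line)  # section header, e.g. "\1-grams:"
        if m:
            found.add(int(m.group(1)))
    return [n for n in declared if n not in found]


good = ["\\data\\", "ngram 1=2", "", "\\1-grams:", "-0.3\ta", "-0.3\tb", "", "\\end\\"]
bad = ["\\data\\", "ngram 1=2", "", "\\end\\"]
```

The function accepts any iterable of lines, so an open file handle for the ARPA file can be passed in directly.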
