JHazm

A Java version of Hazm (Python library for digesting Persian text)

  • Text cleaning
  • Sentence and word tokenizer
  • Word lemmatizer
  • POS tagger
  • Dependency parser
  • Corpus readers for Hamshahri and Bijankhan

Installation and Usage

To build a single jar file, run:

mvn clean compile assembly:single

To use this project as a library in Maven, first install it into your local repository:

mvn clean install

To run the jar and see the help:

java -jar jhazm-jar-with-dependencies.jar

For example, to run POS tagging on the bundled sample file, use:

java -jar jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt

Or to run on any other file:

java -jar jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt -i input.txt

Or on a piece of text:

java -jar jhazm-jar-with-dependencies.jar -a partOfSpeechTagging -o test.txt -t "سلام من خوب هستم!"
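The commands above handle one input at a time. As a rough sketch, they could be wrapped in a small shell function to tag every .txt file in a directory. It uses only the -a, -o and -i options shown above. The function name tag_all, the JHAZM_JAR and JAVA_CMD variables, and the .tagged.txt output suffix are illustrative assumptions, not part of JHazm.

```shell
#!/bin/sh
# Sketch: POS-tag every .txt file in a directory with the JHazm jar.
# Assumptions (not from JHazm itself): the jar is in the current directory
# unless JHAZM_JAR says otherwise; tag_all and JAVA_CMD are hypothetical names.
JHAZM_JAR="${JHAZM_JAR:-jhazm-jar-with-dependencies.jar}"
JAVA_CMD="${JAVA_CMD:-java}"

tag_all() {
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir" || return 1
    for input in "$in_dir"/*.txt; do
        # Skip the unexpanded pattern when the directory has no .txt files.
        [ -e "$input" ] || continue
        name=$(basename "$input" .txt)
        # Same options as the single-file example: -a action, -o output, -i input.
        "$JAVA_CMD" -jar "$JHAZM_JAR" -a partOfSpeechTagging \
            -o "$out_dir/$name.tagged.txt" -i "$input"
    done
}
```

After sourcing the file, a call such as `tag_all ./texts ./tagged` would write one tagged file per input into ./tagged.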

Good Luck!