Old Church Slavonic POS-tagger

This is a side project that started as a final task for the course "NLP for Low-Resourced and Endangered Languages" at the University of Helsinki. The repository contains a DistilBERT-based POS-tagging transformer model for Old Church Slavonic, served through FastAPI and packaged with Docker. The training data comes from the Old Church Slavonic UD treebank: https://github.com/UniversalDependencies/UD_Old_Church_Slavonic-PROIEL. The model only works with text in the encoding used in this treebank.
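
UD treebanks are distributed as CoNLL-U files, so a tagger's training pairs are the FORM and UPOS columns of each token line. The following is a minimal sketch of how such pairs could be read, not necessarily how this repository does it. The function name `read_conllu` and the two-token `SAMPLE` are hypothetical and written for illustration; the sample is not copied from the treebank.

```python
from typing import Iterable, List, Tuple

# One sentence is a list of (FORM, UPOS) pairs.
Sentence = List[Tuple[str, str]]


def read_conllu(lines: Iterable[str]) -> List[Sentence]:
    """Collect (FORM, UPOS) pairs per sentence from CoNLL-U lines."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# sent_id = ...".
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # CoNLL-U token lines have exactly 10 tab-separated columns.
            continue
        token_id, form, upos = cols[0], cols[1], cols[3]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges (1-2) and empty nodes (1.1).
            continue
        current.append((form, upos))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical two-token sample in CoNLL-U layout (illustrative only).
SAMPLE = "\n".join([
    "# sent_id = example-1",
    "\t".join(["1", "и", "и", "CCONJ", "_", "_", "2", "cc", "_", "_"]),
    "\t".join(["2", "рече", "рещи", "VERB", "_", "_", "0", "root", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for sentence in read_conllu(SAMPLE.splitlines()):
        print(sentence)
```

Reading the forms verbatim, without any normalization, keeps the input in the treebank's own encoding, which is the only encoding the model is stated to support.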

"chu" is the ISO 639-3 code for Old Church Slavonic.