Language Detection Model

Dataset: https://www.kaggle.com/datasets/basilb2s/language-detection

Steps Followed

1. Load the data set
2. Encode the labels into categorical form
3. Pre-process the text content
4. Tokenize the text
5. Create a dictionary for the vocabulary
6. Count the word frequencies (only unigrams were considered)
7. Split the dataset into train and test sets
8. Perform Supervised Classification
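
Steps 2 to 6 can be sketched roughly as follows. This is a minimal illustration in plain Python, not the repository's actual code: the toy `samples` list stands in for the Kaggle CSV, and the helper names (`encode_labels`, `preprocess`, `tokenize`, `build_vocabulary`, `count_vector`) as well as the exact cleaning rule are assumptions made for the example.

```python
import re
from collections import Counter

# Toy samples standing in for the Kaggle CSV (hypothetical, not taken from the dataset).
samples = [
    ("Hello, how are you today?", "English"),
    ("Bonjour, comment allez-vous?", "French"),
    ("Hola, ¿cómo estás hoy?", "Spanish"),
]

def encode_labels(labels):
    """Step 2: map each distinct label to an integer id."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes

def preprocess(text):
    """Step 3: lowercase, then replace punctuation and digits with spaces."""
    text = text.lower()
    return re.sub(r"[\W\d_]+", " ", text).strip()

def tokenize(text):
    """Step 4: split the cleaned text on whitespace into unigram tokens."""
    return text.split()

def build_vocabulary(token_lists):
    """Step 5: assign each distinct token a column index."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def count_vector(tokens, vocab):
    """Step 6: unigram counts as a list aligned with the vocabulary order."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]

texts, labels = zip(*samples)
y, classes = encode_labels(labels)
token_lists = [tokenize(preprocess(t)) for t in texts]
vocab = build_vocabulary(token_lists)
X = [count_vector(tokens, vocab) for tokens in token_lists]
```

Here `X` is a list of count vectors (one row per sample) and `y` holds the matching integer labels, which is the shape a count-based classifier expects.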

The Naive Bayes classifier achieved 97.3% accuracy.
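
To illustrate steps 7 and 8, here is a small from-scratch sketch of a train/test split and a multinomial Naive Bayes classifier over unigram counts. The source does not say which Naive Bayes variant, smoothing, or library was used, so the add-one smoothing, the `UnigramNaiveBayes` class, and the `train_test_split` and `accuracy` helpers are all assumptions for this example. The toy rows will not reproduce the reported figure.

```python
import math
import random
from collections import Counter, defaultdict

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle rows and split them into train and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) - int(len(rows) * test_ratio)
    return rows[:cut], rows[cut:]

class UnigramNaiveBayes:
    """Multinomial Naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, rows):
        self.class_docs = Counter()              # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tokens, label in rows:
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = sum(self.class_docs.values())
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(model.predict(tokens) == label for tokens, label in rows)
    return correct / len(rows)

# Hypothetical toy rows of (tokens, label); real input would come from the dataset.
rows = [
    ("the cat is on the table".split(), "English"),
    ("where is the station".split(), "English"),
    ("le chat est sur la table".split(), "French"),
    ("où est la gare".split(), "French"),
]
model = UnigramNaiveBayes().fit(rows)
print(model.predict("the station is here".split()))
```

On the real data, the model would be fitted on the training split only and `accuracy` evaluated on the held-out test split.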