GitHub - ayushkr3301/Toxic-comment-classifier: Classifying a text comment as toxic or non-toxic using natural language processing.

Classifying a given text comment as toxic or non-toxic using natural language processing

I obtained the data from a closed Kaggle competition. It consisted of more than a hundred thousand Twitter comments, each already labeled as toxic or non-toxic.

I cleaned the data by removing punctuation, line breaks, stop words, etc., and then split it into training and test sets. Using scikit-learn, I trained the models on the training set and used them to predict labels for the test set. I created seven classifiers: KNeighborsClassifier, DecisionTreeClassifier, RandomForestClassifier, LogisticRegression, SGDClassifier, MultinomialNB and SVC. These were combined in a voting classifier, which reached an accuracy of 94.05% on my test set.
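The steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the TF-IDF features, the hard-voting mode, every hyperparameter, the `clean_comment` helper and the tiny made-up comment list are all assumptions chosen so the example runs on its own. Only the cleaning steps and the seven classifier classes come from the description.

```python
import re
import string

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def clean_comment(text):
    """Hypothetical helper: lowercase, drop line breaks and punctuation."""
    text = text.lower().replace("\n", " ").replace("\r", " ")
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def build_model():
    """Seven classifiers combined by majority vote (assumed settings)."""
    estimators = [
        ("knn", KNeighborsClassifier(n_neighbors=3)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("logreg", LogisticRegression(max_iter=1000)),
        ("sgd", SGDClassifier(random_state=0)),
        ("nb", MultinomialNB()),
        ("svc", SVC(kernel="linear")),
    ]
    # Hard voting is assumed here: each model casts one vote per comment.
    # It also avoids needing predict_proba, which SVC and SGDClassifier
    # do not provide with their default settings.
    return make_pipeline(
        # TF-IDF is an assumption; stop words are removed by the vectorizer.
        TfidfVectorizer(preprocessor=clean_comment, stop_words="english"),
        VotingClassifier(estimators=estimators, voting="hard"),
    )


# Made-up toy data standing in for the Kaggle comments (1 = toxic).
comments = [
    "You are an idiot and nobody likes you",
    "What a stupid, worthless article!",
    "Shut up, you pathetic loser",
    "This is garbage written by a moron",
    "I hate you, go away forever",
    "You are a disgusting, horrible person",
    "Thanks for the helpful explanation",
    "Great article, I learned a lot",
    "Could you add a source for this claim?",
    "I agree with the previous comment",
    "Nice work on the latest update",
    "Please see the talk page for details",
]
labels = [1] * 6 + [0] * 6

X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.25, stratify=labels, random_state=0
)

model = build_model()
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```

On a dozen toy comments the printed accuracy is not meaningful; the point is only the shape of the pipeline. With the real dataset, `comments` and `labels` would be replaced by the cleaned comment column and its toxic/non-toxic labels.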

Files in this repository: README.md, toxic comments classification.ipynb
