Sentiment analysis of patients' blogs from various online forums, classifying them into three categories: exist (neutral sentiment), deteriorate (negative sentiment) and recover (positive sentiment). Classification uses a Naive Bayes probabilistic classifier. The code is written in R.

aparna0522/Sentiment-Analysis-Using-Text-Mining

Sentiment-Analysis-Using-Text-Mining

Brief description of the project

Sentiment analysis of patients' blogs from various online forums, classifying each blog into one of three categories:

Disease Exists (Neutral Sentiment)
Health Deteriorating (Negative Sentiment)
Health Recovering (Positive Sentiment)

Classification uses a Naive Bayes probabilistic classifier. The final aim is to measure accuracy for different splits of the data into training and testing sets. The code is written in R.
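To make the classification step concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with Laplace smoothing, written in base R only. It is not the repository's code, which may well rely on a package implementation. The function names (`tokenize`, `train_nb`, `predict_nb`) and the toy blogs are assumptions made up for illustration.

```r
# Minimal multinomial Naive Bayes with Laplace smoothing (base R only).
# All names and the toy data are illustrative, not taken from the repository.

# Lower-case a text and split it into alphabetic words.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}

train_nb <- function(texts, labels) {
  labels  <- as.character(labels)
  classes <- sort(unique(labels))
  tokens  <- lapply(texts, tokenize)
  vocab   <- sort(unique(unlist(tokens)))
  # counts[class, word] starts at 1 for every cell (Laplace smoothing).
  counts <- matrix(1, nrow = length(classes), ncol = length(vocab),
                   dimnames = list(classes, vocab))
  for (i in seq_along(texts)) {
    tab <- table(tokens[[i]])
    if (length(tab) > 0) {
      counts[labels[i], names(tab)] <-
        counts[labels[i], names(tab)] + as.numeric(tab)
    }
  }
  prior <- table(factor(labels, levels = classes)) / length(labels)
  list(
    log_prior = setNames(as.numeric(log(prior)), classes),
    log_lik   = log(counts / rowSums(counts)),  # each row divided by its total
    vocab     = vocab
  )
}

predict_nb <- function(model, text) {
  words <- tokenize(text)
  words <- words[words %in% model$vocab]  # ignore words unseen in training
  # Sum log-likelihoods per class; repeated words are counted each time.
  scores <- model$log_prior +
    rowSums(model$log_lik[, words, drop = FALSE])
  names(which.max(scores))
}

# Toy training data using the three labels described above.
blogs <- c(
  "I was diagnosed with diabetes last year",
  "My pain is getting worse every day",
  "I am feeling much better after the treatment",
  "The doctor confirmed I have asthma",
  "The symptoms are worse and the pain is severe",
  "Recovery is going well and I feel better"
)
labels <- c("Exists", "Deteriorate", "Recover",
            "Exists", "Deteriorate", "Recover")

model <- train_nb(blogs, labels)
print(predict_nb(model, "the pain is worse"))
print(predict_nb(model, "feeling better after treatment"))
```

With only six training sentences the model is a toy; real forum posts would need a far larger labelled set and probably stop-word removal.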

Dataset source: https://patient.info/ (for educational purposes only)

How to run this project?

  1. Clone this repository.
  2. Create a dataset consisting of two columns: Label and Blogs
    In the "Label" column, record the sentiment of the blog, i.e. Exists, Deteriorate or Recover.
    In the "Blogs" column, enter blogs from any online forum, or self-written blogs from other sources.
  3. Open R and run the entire script.
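The two-column layout from step 2 could look like the following sketch. The example rows, the CSV format and the file name `blogs.csv` are assumptions for illustration; the repository does not state how the data is stored.

```r
# Build a tiny dataset in the two-column layout described in step 2.
# The rows and the file name are made up for illustration.
blogs <- data.frame(
  Label = c("Exists", "Deteriorate", "Recover"),
  Blogs = c(
    "I was diagnosed with asthma two years ago.",
    "My symptoms have been getting worse this month.",
    "I feel much better since starting the new treatment."
  ),
  stringsAsFactors = FALSE
)

# Check that every label is one of the three expected categories.
valid_labels <- c("Exists", "Deteriorate", "Recover")
stopifnot(all(blogs$Label %in% valid_labels))

# Save and reload as CSV; the temporary path is a hypothetical choice.
csv_path <- file.path(tempdir(), "blogs.csv")
write.csv(blogs, csv_path, row.names = FALSE)
reloaded <- read.csv(csv_path, stringsAsFactors = FALSE)
str(reloaded)
```

Validating the labels before training catches misspellings such as "Recovered", which would otherwise become a fourth class.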

How to increase the accuracy?

  1. Increase or decrease the number of times the dataset is randomized; this can improve the accuracy by up to 10%.
  2. Try to label the dataset more accurately.
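As a rough sketch of what randomizing the dataset can mean in practice, the helper below shuffles the rows a chosen number of times. The function name `shuffle_rows`, the seed and the toy rows are assumptions, not taken from the repository. Different shuffles place different rows in the training set, which is one reason accuracy can vary between runs.

```r
# Shuffle the rows of a labelled dataset a chosen number of times.
# shuffle_rows() is a hypothetical helper, not part of the repository.
shuffle_rows <- function(data, times = 1, seed = 42) {
  set.seed(seed)  # fixed seed so that a run can be repeated exactly
  for (k in seq_len(times)) {
    data <- data[sample(nrow(data)), , drop = FALSE]
  }
  rownames(data) <- NULL
  data
}

# Toy data in the Label / Blogs layout (made-up rows).
toy <- data.frame(
  Label = c("Exists", "Deteriorate", "Recover", "Exists"),
  Blogs = c("blog a", "blog b", "blog c", "blog d"),
  stringsAsFactors = FALSE
)

shuffled <- shuffle_rows(toy, times = 3)
print(shuffled)
```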

Results

For the dataset considered, the results show the sentiment scores for each emotion (anger, anticipation, fear, ....)

[Screenshot, 2021-11-29 at 1:56:29 PM]

Accuracy for different proportions of training and testing data

[Screenshot, 2021-11-29 at 1:57:04 PM]

[Screenshot, 2021-11-29 at 1:57:17 PM]

Accuracy versus the proportion of the dataset used for training

[Screenshot, 2021-11-29 at 1:57:28 PM]
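The comparison of training proportions can be sketched as follows. To keep the example self-contained, a majority-class baseline stands in for the Naive Bayes model, so the numbers it prints say nothing about the project's actual results. The proportions (0.6, 0.7, 0.8), the synthetic labels and all function names are assumptions.

```r
# Accuracy for several training proportions.
# A majority-class baseline is used as a stand-in classifier here;
# it is NOT the repository's Naive Bayes model.
set.seed(1)
labels <- sample(c("Exists", "Deteriorate", "Recover"), 60, replace = TRUE)  # synthetic

# Randomly split the indices 1..n into a training and a testing part.
split_indices <- function(n, train_prop) {
  stopifnot(train_prop > 0, train_prop < 1)
  n_train <- round(n * train_prop)
  idx <- sample(n)
  list(train = head(idx, n_train), test = tail(idx, n - n_train))
}

# Fraction of predictions that match the true labels.
accuracy <- function(predicted, actual) mean(predicted == actual)

# Baseline: always predict the most frequent training label.
majority_class <- function(train_labels) {
  names(which.max(table(train_labels)))
}

proportions <- c(0.6, 0.7, 0.8)  # assumed values, not from the repository
results <- data.frame(train_prop = proportions, accuracy = NA_real_)
for (i in seq_along(proportions)) {
  parts <- split_indices(length(labels), proportions[i])
  guess <- majority_class(labels[parts$train])
  predicted <- rep(guess, length(parts$test))
  results$accuracy[i] <- accuracy(predicted, labels[parts$test])
}
print(results)
```

Swapping `majority_class` for a trained classifier keeps the rest of the loop unchanged, and the resulting table can be plotted as accuracy against `train_prop`.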
