Team-Project 4

The Couch Vibes - TV, Couch & Chill

Background & General Info

Television became a modern-day necessity towards the second half of the 20th century. Since the 1950s, it has only increased in terms of content, broadcasting networks, and hours of entertainment.

Today, American networks play thousands of different programs every day. Some programs/shows like Seinfeld, Friends, Breaking Bad, Game of Thrones, etc. go on to define an entire generation while some slip away into oblivion due to several reasons.

As a viewer, it also becomes very difficult to decide what to watch and wherewith the limited time that we have these days. For our final project "Couch Vibes", we used a supervised machine learning model to address these two points.

In our first supervised model "Show Success Predictor Model", we utilize different mood reactions in our dataset to predict the number of followers that a show can reach and whether it will achieve success or not. This model is particularly helpful for network executives to determine whether a show will achieve success or not based on reactions to its pilot.

In our second supervised model "Network Predictor Model", we suggest a network to the user based on their current mood. We utilize the same mood reactions as before as an input in our model. Additionally, we also provide users with a dropdown to select a preferred runtime, so that we can recommend 5 Popular shows for the suggested network. This model is helpful for viewers to determine what they want to see when they unwind at the end of the day.

Data sources:

For our website, we used TV Times Shows Dataset that has 2 csv files (all_episodes.csv and tvtimeshows.csv). All_episodes.csv had 413k rows and 14 columns.Tvtimeshows.csv had 19k rows and 10 columns. While going through our data we were able to decide that we will do 2 different machine learning models one for creators and one for audience and visualizations around the popular shows based on times watched, followers, runtime, either in prime time or out of prime time.

For our project, we had different data sets for each machine learning models and visualizations and to deploy our website in Heroku, we used a reduced dataset from model 2 for Heroku deployment.

Technologies

Python (Jupyter Notebook, Pandas)
Flask
SQL/ Postgres DB
JavaScript Libraries (D3, Plotly, Leaflet)
HTML/ CSS (Bootstrap)
Csv
Machine Learning
Tableau

Machine Learning - Model 1 - "The Show Success Predictor Model"

You can see the details below what we have done for the Show Creators:

Show success predictor
Predict the show success based on the moods of the show
Features: Mood of the show on the range of 1-100
Label: Number of followers of the show
Regression model

These are the steps:

Data Cleaning
- Removing extraneous columns
- Null values
- Duplicates
- Merging datasets by grouping shows
Individual scatter plots for each feature
Regularization: Coefficient plots with Linear Regression, Lasso, Ridge & ElasticNet
Function to test the familiar Regressors
Scaling the data – StandScaler & MinMaxScaler
Model Selection & Prediction
MSE & R-squared score: Lesson Learnt
Exported the model -> Flask implementation

Regularization: Coefficient plots with Linear Regression, Lasso, Ridge & ElasticNet

Scores of Show Success Predictor Model:

Model Comparison for Show Success Predictor is the Random Forest Regressor

Machine Learning - Model 2 - "The Network Predictor Model"

You can see the details below what we have done for the Network Model:

Network Predictor
Suggest the Network to watch based on the mood of the user
Features:Moods of the user in the range 1-100
Label: Network
Classification model

These are the steps:

Data Cleaning
- Removing extra columns, Null values, Duplicates
- Filtered based on specific runtime (15-90 mins), Top 10 US Networks, Number of episodes
- Removed special entries like sports events, live events, awards, etc.
- Removed all entries where all mood reactions were 0
- Removed data related to Netflix as it was skewing our entire data and model due to several entries
- Merging datasets by grouping shows
Label Encoding of Labels (Networks) and MinMaxScaler scaling (between 1-100) of input data (Moods)
Splitting Training & Testing Data
Hyper tuning parameters, Model, Fit & Predict Classification models
Exported the model -> Flask implementation
Reverse Label Encoding in Flask to display Network Name

Scores of Network Predictor Model:

Model Comparison for Network Predictor is Extra Trees Classifier (Hyper tuned model)

Deploy

In visual studio code using javascript, we used our data set as cvs files.

We later deployed flask app to run our models on local server. We also used SQL database for Network Predictor Model for tv shows to create tables and reduced dataset.

We were able to create visualizations in Tableau, embed a Tableau story on to a HTML for our website.

Website Design

We created one Landing page and 4 other pages using Bootstrap and CSS as following:

Landing page with 4 links to other pages.
Network page, which gives network suggestion based on model. Uses sliders and dropdown for parameters.
Followers page, which predicts whether a show will be successful or note based on model. Uses sliders for parameters.
Visualizations page that displays analysis using Tableau visualizations.
Team page that has team information.

Landing page

Visualizations & Analysis

We created several visualizations and made one story in Tableau to show TV Broadcasting Trends in USA. We used different dataset for our visualizations which contains 24 columns and 81k rows.We did not clean up visualization to see whatever available as visualizations from 1927 to 2020. We used group, set and parameter function to have concise graphs. We used trend lines, charts, treemaps for our graphs. Based on our data set, these are the visualizations that we created in Tableau:

The trends lines of TV consumption over the years in USA based on times watched and followers
The trend lines by top channels over the year as per followers, filter option by year slider
The most watched shows by channels,filter option by prime time in or out
The Flash (2014) - The most watched show review: The trends of total mood and Times Watched by episode and trend lines of total reaction to the show.
Top channels and shows in prime time (8-11pm) as per times watched in bar chart and followers in line graph.
The most popular shows by channel as per runtime - out of prime time
Top 10 shows based on prime time (between 8-11pm) by runtime group (15-30 min, 31-45 min and 46-60min). All of shows are in the CW between 31-45 min runtime in prime time slot.
Filter by run time and year, find the shows. Run time filter by 20-90 mins.

You can visit our website to see more visualizations.

Heroku

Couch Vibes Heroku Link. We were able to deploy our app to Heroku.

Lessons Learned

Getting runtime dropdown in the network predictor page was a challenge. The JavaScript code was complicated and needed to filter network and runtime parameters and sort it based on the followers to get the show name. In the end we wrote a sql query that would pass our data in a descending sort order based on followers. We then wrote a javascript part where then filtered based on the other two parameters.
In terms of Machine Learning, we learnt that when data of one category is enormous, it'll skew the model and will always predict that category. In our case it was Netflix that had varied data that we had to exclude to make our model dataset balanced.
We had quite an issue while deploying on Heroku as our model files (in joblib formats) were very large in size and we were getting error in heroku about pickle and unpickling. We compressed our model files using bz2file python library and modified our requirements file for heroku with bz2file and pickle library. This enabled us to deploy our app on heorku.

Inspiration

Our inspiration for this project were Rutgers Data Science Bootcamp and Good TV Shows.

Name		Name	Last commit message	Last commit date
Latest commit History 61 Commits
Data		Data
Readme-images		Readme-images
US_Data		US_Data
static		static
templates		templates
.gitattributes		.gitattributes
.gitignore		.gitignore
DataCleaning_ShowSuccess.ipynb		DataCleaning_ShowSuccess.ipynb
ElasticNet Coefficient Plot.png		ElasticNet Coefficient Plot.png
Procfile		Procfile
README.md		README.md
Rutgers DS Bootcamp - The Couch Vbes.pptx		Rutgers DS Bootcamp - The Couch Vbes.pptx
ShowSuccess_Predictor.ipynb		ShowSuccess_Predictor.ipynb
app.py		app.py
config.py		config.py
network_predictor.h5		network_predictor.h5
network_predictor.joblib		network_predictor.joblib
network_predictor.pbz2		network_predictor.pbz2
requirements.txt		requirements.txt
setup_db.ipynb		setup_db.ipynb
setup_db.sql		setup_db.sql
show-predictor.ipynb		show-predictor.ipynb
show-predictor1.ipynb		show-predictor1.ipynb
showsuccess_predictor.pbz2		showsuccess_predictor.pbz2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Team-Project 4

The Couch Vibes - TV, Couch & Chill

Table of Contents

Background & General Info

Data sources:

Technologies

Machine Learning - Model 1 - "The Show Success Predictor Model"

Scores of Show Success Predictor Model:

Machine Learning - Model 2 - "The Network Predictor Model"

Scores of Network Predictor Model:

Deploy

Website Design

Visualizations & Analysis

Heroku

Lessons Learned

Inspiration

Team Members

About

Uh oh!

Releases

Packages

Contributors 5

Uh oh!

Languages

meetk5/Team-Project4

Folders and files

Latest commit

History

Repository files navigation

Team-Project 4

The Couch Vibes - TV, Couch & Chill

Table of Contents

Background & General Info

Data sources:

Technologies

Machine Learning - Model 1 - "The Show Success Predictor Model"

Scores of Show Success Predictor Model:

Machine Learning - Model 2 - "The Network Predictor Model"

Scores of Network Predictor Model:

Deploy

Website Design

Visualizations & Analysis

Heroku

Lessons Learned

Inspiration

Team Members

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Uh oh!

Languages

Packages