DSCI 691 NLP Group Project

That's What Who Said

This repo contains the code for a collaborative project among three Drexel University graduate students who sought to create a tool that identifies speakers in multi-party dialogues using text-based features.

Our motivation was to determine `who said what`.

It's important to note that the FCC requirements for speaker identification in closed captioning are intended to ensure accessibility and equal participation for individuals with hearing impairments. When speakers are identified accurately, viewers who rely on closed captioning can better understand and follow conversations, which enhances their overall viewing experience.

Description

We conducted five experiments using two pre-trained transformer-based models (DistilBERT and RoBERTa) to predict whether the speaker of a line of dialogue from the television show The Office was "Dwight" or "Not Dwight".

THE QUESTION:

"Dwight" or "Not Dwight"?
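As a rough illustration of how this question becomes a binary classification target, the sketch below maps (speaker, line) pairs to 1 for "Dwight" and 0 for "Not Dwight". The function name `label_lines`, the constant `TARGET_SPEAKER`, and the sample rows are assumptions made for illustration; they are not taken from the project's notebooks or dataset.

```python
from typing import Iterable

# Positive class named in the project description.
TARGET_SPEAKER = "Dwight"


def label_lines(rows: Iterable[tuple[str, str]]) -> list[tuple[str, int]]:
    """Map (speaker, line) pairs to (line, label) pairs.

    Label 1 means "Dwight", label 0 means "Not Dwight". Speaker names are
    compared case-insensitively after trimming whitespace (an assumption
    about how the raw speaker column might need cleaning).
    """
    labeled = []
    for speaker, line in rows:
        is_target = speaker.strip().lower() == TARGET_SPEAKER.lower()
        labeled.append((line, int(is_target)))
    return labeled


# Hypothetical placeholder rows, not real lines from the show or the dataset.
sample_rows = [
    ("Dwight", "Placeholder line one."),
    ("Jim", "Placeholder line two."),
    (" dwight ", "Placeholder line three."),
]
labeled_sample = label_lines(sample_rows)
```

Pairs like these could then be tokenized and passed to a sequence classification model; that step is omitted here because the project's actual training setup lives in its notebooks.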

TASK
DATA
DATA PREPROCESSING & VISUALIZATION
MACHINE LEARNING
PROJECT REPORT

Installation & Usage

To rerun this project:

1. Clone this repository.
2. Set up your project directories as per the file tree (below).
3. Model files for all five models total 4.9 GB, so download with caution.
4. Step through the 02_transformer_model.ipynb notebook.

Credits

Thanks to the good people at:

Deepnote

Kaggle

Hugging Face

Our Team

Kelsey Fox

GitHub

Justin Minnion

LinkedIn profile
GitHub

Chris Chavez

LinkedIn profile
GitHub

License

Please refer to:

Hugging Face Privacy Policy, for the terms governing use of Hugging Face's products.

Kaggle Privacy Policy, for the terms governing use of Kaggle's products.
