This repository contains the code and materials for our Machine Learning project, "Multi-Class Prediction of Obesity Risk". In this project, we predict obesity risk using a multi-class classification approach. The work covers exploratory data analysis (EDA), feature engineering, building predictive models with a diverse set of machine learning algorithms, and selecting the best model for our pipeline.
- Exploratory Data Analysis (EDA): We thoroughly examined the dataset to understand its underlying patterns and distributions.
- Feature Engineering: We engineered relevant features to enhance the predictive power of our models.
- Modeling: We implemented multiple machine learning models, including Logistic Regression, Decision Tree, Random Forest, SVC, KNN Classifier, XGBoost, LGBM, CatBoost, and AdaBoost, and chose the best one.
- Pipeline: We utilized a pipeline to streamline our machine learning workflow and ensure reproducibility.
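
The notebooks contain the actual implementation. As a rough illustration of the modeling and pipeline steps above, the following minimal sketch compares a few candidate classifiers inside a scikit-learn `Pipeline` using cross-validation and then refits the best one. It makes several assumptions: the data is synthetic (not the competition dataset), the number of classes and all hyperparameters are arbitrary, only three of the listed models are included, and the names `compare_models` and `candidates` are hypothetical and do not come from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, candidates, n_splits=5, seed=0):
    """Return the mean cross-validated accuracy of each candidate model."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = {}
    for name, model in candidates.items():
        # Scaling lives inside the pipeline, so it is refit on each training fold.
        pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
        fold_scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


# Synthetic stand-in data (assumption): 4 classes chosen arbitrarily.
X, y = make_classification(
    n_samples=600,
    n_features=12,
    n_informative=8,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Hypothetical subset of the models listed above, with illustrative settings.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = compare_models(X, y, candidates)
best_name = max(scores, key=scores.get)

# Refit the winning pipeline on all available data.
best_pipeline = Pipeline(
    [("scale", StandardScaler()), ("model", candidates[best_name])]
)
best_pipeline.fit(X, y)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```

Keeping preprocessing inside the pipeline means every model is evaluated under the same conditions and no information from the validation folds leaks into the scaler. Other models such as XGBoost or CatBoost could be added to `candidates` in the same way, provided those libraries are installed.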
- Data/: Contains the dataset used in the project, the submission file, and the presentation slides summarizing our project findings.
- Preprocessing/: Unfinished Jupyter notebooks containing the code for EDA, feature engineering, and modeling, plus a picture of our submission to the Kaggle competition.
- Multi_Class Prediction of Obesity Risk: Jupyter notebooks containing the code for EDA, feature engineering, and modeling.
- README.md: You are here! It provides an overview of the project and instructions for replicating our work.
To replicate our project, follow these steps:
- Clone this repository to your local machine.
- Navigate to the Multi_Class Prediction of Obesity Risk Jupyter notebook.
- Open the Jupyter notebooks and execute the code cells sequentially.
- Refer to the presentation slides in the Data/ directory for a summary of our findings.
- Kaggle: [Kaggle Competition]
This project was created by: