Tip Genie: NYC Taxi Tip Prediction using Scikit-Learn and Snap ML

In this exercise session, we consolidated our machine learning (ML) modelling skills by using a popular regression model, the Decision Tree, to predict taxi tips. We trained the model on a real dataset. The dataset includes information about taxi tips and was collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP). We used the trained model to predict the amount of tips paid.

In the current exercise session, we practised the Scikit-Learn Python interface and the Python API offered by the Snap Machine Learning (Snap ML) library.

Scikit-learn is a free, open-source and popular machine-learning library for Python. It features various classification, regression and clustering algorithms including support-vector machines, random forests, and k-means. It is easy to use and has a well-documented API. It is constantly being updated with new features and algorithms. This makes it a valuable tool for data scientists and machine learning practitioners.

Snap ML is a high-performance IBM library for ML modelling. It provides highly efficient CPU/GPU implementations of linear models and tree-based models. Snap ML not only accelerates ML algorithms through system awareness, but also offers novel ML algorithms with best-in-class accuracy.

It was exciting to learn how to use a Decision Tree regression model to predict taxi tips. We believe that this knowledge will be valuable in our future careers as data scientists. Practising the Scikit-Learn Python interface and the Snap ML library also helped us become more proficient ML modellers. For more information, please visit the Snap ML information page and the Scikit-Learn information page.

Objectives

Perform basic data preprocessing using Scikit-Learn
Model a regression task using the Scikit-Learn and Snap ML Python APIs
Train a Decision Tree Regressor model using Scikit-Learn and Snap ML
Run inference and assess the quality of the trained models
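
The objectives above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic (it is not the TLC dataset), the feature columns and the tip formula are hypothetical, and the hyperparameters are arbitrary. The Snap ML part assumes the `snapml` package is installed and is skipped otherwise.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the taxi data (NOT the real TLC dataset).
# The three columns are hypothetical: trip distance, fare amount, passengers.
rng = np.random.default_rng(0)
n_trips = 2000
X = np.column_stack([
    rng.uniform(0.5, 20.0, n_trips),
    rng.uniform(3.0, 80.0, n_trips),
    rng.integers(1, 5, n_trips),
]).astype(np.float32)
# Hypothetical target: the tip is about 18% of the fare plus some noise.
y = (0.18 * X[:, 1] + rng.normal(0.0, 0.5, n_trips)).astype(np.float32)

# Basic preprocessing: hold out 20% for testing, then standardise features.
# Trees do not need scaling; it is shown only to illustrate the API.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler().fit(X_train)  # fit on the training split only
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Train a Decision Tree Regressor with Scikit-Learn and run inference.
sk_tree = DecisionTreeRegressor(max_depth=8, random_state=35)
sk_tree.fit(X_train, y_train)
sk_mse = mean_squared_error(y_test, sk_tree.predict(X_test))

# Reference point: always predicting the mean tip of the training split.
baseline_mse = mean_squared_error(y_test, np.full_like(y_test, y_train.mean()))
print(f"Scikit-Learn MSE: {sk_mse:.3f} (mean baseline: {baseline_mse:.3f})")

# Snap ML offers an estimator with the same name and a similar interface.
try:
    from snapml import DecisionTreeRegressor as SnapDecisionTreeRegressor
except ImportError:
    SnapDecisionTreeRegressor = None

snap_mse = None
if SnapDecisionTreeRegressor is not None:
    snap_tree = SnapDecisionTreeRegressor(max_depth=8, random_state=45, n_jobs=4)
    snap_tree.fit(X_train, y_train)
    snap_mse = mean_squared_error(y_test, snap_tree.predict(X_test))
    print(f"Snap ML MSE: {snap_mse:.3f}")
```

Comparing the test MSE against the mean-prediction baseline is one simple way to assess model quality; on the real data, the two libraries can also be compared on training time.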

Dataset

You can download the dataset from here.

AdrijeGuha/Taxi-Tip-Prediction
