PyTorch Project Template, Computer Vision


This is a template for a PyTorch project covering training, testing, inference demos, and FastAPI serving, along with Docker support.

Setup

Use a Python venv or a conda environment to install the requirements:

  • Install full train requirements: pip install -r requirements/train.txt
  • Install minimal inference requirements: pip install -r requirements/inference.txt

Train

Example training for mnist digit classification:

python train.py --cfg configs/mnist_config.yaml

Custom Training

Data Preparation

Place the training data inside the data directory in the following structure:

data
└── SOURCE_DATASET
    ├── CLASS 1
    │   ├── img1
    │   ├── img2
    │   └── ...
    └── CLASS 2
        ├── img1
        ├── img2
        └── ...
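The layout above can be sanity-checked with a short script before training. This is a minimal sketch using only the standard library; scan_dataset is a hypothetical helper, not part of this repo:

```python
from pathlib import Path
import tempfile

def scan_dataset(root: str) -> dict[str, int]:
    """Map each class directory name to its image count (recursive,
    so nested subdirectories are also counted)."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for p in class_dir.rglob("*") if p.is_file())
    return counts

# Build a throwaway dataset matching the layout above, then scan it.
with tempfile.TemporaryDirectory() as tmp:
    for cls in ("CLASS_1", "CLASS_2"):
        d = Path(tmp) / cls
        d.mkdir()
        for name in ("img1.jpg", "img2.jpg"):
            (d / name).touch()
    print(scan_dataset(tmp))  # {'CLASS_1': 2, 'CLASS_2': 2}
```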

Note: ImageNet-style nesting (class_dir -> subdirs -> ... -> images) is also supported.

# generate an id to name classmap
python scripts/generate_classmap_from_dataset.py --sd data/SOURCE_DATASET --mp data/ID_2_CLASSNAME_MAP_TXT_FILE

# create train val test split, also creates an index to classname mapping txt file
python scripts/train_val_test_split.py --rd data/SOURCE_DATASET --td data/SOURCE_DATASET_SPLIT --vs VAL_SPLIT_FRAC -ts TEST_SPLIT_FRAC

# OPTIONAL duplicate train data if necessary
python scripts/duplicate_data.py --rd data/SOURCE_DATASET_SPLIT/train --td data/SOURCE_DATASET_SPLIT/train -n TARGET_NUMBER

# create a custom config file based on configs/classifier_cpu_config.yaml and modify train parameters
cp configs/classifier_cpu_config.yaml configs/custom_classifier_cpu_config.yaml
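Conceptually, the classmap step assigns ids to the class directories in sorted order and writes one "<id> <classname>" line per class. The sketch below illustrates that idea with the standard library only; the exact file format produced by scripts/generate_classmap_from_dataset.py is an assumption, and write_classmap is a hypothetical helper:

```python
from pathlib import Path
import tempfile

def write_classmap(source_dir: str, map_path) -> None:
    """Write one '<id> <classname>' line per class directory,
    with ids assigned by sorted directory order."""
    classes = sorted(p.name for p in Path(source_dir).iterdir() if p.is_dir())
    with open(map_path, "w") as f:
        for idx, name in enumerate(classes):
            f.write(f"{idx} {name}\n")

with tempfile.TemporaryDirectory() as tmp:
    for cls in ("cat", "dog"):
        (Path(tmp) / cls).mkdir()
    map_file = Path(tmp) / "classmap.txt"
    write_classmap(tmp, map_file)
    print(map_file.read_text())  # prints "0 cat" and "1 dog" on separate lines
```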

Example Training: Image Classification

Sample data for the custom image-classification training can be downloaded from https://www.kaggle.com/datasets/umairshahpirzada/birds-20-species-image-classification.

# train on custom data with custom config
python train.py --cfg configs/custom_classifier_cpu_config.yaml

WebDataset for large-scale training

Convert an existing dataset into the tar archive format used by WebDataset. The data directory must match the structure shown above.

# ID_2_CLASSNAME_MAP_TXT_FILE is generated using the scripts/train_val_test_split.py file
# convert train/val/test splits into tar archives
python scripts/convert_dataset_to_tar.py --sd data/SOURCE_DATA_SPLIT --td data/TARGET_TAR_SPLIT.tar --mp ID_2_CLASSNAME_MAP_TXT_FILE
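A minimal sketch of the packing step, assuming the WebDataset convention that each sample is a group of tar members sharing a basename key (e.g. 000000.jpg for the image bytes plus 000000.cls for the label id). The helper name, key format, and extensions here are assumptions; the repo's scripts/convert_dataset_to_tar.py is the authoritative implementation:

```python
import io
import tarfile
import tempfile
from pathlib import Path

def pack_webdataset_tar(split_dir: str, tar_path: str, class_to_id: dict) -> int:
    """Pack a class-folder dataset into a WebDataset-style tar where each
    sample is a pair of members sharing a key: '<key>.jpg' (image bytes)
    and '<key>.cls' (label id as ASCII). Returns the sample count."""
    n = 0
    with tarfile.open(tar_path, "w") as tar:
        for class_dir in sorted(Path(split_dir).iterdir()):
            if not class_dir.is_dir():
                continue
            label = str(class_to_id[class_dir.name]).encode()
            for img in sorted(class_dir.rglob("*.jpg")):
                key = f"{n:06d}"
                tar.add(img, arcname=f"{key}.jpg")
                info = tarfile.TarInfo(f"{key}.cls")
                info.size = len(label)
                tar.addfile(info, io.BytesIO(label))
                n += 1
    return n

with tempfile.TemporaryDirectory() as tmp:
    for cls in ("cat", "dog"):
        d = Path(tmp) / cls
        d.mkdir()
        (d / "img1.jpg").write_bytes(b"fake-image-bytes")
    tar_file = str(Path(tmp) / "train.tar")
    print(pack_webdataset_tar(tmp, tar_file, {"cat": 0, "dog": 1}))  # 2
```

Keeping the image and its label adjacent in the archive is what lets WebDataset stream samples sequentially without random access.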

An example configuration for training with the WebDataset format is provided in configs/classifier_webdataset_cpu_config.yaml.

# example training with webdataset tar data format
python train.py --cfg configs/classifier_webdataset_cpu_config.yaml

Test

Run testing based on CONFIG_FILE. By default, testing is done for MNIST classification.

python test.py --cfg CONFIG_FILE

Export

python export.py --cfg CONFIG_FILE -r MODEL_PATH --mode <"ONNX_TS"/"ONNX_DYNAMO"/"TS_TRACE"/"TS_SCRIPT">

TensorBoard logging

All TensorBoard logs are saved to the directory given by the tensorboard_log_dir setting in the config file. Logs include train/val epoch accuracy and loss, the model graph, and preprocessed images per epoch.

To start a TensorBoard server that reads logs from the experiment directory and serves on localhost port 6006:

tensorboard --logdir=TF_LOG_DIR --port=6006

Inference

Docker

Install Docker on the system first.

Training and testing

bash scripts/build_docker.sh  # builds the docker image
bash scripts/run_docker.sh    # runs the image built above, creating a shared volume checkpoint_docker outside the container
# inside the docker container
python train.py

To use GPUs inside Docker for training/testing, pass the --gpus flag:

--gpus device=0,1    # use specific devices
--gpus all           # use all available devices

Serving the model with FastAPI

bash server/build_server_docker.sh -m pytorch/onnx
bash server/run_server_docker.sh -h/--http 8080

Utility functions

Clean cached builds, __pycache__ directories, .DS_Store files, etc.:

bash scripts/cleanup.sh

Count the number of files in the sub-directories of PATH:

bash scripts/count_files.sh PATH

Profiling PyTorch

Acknowledgements
