# BraTS Challenge - MLCube integration - preprocess

Original implementation: ["BraTS Instructions Repo"](https://github.com/BraTS/Instructions)

## Dataset

Please refer to the [BraTS challenge page](http://braintumorsegmentation.org/) and follow the instructions in the data section.

## Project setup

```bash
# Create a Python environment and install the MLCube Docker runner
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker

# Fetch the examples repository and check out the BraTS branch
git clone https://github.com/mlcommons/mlcube_examples && cd ./mlcube_examples
git fetch origin pull/39/head:feature/brats && git checkout feature/brats
cd ./brats/preprocessing/mlcube
```

## Important files

These are the most important files in this project:

```bash
├── mlcube
│   ├── mlcube.yaml                        # MLCube configuration file: defines the project, author, platform, docker and tasks
│   └── workspace
│       ├── data
│       │   └── BraTS_example_seg.nii.gz   # Input data
│       ├── results
│       │   └── output.npy                 # Output processed data
│       └── parameters.yaml                # Extra parameters
└── project
    ├── Dockerfile                         # Docker file with instructions to create the image for the project
    ├── preprocess.py                      # Python file that contains the main logic of the project
    ├── mlcube.py                          # Python entrypoint used by MLCube, contains the logic for MLCube tasks
    ├── requirements.txt                   # Python requirements needed to run the project inside Docker
    └── run.sh                             # Bash file containing the logic to call the preprocess.py script
```

## How to modify this project

You can change each of the files described above in order to add your own implementation.

### Requirements file

In this file (`requirements.txt`) you can add all the Python dependencies needed to run your implementation. These dependencies are installed during the creation of the Docker image, which happens when you run the `mlcube run ...` command.

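For example, a preprocessing implementation built on NumPy and NiBabel (an illustrative choice, not the project's actual dependency list) would declare:

```text
numpy
nibabel
PyYAML
```
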
### Dockerfile

You can use either the CPU or the GPU version of the dockerfile (`Dockerfile_CPU`, `Dockerfile_GPU`). You can also add or modify any steps inside the file; this comes in handy when you need to install OS dependencies or change the base Docker image. Inside the file you can find some information about the existing steps.

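As a rough sketch, a CPU Dockerfile for a project like this usually follows the shape below (the base image and paths are illustrative; refer to the actual file in the repo):

```dockerfile
# Illustrative base image; the real file may use a different one
FROM python:3.9-slim

# Install the Python dependencies first so Docker can cache this layer
COPY requirements.txt /project/requirements.txt
RUN pip install --no-cache-dir -r /project/requirements.txt

# Copy the project code and set the MLCube entrypoint
COPY . /project
WORKDIR /project
ENTRYPOINT ["python3", "/project/mlcube.py"]
```
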
### Parameters file

This YAML file (`parameters.yaml`) contains all the extra parameters that aren't files or directories; for example, here you can place the hyperparameters that you would use for training a model. This file is passed as an **input parameter** to the MLCube tasks and read inside the MLCube container.

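As an illustration, a `parameters.yaml` for a preprocessing cube could look like this (the keys below are hypothetical, chosen only to show the format):

```yaml
# Hypothetical preprocessing parameters
normalize: true
target_spacing: [1.0, 1.0, 1.0]
output_dtype: float32
```
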
### MLCube yaml file

In this file (`mlcube.yaml`) you can find the instructions about the Docker image and platform that will be used, information about the project (name, description, authors), and the tasks defined for the project.

The existing implementation defines one task:

* preprocess:

  This task takes the following parameters:

  * Input parameters:
    * data_path: Folder path containing the input data
    * parameters_file: Extra parameters
  * Output parameters:
    * output_path: File path where the preprocessed output will be stored

  This task takes the input data, performs the preprocessing steps and then saves the result in the output_path.

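In `mlcube.yaml`, a task of this kind is typically declared as follows (the parameter names and paths below are illustrative; check the actual file for the exact keys):

```yaml
tasks:
  preprocess:
    parameters:
      inputs:
        data_path: data/
        parameters_file: parameters.yaml
      outputs:
        output_path: results/output.npy
```
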
### MLCube python file

The `mlcube.py` file is the handler file and entrypoint described in the Dockerfile; here you can find all the logic related to how each MLCube task is processed. If you want to add a new task, first you must define it inside the `mlcube.yaml` file with its input and output parameters, and then you need to add the logic to handle this new task inside the `mlcube.py` file.

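The handler usually maps each task name declared in `mlcube.yaml` to a function. A minimal sketch of that dispatch pattern (the function and parameter names are illustrative, not the project's actual API):

```python
def preprocess_task(data_path, parameters_file, output_path):
    """Hypothetical handler: the real file would invoke run.sh here."""
    return f"preprocess {data_path} -> {output_path}"

# One entry per task declared in mlcube.yaml
TASK_HANDLERS = {"preprocess": preprocess_task}

def dispatch(task, **params):
    """Route an MLCube task name to its handler function."""
    if task not in TASK_HANDLERS:
        raise ValueError(f"Unknown MLCube task: {task}")
    return TASK_HANDLERS[task](**params)
```

Adding a new task then amounts to declaring it in `mlcube.yaml` and registering one more entry in the handler map.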
### Preprocess file

The `preprocess.py` file contains the main logic of the project. You can modify this file and write your implementation here to perform the different preprocessing steps. This preprocess file is called from the `run.sh` file; there are other ways to link your implementation, as shown in the [MLCube examples repo](https://github.com/mlcommons/mlcube_examples).

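A preprocessing step here could be as simple as normalizing a volume before saving it as `.npy`. The sketch below shows only the array-processing part under that assumption; in the real project the array would come from the input NIfTI file (e.g. loaded with nibabel) and the result would be written with `np.save()`:

```python
import numpy as np

def preprocess_volume(volume: np.ndarray) -> np.ndarray:
    """Hypothetical step: min-max normalize a volume to [0, 1]."""
    vmin, vmax = float(volume.min()), float(volume.max())
    if vmax == vmin:  # constant volume: avoid division by zero
        return np.zeros_like(volume, dtype=np.float32)
    return ((volume - vmin) / (vmax - vmin)).astype(np.float32)
```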
### Run bash file

The `run.sh` file is called from `mlcube.py` and receives the arguments; here we can perform different steps before calling the `preprocess.py` script.

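A minimal `run.sh` following this pattern could look like the sketch below. The argument names and defaults are illustrative, and the final command is only echoed here; the real script would execute it:

```bash
#!/bin/bash
# Hypothetical run.sh sketch: forward the arguments received from
# mlcube.py to the preprocess.py script.
set -e

DATA_PATH=${1:-data/}
PARAMETERS_FILE=${2:-parameters.yaml}
OUTPUT_PATH=${3:-results/output.npy}

# Build the command; the real script would run it instead of echoing
CMD="python3 preprocess.py --data_path ${DATA_PATH} --parameters_file ${PARAMETERS_FILE} --output_path ${OUTPUT_PATH}"
echo "${CMD}"
```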
## Tasks execution

```bash
# Run the preprocess task
mlcube run --mlcube=mlcube_cpu.yaml --task=preprocess
```

We are targeting pull-type installation, so MLCube images should be available on Docker Hub. If not, try this:

```bash
mlcube run ... -Pdocker.build_strategy=always
```