createchmjob.py
createchmjob.py is a tool that makes it easy to process a large stack of images through CHM using HPC resources such as Comet and Rocce.
Example usage:
createchmjob.py ./images ./trainedmodel ./run --chmbin /bin/chm.img \
--cluster rocce
CHM requires two phases of processing. In the first phase, CHM tasks are run, each of which analyzes a set of tiles from the input images. Tiling parallelizes the processing and reduces the memory footprint of CHM, which becomes very large for tiles larger than 1,000x1,000. For example, tiles of 500x500 easily use 4 to 6 gigabytes of RAM. The tiles are stored on the filesystem under /chmrun/tiles/<image.png> directories.
In the second phase, merge tasks are run, which combine the tiles into what are known as probability maps. Probability maps are simply greyscale 8-bit images (values 0-255) of the same size as the input images, where the intensity of each pixel correlates with the probability that it belongs to the feature the model was trained on. The probability maps are stored in the /chmrun/probmaps directory.
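To make the two phases concrete, the following is a minimal sketch of the arithmetic involved. It is not taken from chmutil: the function names are hypothetical, the default tile size (512x512) and tiles per task (50) are borrowed from the sample output later on this page, the 4096x4096 image size is made up, and the linear probability-to-intensity mapping is an assumption, since the text only says that intensity correlates with probability.

```python
import math


def count_tiles(width, height, tile_w=512, tile_h=512):
    """Number of non-overlapping tiles needed to cover a width x height image."""
    return math.ceil(width / tile_w) * math.ceil(height / tile_h)


def count_tasks(num_tiles, tiles_per_task=50):
    """Number of CHM tasks if each task processes up to tiles_per_task tiles."""
    return math.ceil(num_tiles / tiles_per_task)


def probability_to_intensity(p):
    """Map a probability in [0, 1] to an 8-bit greyscale value (assumed linear)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return round(p * 255)


# Hypothetical 4096x4096 image: 8 x 8 = 64 tiles, split across 2 tasks.
tiles = count_tiles(4096, 4096)
print(tiles, count_tasks(tiles))
```

Smaller tiles lower the memory needed per task at the cost of more tiles to merge in the second phase.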
This example runs CHM on Rocce using a pre-trained model.
Open a terminal and connect to Rocce via ssh. Replace <USER> with your username:
ssh <USER>@rocce.ucsd.edu
cd /data/<USER>
mkdir -p testchm/images
cd testchm
wget https://github.com/wiki/CRBS/chmutil/data/model.tar.gz
tar -zxf model.tar.gz
cd images
wget https://github.com/wiki/CRBS/chmutil/images/bubbles_gray.png
wget https://github.com/wiki/CRBS/chmutil/images/bubbles_gray2.png
cd ..
If the above is successful, running the tree command should output the following:
tree
.
|-- images
|   |-- bubbles_gray.png
|   `-- bubbles_gray2.png
|-- model
|   |-- MODEL_level0_stage1.mat
|   |-- MODEL_level0_stage2.mat
|   |-- MODEL_level1_stage1.mat
|   |-- output_level0_stage1
|   |   |-- x.000.mat
|   |   `-- x.001.mat
|   |-- output_level0_stage2
|   |   |-- x.000.mat
|   |   `-- x.001.mat
|   |-- output_level1_stage1
|   |   |-- x.000.mat
|   |   `-- x.001.mat
|   `-- param.mat
`-- model.tar.gz
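If tree is not installed, the same layout can be checked with a few lines of Python. This is only a sketch: check_layout is a hypothetical helper, and it tests just two things visible in the listing above (PNG files under images/ and a param.mat under model/), not everything CHM may require of a trained model.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of problems found in the test directory layout."""
    root = Path(root)
    problems = []
    # The example expects input images as .png files under images/.
    if not sorted((root / "images").glob("*.png")):
        problems.append("no .png files under images/")
    # The extracted model directory contains param.mat in the listing above.
    if not (root / "model" / "param.mat").is_file():
        problems.append("model/param.mat is missing")
    return problems
```

Calling check_layout("/data/<USER>/testchm") should return an empty list when the downloads and extraction succeeded.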
In this step we run createchmjob.py, which creates the files and directories needed to run the CHM job under the directory given as the 3rd argument of the command (in this case ./chmrun). The files and directories it creates are documented separately.
cd /data/<USER>/testchm
createchmjob.py ./images/ ./model ./chmrun \
--chmbin <path to chm singularity image> \
--disablechmhisteq --cluster rocce
Should output something similar to the following:
Run this to submit job
/home/<USER>/miniconda2/bin/checkchmjob.py "/data/<USER>/testchm/chmrun" --submit
Do what the output says and run the above command:
/home/<USER>/miniconda2/bin/checkchmjob.py "/data/<USER>/testchm/chmrun" --submit
Should output something similar to the following:
Analyzing job. This may take a minute...
chmutil version: 0.8.0
Tiles: 512x512 with 0x0 overlap
Disable histogram equalization in CHM: True
Tasks: 50 tiles per task, 1 tasks(s) per node
Trained CHM model: /data/<USER>/testchm/model
CHM binary: /data/<USER>/chm_s22.img
CHM tasks: 0% complete (0 of 2 completed)
Merge tasks: 0% complete (0 of 2 completed)
Run this:
cd "/data/<USER>/testchm/chmrun";qsub -t 1-2 runjobs.rocce
Run the command from the previous step:
cd "/data/<USER>/testchm/chmrun";qsub -t 1-2 runjobs.rocce
Should output something similar to the following:
Your job-array 12725.1-2:1 ("chmjob") has been submitted
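When scripting these steps, the job ID can be pulled out of the qsub message so it can be matched against qstat later. The sketch below assumes the message has the same shape as the sample line above; parse_job_id is a hypothetical helper, not part of chmutil.

```python
import re


def parse_job_id(qsub_output):
    """Extract the numeric job ID from a 'Your job-array ...' qsub message."""
    match = re.search(r"Your job-array (\d+)\.", qsub_output)
    if match is None:
        raise ValueError("no job-array ID found in qsub output")
    return match.group(1)


print(parse_job_id('Your job-array 12725.1-2:1 ("chmjob") has been submitted'))
```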
Wait for the job to complete by invoking qstat and verifying that all jobs with the ID from the line above (12725 in this case) are no longer listed or have state 'c' for complete.
Example:
qstat
Should output something similar to the following if jobs have NOT completed:
job-ID  prior    name    user    state  submit/start at      queue                    slots  ja-task-ID
---------------------------------------------------------------------------------------------------------
12725   0.50500  chmjob  churas  r      06/15/2017 15:35:36  all.q@compute-0-7.local  1      1
12725   0.50500  chmjob  churas  r      06/15/2017 15:35:36  all.q@compute-0-3.local  1      2
If the jobs are done, qstat will NOT output anything.
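Rather than rerunning qstat by hand, the wait can be automated. The following is a sketch under stated assumptions: it takes the job ID to be the first whitespace-separated column and the state to be the fifth, as in the sample output above, and the helper names and the 60 second polling interval are arbitrary choices, not chmutil behavior.

```python
import subprocess
import time


def active_job_ids(qstat_output):
    """Collect IDs of jobs that are listed and not in state 'c' (complete)."""
    ids = set()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Header and separator lines do not start with a numeric job ID.
        if not fields or not fields[0].isdigit():
            continue
        # Assumed column layout: the fifth column holds the job state.
        if len(fields) > 4 and fields[4].lower() == "c":
            continue
        ids.add(fields[0])
    return ids


def run_qstat():
    """Run qstat and return its standard output as text."""
    result = subprocess.run(["qstat"], capture_output=True, text=True, check=True)
    return result.stdout


def wait_for_job(job_id, poll_seconds=60, qstat=run_qstat, sleep=time.sleep):
    """Block until job_id no longer appears as active in qstat output."""
    while job_id in active_job_ids(qstat()):
        sleep(poll_seconds)
```

On the cluster, wait_for_job("12725") would return once both array tasks have left the queue. The qstat and sleep parameters exist only so the loop can be exercised without a scheduler.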
If all jobs have finished, as shown by qstat in the previous step, run checkchmjob.py again to generate jobs that finish any failed CHM jobs or to generate merge jobs.
/home/<USER>/miniconda2/bin/checkchmjob.py "/data/<USER>/testchm/chmrun" --submit
If all CHM jobs finished, the output from the above command will be similar to this:
Analyzing job. This may take a minute...
chmutil version: 0.8.0
Tiles: 512x512 with 0x0 overlap
Disable histogram equalization in CHM: True
Tasks: 50 tiles per task, 1 tasks(s) per node
Trained CHM model: /data/<USER>/testchm/model
CHM binary: /data/<USER>/chm_s22.img
CHM tasks: 100% complete (2 of 2 completed)
Merge tasks: 0% complete (0 of 2 completed)
Run this:
cd "/data/<USER>/testchm/chmrun";qsub -t 1-2 runmerge.rocce
Run the command output by the previous step:
cd "/data/<USER>/testchm/chmrun";qsub -t 1-2 runmerge.rocce
Just like before, use qstat to wait for job completion. Once done, run checkchmjob.py again, as in the earlier step, to make sure all steps have completed.
Output of checkchmjob.py with --submit flag when all jobs have completed:
chmutil version: 0.8.0
Tiles: 512x512 with 0x0 overlap
Disable histogram equalization in CHM: True
Tasks: 50 tiles per task, 1 tasks(s) per node
Trained CHM model: /data/<USER>/testchm/model
CHM binary: /data/<USER>/chm_s22.img
CHM tasks: 100% complete (2 of 2 completed)
Merge tasks: 100% complete (2 of 2 completed)
All jobs completed. Have a nice day!
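The progress lines in this output are regular enough to parse in a script that decides whether to resubmit. This sketch assumes the "CHM tasks:" and "Merge tasks:" lines always have the form shown in the samples on this page; parse_progress and all_done are hypothetical helpers, and the format may differ in other chmutil versions.

```python
import re

# Matches lines such as "CHM tasks: 100% complete (2 of 2 completed)".
PROGRESS = re.compile(
    r"^(CHM|Merge) tasks: \d+% complete \((\d+) of (\d+) completed\)",
    re.MULTILINE,
)


def parse_progress(output):
    """Return {"CHM": (done, total), "Merge": (done, total)} from the output."""
    return {name: (int(done), int(total)) for name, done, total in PROGRESS.findall(output)}


def all_done(output):
    """True only if progress lines were found and every phase is complete."""
    progress = parse_progress(output)
    return bool(progress) and all(done == total for done, total in progress.values())
```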
Upon a successful run, the final results will be located under chmrun/chmrun/probmaps:
/data/<USER>/testchm/chmrun/chmrun/probmaps
Example:
ls /data/<USER>/testchm/chmrun/chmrun/probmaps
bubbles_gray2.png bubbles_gray.png
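As a final sanity check, the probability maps can be compared against the input images. The sketch below assumes each probability map carries the same file name as its input image, which is what the listing above shows for this example; missing_probmaps is a hypothetical helper.

```python
from pathlib import Path


def missing_probmaps(images_dir, probmaps_dir):
    """Return sorted names of input images that have no probability map."""
    images = {p.name for p in Path(images_dir).glob("*.png")}
    produced = {p.name for p in Path(probmaps_dir).glob("*.png")}
    return sorted(images - produced)
```

For the run above, missing_probmaps("/data/<USER>/testchm/images", "/data/<USER>/testchm/chmrun/chmrun/probmaps") should return an empty list.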