Segmentation fault #3

mhalber · 2019-02-18T20:38:31Z

Hi,
Thank you for providing this code!
I have successfully run the script to prepare the ScanNet data. However, when I attempt to run training, I hit a segfault.

The console output before crash:

keyname=instance_normal_augment_2 task=train started
the number of images val 20
the number of images train 1201
the number of images 1201

By adding print statements, I have traced the crash to the function forward(self, coords, faces, colors, instances) in models/instance.py, at line 199.

Python, gcc, torch, cuda versions:
Python - 3.7.2
torch - 1.0.0
cuda - 9.0.176
I am running the code on a system with a Tesla K40c with 12 GB of memory.

I'd greatly appreciate help in trying to figure out what is going wrong.

Thanks!


chenliu-wustl · 2019-02-18T22:29:06Z

Could you please check the value range of all_coords (all_coords.min(0) and all_coords.max(0))? all_coords should have a shape of Nx4. all_coords.min(0)[:3] should be greater than 0, all_coords.max(0)[:3] should be smaller than 4096, and all_coords.min(0)[3] and all_coords.max(0)[3] should both equal 0.
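
For anyone who wants to script this check, here is a minimal sketch. The helper name check_coords and the limit parameter are hypothetical, not part of the repository. The thread does not say whether all_coords is a torch tensor or a NumPy array, so the sketch converts via tolist() and uses plain Python. The comparisons are strict, as worded above; whether a coordinate of exactly 0 is acceptable is not stated here.

```python
def check_coords(all_coords, limit=4096):
    """Return a list of problems found in an Nx4 coordinate array (empty if none).

    Hypothetical helper: accepts a torch tensor, a NumPy array, or nested lists.
    """
    # Tensors and arrays both expose tolist(); plain lists pass through.
    rows = all_coords.tolist() if hasattr(all_coords, "tolist") else list(all_coords)
    if not rows or any(len(row) != 4 for row in rows):
        return ["expected a non-empty array of shape Nx4"]

    # Column-wise minimum and maximum, the equivalent of min(0) and max(0).
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]

    problems = []
    for axis in range(3):
        if mins[axis] <= 0:
            problems.append("axis %d: min %s is not greater than 0" % (axis, mins[axis]))
        if maxs[axis] >= limit:
            problems.append("axis %d: max %s is not smaller than %d" % (axis, maxs[axis], limit))
    # The fourth column is expected to be 0 everywhere.
    if mins[3] != 0 or maxs[3] != 0:
        problems.append("column 3: expected all zeros, got range [%s, %s]" % (mins[3], maxs[3]))
    return problems
```

Calling check_coords(all_coords) just before the forward pass and printing the result would show which of the conditions, if any, is violated.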

mhalber · 2019-02-19T15:42:29Z

Hi - thank you for your reply.

It turns out the fault was on my side. I think the issue was caused by a Python version mismatch: the SparseConvNet GitHub page mentions Python 3.6.8, so I switched to that version. Additionally, I noticed a mismatch on my computer between the nvcc version and the CUDA version that torch uses.

After these two changes, the network seems to be training without issues.
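
A quick way to spot this kind of mismatch is to print the relevant versions side by side. The following is a rough sketch, not something from the repository: the function name collect_versions is hypothetical, and it assumes nvcc is on the PATH if the CUDA toolkit is installed. It only reports versions and does not decide whether they are compatible.

```python
import platform
import shutil
import subprocess


def collect_versions():
    """Gather the Python, torch, torch CUDA, and nvcc versions (None if unavailable)."""
    info = {"python": platform.python_version(), "torch": None, "torch_cuda": None, "nvcc": None}

    try:
        import torch
        info["torch"] = torch.__version__
        # CUDA version torch was built against (None for CPU-only builds).
        info["torch_cuda"] = torch.version.cuda
    except ImportError:
        pass

    nvcc = shutil.which("nvcc")
    if nvcc is not None:
        # stdout=PIPE and universal_newlines keep this compatible with Python 3.6.
        result = subprocess.run(
            [nvcc, "--version"], stdout=subprocess.PIPE, universal_newlines=True
        )
        # Keep the line that names the toolkit release, if present.
        info["nvcc"] = next(
            (line.strip() for line in result.stdout.splitlines() if "release" in line),
            None,
        )
    return info


if __name__ == "__main__":
    for name, value in collect_versions().items():
        print("%-10s %s" % (name, value))
```

If the release reported by nvcc differs from torch_cuda, compiled extensions such as SparseConvNet may be built against a different toolkit than the one torch uses.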

I think it would be helpful if README.md mentioned the required CUDA and Python versions, since without the SparseConvNet page I would have been lost.

Anyway, thanks again for the help and I will close the issue.

mhalber closed this as completed Feb 19, 2019
