This project is modified from the official implementation of EPro-PnP (End-to-End Probabilistic Perspective-n-Points) for 6DoF object pose estimation. [epro-pnp]
The model is trained entirely on a synthetic dataset in the BOP dataset format.
The inference pipeline has two stages: first, a YOLOv8 detection model crops objects from the original images; then EPro-PnP-6DoF estimates object poses from the cropped images.
The dataset is rendered with Blender, and the object models are scanned with Luma, a NeRF-based 3D scanner app.
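The detection-then-crop step of the first stage can be sketched as follows. This is an illustrative sketch, not the repository's actual code; the `pad` margin and the `crop_object` helper are assumptions:

```python
import numpy as np

def crop_object(img: np.ndarray, box: tuple, pad: float = 0.1) -> np.ndarray:
    """Crop a detected object from an image, expanding the box by a relative margin."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    x1 = max(0, int(x1 - pad * w))
    y1 = max(0, int(y1 - pad * h))
    x2 = min(img.shape[1], int(x2 + pad * w))
    y2 = min(img.shape[0], int(y2 + pad * h))
    return img[y1:y2, x1:x2]

# With ultralytics YOLOv8 as the detection stage (checkpoint path assumed):
# from ultralytics import YOLO
# det = YOLO("checkpoints/yolov8.pt")
# boxes = det(img)[0].boxes.xyxy.cpu().numpy()
# crops = [crop_object(img, tuple(b)) for b in boxes]
```

The cropped patches are then fed to the EPro-PnP-6DoF pose head.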
Example image:
The code has been tested in the environment described as follows:
- Linux (tested on Ubuntu 22.04)
- Python 3.10
- PyTorch 2.0.0
An example script for installing the python dependencies under CUDA 11.8:
# Create conda environment
conda create -y -n epropnp python=3.10
conda activate epropnp
# Install pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# Install other dependencies
pip install opencv-python==4.5.1.48 pyro-ppl==1.4.0 PyYAML==5.4.1 matplotlib termcolor plyfile easydict scipy progress tensorboardx ultralytics
- Install dependencies.
- Download the dataset and model checkpoints:
  - dataset: unzip and place it in the dataset dir.
  - epro-pnp, yolov8: place all checkpoints in the checkpoints dir.
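In the BOP format, each scene directory carries per-image ground truth in `scene_gt.json` (rotation `cam_R_m2c` as a row-major 3x3 matrix, translation `cam_t_m2c` in millimeters) and camera intrinsics in `scene_camera.json`. A minimal sketch of turning one entry into a 4x4 model-to-camera pose (the sample values below are made up):

```python
import json
import numpy as np

# Sample scene_gt.json content for image "0" (values are illustrative).
scene_gt = json.loads("""
{"0": [{"obj_id": 1,
        "cam_R_m2c": [1, 0, 0, 0, 1, 0, 0, 0, 1],
        "cam_t_m2c": [0.0, 0.0, 500.0]}]}
""")

def gt_to_pose(entry: dict) -> np.ndarray:
    """Convert a BOP ground-truth entry to a 4x4 model-to-camera matrix in meters."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(entry["cam_R_m2c"]).reshape(3, 3)
    T[:3, 3] = np.asarray(entry["cam_t_m2c"]) / 1000.0  # BOP stores mm; convert to m
    return T

pose = gt_to_pose(scene_gt["0"][0])
print(pose[2, 3])  # → 0.5
```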
Training:
python train.py --cfg exps_cfg/bop/bop_no_init.yaml
Inference:
./scripts/inference.sh ../imgs/test.jpeg
Pose visualization:
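A pose can be visualized by projecting the object's coordinate axes into the image with the pinhole model. A minimal numpy sketch; the intrinsics `K`, pose `R`, `t`, and axis length are assumed values, not outputs of this repo:

```python
import numpy as np

def project_points(X: np.ndarray, R: np.ndarray, t: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project Nx3 object-frame points to pixel coordinates (no lens distortion)."""
    Xc = X @ R.T + t               # object frame -> camera frame
    uv = Xc @ K.T                  # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]  # perspective divide

# Axis endpoints: origin plus 5 cm along x, y, z.
axes = np.array([[0, 0, 0], [0.05, 0, 0], [0, 0.05, 0], [0, 0, 0.05]])
K = np.array([[600., 0., 320.], [0., 600., 240.], [0., 0., 1.]])  # assumed intrinsics
R, t = np.eye(3), np.array([0., 0., 0.5])                         # assumed pose
pix = project_points(axes, R, t, K)
print(pix[0])  # origin lands on the principal point → [320. 240.]
```

In practice, OpenCV's `cv2.drawFrameAxes(img, K, dist, rvec, tvec, length)` draws the same overlay with distortion handled.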