Machine learning and image processing tools for particle classification in the CONNIE experiment using Skipper-CCD sensor data.
This repository contains tools and models developed to classify particle tracks captured in the CONNIE (Coherent Neutrino-Nucleus Interaction Experiment) detector. The classification focuses on identifying events such as muons, electrons, blobs, diffusion hits, alphas, and others, using both image-based and feature-based machine learning techniques.
The main objective is to support the detection and study of coherent elastic neutrino-nucleus scattering (CEνNS) by improving the signal-to-background separation in CONNIE’s Skipper-CCD images.
- 📦 Event cropping and preprocessing from Skipper-CCD energy-calibrated FITS images
- 🧠 Convolutional Neural Network (CNN) classifier (ResNet-18) trained on labeled and augmented event images
- 🌲 Feature-based models (Random Forest, XGBoost) using event metadata extracted from ROOT catalogs
- 🧪 Cross-validation, grid search, and accuracy benchmarking
- 🖼️ GUI for human labeling, with annotation redundancy and quality assurance
- 📊 Evaluation tools and confusion matrix reporting
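As a rough illustration of the evaluation item above, the sketch below builds a confusion matrix and derives accuracy and per-class recall from it. It is not the repository's evaluation code: the function names are hypothetical, the class list is only inferred from the event types mentioned in the description, and the labels in the usage note are made up.

```python
# Class names inferred from the event types listed in this README; the exact
# label set used by the repository's models is an assumption.
CLASSES = ["muon", "electron", "blob", "diffusion_hit", "alpha", "other"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Return matrix[i][j] = number of events of true class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of events on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix, classes=CLASSES):
    """Recall per class: diagonal count divided by the row total (None if the class is absent)."""
    recalls = {}
    for i, name in enumerate(classes):
        row_total = sum(matrix[i])
        recalls[name] = matrix[i][i] / row_total if row_total else None
    return recalls
```

With toy labels such as `y_true = ["muon", "muon", "electron"]` and `y_pred = ["muon", "electron", "electron"]`, the muon row records one correct and one misclassified event. Per-class recall is worth reporting next to overall accuracy because rare classes can be misclassified often without moving the overall number much.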
├── data/ # Event images and raw data
├── database/ # Database-related files
├── models/ # Trained model checkpoints
├── scripts/ # Image processing and training scripts
├── gui/ # Annotation GUI tools
├── notebooks/ # Jupyter notebooks for exploration and analysis
├── results/ # Evaluation results and figures
└── README.md
- Python 3.8+
- PyTorch
- OpenCV
- NumPy, Matplotlib, Scikit-learn
- XGBoost, ROOT, uproot
- Tkinter (for GUI annotation tool)
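Before running the scripts, it can help to confirm which of the packages above are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `check_dependencies` is hypothetical and not part of this repository. The import names (`cv2` for OpenCV, `sklearn` for Scikit-learn) differ from the package names listed above.

```python
import importlib.util

# Import names for the packages listed above. Several differ from the
# package names: OpenCV imports as cv2, Scikit-learn as sklearn.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "Scikit-learn": "sklearn",
    "XGBoost": "xgboost",
    "ROOT": "ROOT",
    "uproot": "uproot",
    "Tkinter": "tkinter",
}


def check_dependencies(modules=REQUIRED_MODULES):
    """Map each package name to True if its module can be located, without importing it."""
    status = {}
    for package, module in modules.items():
        try:
            status[package] = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            status[package] = False
    return status


if __name__ == "__main__":
    for package, found in check_dependencies().items():
        print(f"{package:<14} {'ok' if found else 'MISSING'}")
```

The check only locates each module and does not import it, so it stays fast and does not fail on packages that are installed but broken.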
git clone https://github.com/yourusername/connie-particle-classifier.git
cd connie-particle-classifier
pip install -r requirements.txt
Alternatively, set up a virtual environment with the provided script:
git clone https://github.com/yourusername/connie-particle-classifier.git
cd connie-particle-classifier
bash create_venv.sh
source virtualenv/bin/activate
python scripts/extract_events.py --input_folder raw_images/ --output_folder data/events/
python scripts/train_cnn_credo.py
python gui/data_label.py
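The internals of `scripts/extract_events.py` are not described here, so the following is only a sketch of one common way to crop events from an energy-calibrated image: threshold the pixels, group neighboring above-threshold pixels into clusters, and cut out each cluster's bounding box. The function names, the 8-connectivity choice, and the `threshold` and `margin` parameters are all assumptions made for illustration. Plain nested lists stand in for the image so that the example has no dependencies; reading the FITS file is omitted.

```python
from collections import deque


def find_events(image, threshold):
    """Group above-threshold pixels into 8-connected clusters.

    `image` is a 2D list of calibrated pixel energies (rows of numbers).
    Returns a list of clusters, each a list of (row, col) tuples.
    """
    n_rows = len(image)
    n_cols = len(image[0]) if n_rows else 0
    seen = [[False] * n_cols for _ in range(n_rows)]
    clusters = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            cluster = []
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < n_rows and 0 <= nx < n_cols
                                and not seen[ny][nx]
                                and image[ny][nx] > threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            clusters.append(cluster)
    return clusters


def crop_event(image, cluster, margin=1):
    """Cut out the cluster's bounding box, padded by `margin` pixels and clipped to the image."""
    rows = [p[0] for p in cluster]
    cols = [p[1] for p in cluster]
    top = max(min(rows) - margin, 0)
    bottom = min(max(rows) + margin, len(image) - 1)
    left = max(min(cols) - margin, 0)
    right = min(max(cols) + margin, len(image[0]) - 1)
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

A typical call would be `crops = [crop_event(image, c) for c in find_events(image, threshold)]`. The margin keeps a few pixels of context around each track, which is useful when the crops are later fed to an image classifier.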
- Raw Data: Skipper-CCD images in FITS format from CONNIE Run 125
- Labeled Subset: Annotated events (PNG and ROOT) from Runs 118 and 125
- External: CREDO dataset used for transfer learning experimentation
Note: Due to collaboration restrictions, some datasets may not be publicly available.
- CNN classification accuracy on test set (CREDO dataset): ~95%
- Feature-based classifier accuracy (XGBoost, Random Forest) on the CONNIE dataset: ~88–89%
- Redundant manual annotation improves label reliability
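The redundancy scheme used by the annotation GUI is not specified in this README. One simple way to combine redundant labels, shown here purely as an assumption, is a majority vote with a minimum agreement threshold; the function name `consensus_label` and the `min_agreement` parameter are hypothetical.

```python
from collections import Counter


def consensus_label(votes, min_agreement=0.5):
    """Combine several annotators' labels for one event.

    Returns (label, agreement) when the most common label is chosen by more
    than `min_agreement` of the annotators, otherwise (None, agreement).
    """
    if not votes:
        return None, 0.0
    label, top = Counter(votes).most_common(1)[0]
    agreement = top / len(votes)
    if agreement > min_agreement:
        return label, agreement
    return None, agreement
```

For example, `consensus_label(["muon", "muon", "electron"])` selects `"muon"` with two of three votes, while a one-to-one split returns `None` so that the event can be flagged for review instead of entering the training set with an unreliable label.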
This project is licensed under the MIT License. See LICENSE for details.
Developed as part of research with the CONNIE Collaboration.
Special thanks to collaborators from UNICAMP, UFRJ, and associated institutions.
Sara Mirthis Dantas dos Santos
Dept. of Computer Engineering and Automation (DCA)
Universidade Estadual de Campinas (UNICAMP)
s224018@dac.unicamp.br