The website developed for this work is available at Exploring Fragrance Space using Generative Models.
The paper is available on arXiv at Navigating the Fragrance space Via Graph Generative Models And Predicting Odors.
To create the conda environment:
conda create --name my_env --file requirements.txt
The folder data contains the results of different statistical tests on the distribution of generated molecules, organized by model in sub-folders named ARGA, ARGVA, diffusion, GAE, Transformer, and VGAE. Each of these model folders includes GDB Criteria, KS_test, and rule of three.
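As a rough illustration of two of the checks named above, the sketch below computes a two-sample Kolmogorov-Smirnov statistic and applies a rule-of-three filter. It is not the repository's code: the function names, the example values, and the thresholds (the commonly quoted molecular weight < 300, logP <= 3, at most 3 hydrogen-bond donors, and at most 3 acceptors) are assumptions made for illustration, and the notebooks in this repository may define the tests differently. In practice a library routine such as scipy.stats.ks_2samp would normally be used; the pure-Python version here only shows what the statistic measures.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The empirical CDFs are step functions, so the gap only changes at data points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


def passes_rule_of_three(mol_weight, logp, h_donors, h_acceptors):
    """Assumed thresholds: MW < 300, logP <= 3, donors <= 3, acceptors <= 3."""
    return mol_weight < 300 and logp <= 3 and h_donors <= 3 and h_acceptors <= 3


if __name__ == "__main__":
    # Hypothetical molecular weights for training and generated molecules.
    training_mw = [120.0, 150.0, 180.0, 210.0]
    generated_mw = [130.0, 160.0, 190.0, 400.0]
    print("KS statistic:", ks_statistic(training_mw, generated_mw))
    print("Passes rule of three:", passes_rule_of_three(150.2, 2.1, 1, 2))
```

A statistic near 0 suggests the generated property distribution closely follows the training distribution, while a value near 1 indicates almost no overlap.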
The folder named curated dataset in data holds the training data for the models, which is available in two formats: curated_GS_LF_merged_cleaned.csv (CSV) and cleaned_frag_pyg_dataset.pth (PyTorch).
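A quick way to inspect the CSV version of the training data is sketched below, using only the Python standard library. The helper name summarize_csv and the relative path are assumptions for illustration (adjust the path to your checkout), and no particular column names are assumed, since the sketch simply reports whatever the header row contains.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)  # counts data rows, not the header
        return list(reader.fieldnames or []), row_count


if __name__ == "__main__":
    # Assumed location of the curated dataset relative to the repository root.
    csv_path = Path("data") / "curated dataset" / "curated_GS_LF_merged_cleaned.csv"
    if csv_path.exists():
        columns, n_rows = summarize_csv(csv_path)
        print(f"{n_rows} rows with columns: {columns}")
    else:
        print(f"File not found: {csv_path}")
```

The .pth version would typically be read with torch.load instead, which requires PyTorch and PyTorch Geometric to be installed and is therefore not shown here.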
The PyTorch dataset mentioned above is created using Pyg_data_creator_for_cleaned.ipynb.
Example notebooks for model training and fine-tuning are available in the models folder.
Unlike traditional approaches, we not only generate molecules but also predict the odor likeliness and classify probable odor labels. We show that odor likeliness is a function of physicochemical features. Additionally, we identify the most relevant features to construct an odor likeliness equation and leverage SHAP (SHapley Additive exPlanations) to demonstrate the interpretability of the work.
All the figures used in the paper are available in the figures folder.
Mrityunjay Sharma, CSIR-CSIO, Chandigarh, India
Sarabeshwar Balaji, Indian Institute of Science Education and Research Bhopal (IISERB), India
Pinaki Saha, University of Hertfordshire, UH Biocomputation Group, United Kingdom
Ritesh Kumar, CSIR-CSIO, Chandigarh, India
To cite this work, please use this BibTeX entry:
@article{sharma2025navigating,
title={Navigating the Fragrance Space Using Graph Generative Models and Predicting Odors},
author={Sharma, Mrityunjay and Balaji, Sarabeshwar and Saha, Pinaki and Kumar, Ritesh},
journal={Journal of Chemical Information and Modeling},
year={2025},
publisher={ACS Publications}
}