This repository presents a modular, research-level pipeline for simulating, compressing, and performing rigorous likelihood-based inference on high-dimensional Gaussian Random Fields (GRFs). The approach leverages deep learning techniques, namely autoencoders and normalizing flows (RealNVP), to enable scientific parameter estimation in cosmology and beyond.
Modern cosmological analysis often involves high-dimensional data (such as cosmic fields or sky maps) where direct likelihood evaluation is intractable. This project demonstrates how to:
- Generate synthetic cosmological fields with controlled parameters,
- Compress them using deep convolutional autoencoders into informative latent representations,
- Learn flexible probabilistic models (normalizing flows) for these latent spaces,
- Perform explicit, likelihood-based inference on the original physical parameters.
The resulting pipeline not only enables parameter recovery from simulated data but also provides a framework for deploying these techniques on real astronomical observations or other scientific fields.
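The first step above, generating GRFs with controlled parameters, can be sketched in a few lines of numpy. This is a minimal illustration, not the notebook's exact code: the grid size and the power-law parametrization P(k) = A·k^slope are assumptions chosen for clarity.

```python
import numpy as np

def generate_grf(n=64, amplitude=1.0, slope=-2.0, seed=0):
    """Sample a 2-D Gaussian random field with power spectrum P(k) = A * k**slope.

    White noise is drawn in Fourier space and filtered by sqrt(P(k)), so the
    returned field has the requested second-order statistics by construction.
    """
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(n)
    ky = np.fft.fftfreq(n)
    k = np.sqrt(kx[:, None] ** 2 + ky[None, :] ** 2)
    k[0, 0] = np.inf                      # suppress the zero (mean) mode
    power = amplitude * k ** slope        # parametric power spectrum
    noise = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    field = np.fft.ifft2(noise * np.sqrt(power)).real
    return field

field = generate_grf(n=64, amplitude=1.0, slope=-2.0)
```

Varying `amplitude` and `slope` over a grid yields labeled training data for the autoencoder and, later, the conditioning parameters for the flow.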
`new_autoencoder.ipynb`
- End-to-end notebook for synthetic GRF data generation, autoencoder model training, validation, and visualization of reconstructions.
- Modular code blocks for adjusting the model architecture and dataset parameters.

`NVP_Flow:_conditional.ipynb`
- Implements conditional RealNVP flows in the autoencoder latent space.
- Enables likelihood evaluation and parameter inference as a function of cosmological parameters.
- Includes visualization routines for likelihood surfaces and parameter posteriors.

`NVP_flow.ipynb`
- Standalone demo of RealNVP on toy problems (e.g., two moons) to build intuition before applying the method to scientific data.

`grf_autoencoder.pth`
- Pretrained model weights for the autoencoder, ready for immediate use or fine-tuning.

`figs/`
- Collection of sample figures: reconstructions, loss curves, flow samples, and likelihood maps.
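To give a sense of what the autoencoder in `new_autoencoder.ipynb` might look like, here is a minimal convolutional sketch in PyTorch. The depth, channel counts, and `latent_dim` are hypothetical placeholders, not the notebook's actual architecture:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Minimal convolutional autoencoder for 64x64 single-channel fields.

    Illustrative stand-in only: depth, channel widths, and the latent
    dimension are assumptions, not the repository's exact settings.
    """
    def __init__(self, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, latent_dim),                   # bottleneck
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),              # 32 -> 64
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = ConvAutoencoder(latent_dim=16)
x = torch.randn(4, 1, 64, 64)   # a batch of 4 synthetic fields
recon, z = model(x)
```

Training then minimizes a reconstruction loss (e.g. `nn.MSELoss()` between `recon` and `x`), and the latent codes `z` become the inputs to the conditional flow.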
- **Synthetic Data with Physical Motivation:** The code generates GRFs from parametric power spectra, mimicking real cosmological field statistics. This allows controlled benchmarking of inference pipelines.
- **Modular Deep Learning Components:** Easily swap or extend model architectures. Autoencoders are built for flexibility (depth, bottleneck size, activation), supporting rapid experimentation.
- **Explicit Likelihood Evaluation:** Conditional normalizing flows provide tractable likelihoods in compressed spaces, enabling rigorous, Bayesian-like parameter estimation.
- **Visualization & Diagnostics:** Notebooks include clear visual outputs: sample fields, reconstructions, latent-space structure, flow-generated samples, and likelihood surfaces over parameter space.
- **Fully Reproducible & Extensible:** Synthetic data generation is integrated, so no external datasets are required. All hyperparameters and random seeds can be controlled for reproducibility.
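The tractable likelihoods come from RealNVP's invertible affine coupling layers, whose Jacobian determinant is cheap to compute. The sketch below shows one conditional coupling layer in plain numpy, with linear scale/shift networks chosen purely for brevity (the repository's flow uses learned neural networks):

```python
import numpy as np

def coupling_forward(x, cond, Ws, Wt):
    """One conditional affine coupling layer (RealNVP-style).

    x    : (n, 2) latent points; the first coordinate passes through unchanged,
           the second is scaled and shifted by functions of (x1, cond).
    cond : (n, 1) conditioning parameter (e.g. a cosmological parameter).
    Ws, Wt : weights of the (here linear, illustrative) scale/shift maps.
    Returns the transformed points and log|det J| per point.
    """
    x1, x2 = x[:, :1], x[:, 1:]
    h = np.concatenate([x1, cond], axis=1)   # the condition enters the nets
    s = np.tanh(h @ Ws)                      # bounded log-scale for stability
    t = h @ Wt
    y2 = x2 * np.exp(s) + t
    log_det = s.sum(axis=1)                  # triangular Jacobian: sum of log-scales
    return np.concatenate([x1, y2], axis=1), log_det

def coupling_inverse(y, cond, Ws, Wt):
    """Exact inverse of the coupling above (what makes the density tractable)."""
    y1, y2 = y[:, :1], y[:, 1:]
    h = np.concatenate([y1, cond], axis=1)
    s = np.tanh(h @ Ws)
    t = h @ Wt
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2], axis=1)

rng = np.random.default_rng(0)
Ws, Wt = rng.normal(size=(2, 1)), rng.normal(size=(2, 1))
x = rng.normal(size=(5, 2))
cond = rng.normal(size=(5, 1))
y, log_det = coupling_forward(x, cond, Ws, Wt)
x_back = coupling_inverse(y, cond, Ws, Wt)   # recovers x exactly
```

Stacking such layers (with the roles of the two coordinates alternating) and summing their `log_det` terms gives the change-of-variables log-likelihood used for inference.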
- Clone the repository and install dependencies:
  `pip install numpy scipy matplotlib torch scikit-learn tqdm`
- Run `new_autoencoder.ipynb` to:
  - Generate synthetic GRF data,
  - Train the convolutional autoencoder,
  - Visualize reconstructions and training curves.
- Run `NVP_Flow:_conditional.ipynb` to:
  - Train a conditional normalizing flow in the latent space,
  - Evaluate and visualize likelihood surfaces over cosmological parameter space.
- (Optional) Run `NVP_flow.ipynb` to experiment with RealNVP on toy datasets and build intuition.
- Browse the `figs/` directory to see generated figures and outcomes.

The pipeline is designed to run out of the box; if data files are missing, they are generated automatically.
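The likelihood-surface step in the second notebook amounts to evaluating the trained flow's conditional log-density of the observed latent codes on a parameter grid and locating its maximum. The sketch below substitutes a toy Gaussian for the trained flow (and a 1-D parameter for the full cosmological parameter space); names like `toy_flow_log_prob` are hypothetical:

```python
import numpy as np

def toy_flow_log_prob(z, theta):
    """Stand-in for a trained conditional flow's log-density: a unit-variance
    Gaussian whose mean depends on the parameter theta (purely illustrative)."""
    return -0.5 * np.sum((z - theta) ** 2 + np.log(2 * np.pi), axis=-1)

# Latent codes for one "observation" (in the real pipeline these would come
# from the autoencoder's encoder applied to the data).
rng = np.random.default_rng(1)
true_theta = 0.7
z_obs = rng.normal(loc=true_theta, scale=1.0, size=(32, 4))

# Scan the parameter grid, summing log-likelihoods over the latent samples.
theta_grid = np.linspace(-2.0, 2.0, 201)
log_like = np.array([toy_flow_log_prob(z_obs, th).sum() for th in theta_grid])
theta_hat = theta_grid[np.argmax(log_like)]   # maximum-likelihood estimate
```

Plotting `log_like` against `theta_grid` (or a 2-D grid for two parameters) reproduces the kind of likelihood surface shown in `figs/`.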
├── figs/
│   └── [Generated figures: reconstruction plots, likelihood maps, etc.]
├── grf_autoencoder.pth
├── new_autoencoder.ipynb
├── NVP_Flow:_conditional.ipynb
├── NVP_flow.ipynb
└── requirements.txt
- **More expressive generative models:** Add deeper or alternative architectures (e.g., ResNet-based autoencoders, Glow/MAF/Neural Spline Flows).
- **Uncertainty quantification:** Overlay credible contours on likelihood maps; integrate full Bayesian posteriors.
- **Application to real datasets:** Adapt the data pipeline for telescope or simulation outputs; add data augmentation or physical systematics.
- **Experiment tracking & reproducibility:** Integrate MLflow or Weights & Biases for experiment management and hyperparameter sweeps.
- **Comprehensive testing:** Add unit tests for all model components and utility functions.
Mohammad Farhan Hassan
hassan.farhan7777@gmail.com
This project demonstrates the intersection of deep generative modeling and scientific inference, showcasing modern ML techniques applied to synthetic cosmological data, with extensibility to many domains in science and engineering.
Feel free to fork this repository or open issues for questions, suggestions, or collaborations!