VAE notebooks (copilot-generated) #10

Draft
wants to merge 4 commits into
base: main
3 changes: 2 additions & 1 deletion .gitignore
@@ -21,4 +21,5 @@ docs/29_algorithm_validation/ideas.ipynb
docs/29_algorithm_validation/solution for exercise - metrics to investigate segmentation results.ipynb
docs/22_feature_extraction/blobs_analysis.csv
data/S-BIAD634
docs/71_fine_tuning_hf/haesleinhuepf
docs/71_fine_tuning_hf/haesleinhuepf
docs/90_variational_auto_encoders/data
416 changes: 416 additions & 0 deletions docs/90_variational_auto_encoders/01_intro_to_vae.ipynb

Large diffs are not rendered by default.

215 changes: 215 additions & 0 deletions docs/90_variational_auto_encoders/02_vae_architecture.ipynb
@@ -0,0 +1,215 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# VAE Architecture\n",
"\n",
"This notebook walks through the architecture of a Variational Auto-Encoder (VAE): the encoder, the decoder, the sampling step that connects them, and the loss function. Each component is implemented in PyTorch, and the assembled model is then trained on MNIST."
]
},
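{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Encoder\n",
"\n",
"The encoder is a neural network that maps the input data to a latent space. Rather than a single point, it outputs the parameters of a probability distribution over the latent space, typically the mean and log variance of a Gaussian with independent dimensions. Predicting the log variance instead of the variance leaves the network output unconstrained, because exponentiating any real number yields a positive variance."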
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"\n",
"class Encoder(nn.Module):\n",
"    def __init__(self, input_dim, hidden_dim, latent_dim):\n",
"        super().__init__()\n",
"        self.fc1 = nn.Linear(input_dim, hidden_dim)\n",
"        # Two separate heads: one for the mean, one for the log variance\n",
"        self.fc2_mean = nn.Linear(hidden_dim, latent_dim)\n",
"        self.fc2_log_var = nn.Linear(hidden_dim, latent_dim)\n",
"        self.relu = nn.ReLU()\n",
"\n",
"    def forward(self, x):\n",
"        h = self.relu(self.fc1(x))\n",
"        mean = self.fc2_mean(h)\n",
"        log_var = self.fc2_log_var(h)\n",
"        return mean, log_var"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Decoder\n",
"\n",
"The decoder is a neural network that takes a sample from the latent distribution and maps it back to the original data space, producing the reconstructed data. The final sigmoid keeps every output value between 0 and 1, which matches pixel intensities scaled to that range."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"class Decoder(nn.Module):\n",
"    def __init__(self, latent_dim, hidden_dim, output_dim):\n",
"        super().__init__()\n",
"        self.fc1 = nn.Linear(latent_dim, hidden_dim)\n",
"        self.fc2 = nn.Linear(hidden_dim, output_dim)\n",
"        self.relu = nn.ReLU()\n",
"        self.sigmoid = nn.Sigmoid()\n",
"\n",
"    def forward(self, z):\n",
"        h = self.relu(self.fc1(z))\n",
"        # Sigmoid squashes each output into (0, 1)\n",
"        x_reconstructed = self.sigmoid(self.fc2(h))\n",
"        return x_reconstructed"
]
},
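{
"cell_type": "markdown",
"metadata": {},
"source": [
"## VAE Model\n",
"\n",
"The VAE model combines the encoder and decoder. Between them sits a sampling step: instead of sampling `z` directly, it computes `z = mean + std * epsilon` with `epsilon` drawn from a standard normal distribution (the reparameterization trick). The randomness is thereby isolated in `epsilon`, so gradients can flow through `mean` and `log_var` during training."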
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"class VAE(nn.Module):\n",
"    def __init__(self, encoder, decoder):\n",
"        super().__init__()\n",
"        self.encoder = encoder\n",
"        self.decoder = decoder\n",
"\n",
"    def forward(self, x):\n",
"        mean, log_var = self.encoder(x)\n",
"        # Reparameterization: std = exp(log_var / 2), z = mean + std * epsilon\n",
"        std = torch.exp(0.5 * log_var)\n",
"        epsilon = torch.randn_like(std)\n",
"        z = mean + std * epsilon\n",
"        x_reconstructed = self.decoder(z)\n",
"        return x_reconstructed, mean, log_var"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loss Function\n",
"\n",
"The loss function for a VAE consists of two terms: the reconstruction loss and the KL divergence. The reconstruction loss measures how well the decoder reconstructs the input from its latent sample; here it is the binary cross-entropy, which requires inputs and reconstructions in the range 0 to 1. The KL divergence measures how far the encoder's distribution for each input is from the prior, here a standard normal distribution. Both terms are summed over the batch."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def vae_loss(x, x_reconstructed, mean, log_var):\n",
"    # Reconstruction term, summed over pixels and batch\n",
"    reconstruction_loss = nn.functional.binary_cross_entropy(x_reconstructed, x, reduction='sum')\n",
"    # Closed-form KL divergence between N(mean, exp(log_var)) and N(0, 1)\n",
"    kl_divergence = -0.5 * torch.sum(1 + log_var - mean.pow(2) - log_var.exp())\n",
"    return reconstruction_loss + kl_divergence"
]
},
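{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training the VAE\n",
"\n",
"We train the VAE on the MNIST dataset of handwritten digits. Each 28 x 28 image is flattened into a vector of 784 values, and the loss printed after each epoch is the summed loss divided by the number of training samples."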
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1, Loss: 182.64177789713543\n",
"Epoch 2, Loss: 164.48488276367186\n",
"Epoch 3, Loss: 161.2271118815104\n",
"Epoch 4, Loss: 159.09910853678386\n",
"Epoch 5, Loss: 157.53306800944011\n",
"Epoch 6, Loss: 156.28730290527344\n",
"Epoch 7, Loss: 155.26126284179688\n",
"Epoch 8, Loss: 154.44241954752604\n",
"Epoch 9, Loss: 153.73177485351562\n",
"Epoch 10, Loss: 153.18850033365885\n"
]
}
],
"source": [
"from torchvision import datasets, transforms\n",
"import torch.optim as optim\n",
"\n",
"# Load the MNIST dataset; ToTensor scales pixels to [0, 1], the Lambda flattens each image\n",
"transform = transforms.Compose([transforms.ToTensor(), transforms.Lambda(lambda x: x.view(-1))])\n",
"train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)\n",
"test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)\n",
"train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)\n",
"test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)\n",
"\n",
"# Define the VAE model\n",
"input_dim = 28 * 28\n",
"hidden_dim = 256\n",
"latent_dim = 2\n",
"encoder = Encoder(input_dim, hidden_dim, latent_dim)\n",
"decoder = Decoder(latent_dim, hidden_dim, input_dim)\n",
"vae = VAE(encoder, decoder)\n",
"\n",
"# Define the optimizer\n",
"optimizer = optim.Adam(vae.parameters(), lr=1e-3)\n",
"\n",
"# Train the model\n",
"num_epochs = 10\n",
"for epoch in range(num_epochs):\n",
"    vae.train()\n",
"    train_loss = 0\n",
"    for x, _ in train_loader:\n",
"        optimizer.zero_grad()\n",
"        x_reconstructed, mean, log_var = vae(x)\n",
"        loss = vae_loss(x, x_reconstructed, mean, log_var)\n",
"        loss.backward()\n",
"        train_loss += loss.item()\n",
"        optimizer.step()\n",
"    # Average loss per training sample over the epoch\n",
"    print(f'Epoch {epoch + 1}, Loss: {train_loss / len(train_loader.dataset)}')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
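
A note on the `kl_divergence` line in `vae_loss`: the expression `-0.5 * (1 + log_var - mean^2 - exp(log_var))` is the closed-form KL divergence between a Gaussian with the given mean and log variance and a standard normal prior, taken per latent dimension. The following is a minimal sketch, not part of the notebook, that checks this formula against a direct numerical integration using only the standard library. The function names `kl_closed_form` and `kl_numeric`, the integration range, and the step count are illustrative assumptions.

```python
import math


def kl_closed_form(mean, log_var):
    """Closed-form KL(N(mean, exp(log_var)) || N(0, 1)) for one latent dimension."""
    return -0.5 * (1.0 + log_var - mean ** 2 - math.exp(log_var))


def kl_numeric(mean, log_var, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of the integral of q(z) * (log q(z) - log p(z)).

    q is N(mean, exp(log_var)) and p is N(0, 1). The integral is truncated to
    mean +/- half_width standard deviations (an assumption of this sketch).
    """
    std = math.exp(0.5 * log_var)
    log_norm = 0.5 * math.log(2.0 * math.pi)
    lower = mean - half_width * std
    dz = 2.0 * half_width * std / steps
    total = 0.0
    for i in range(steps):
        z = lower + (i + 0.5) * dz
        log_q = -0.5 * ((z - mean) / std) ** 2 - math.log(std) - log_norm
        log_p = -0.5 * z ** 2 - log_norm
        total += math.exp(log_q) * (log_q - log_p) * dz
    return total


if __name__ == "__main__":
    # Print both values side by side for a few (mean, log_var) pairs
    for mean, log_var in [(0.0, 0.0), (0.5, -1.0), (-1.5, 0.8)]:
        print(mean, log_var, kl_closed_form(mean, log_var), kl_numeric(mean, log_var))
```

The two values should agree closely for each pair, and the divergence is zero when the encoder output equals the prior (`mean = 0`, `log_var = 0`). In the notebook, the same per-dimension expression is summed over all latent dimensions and all samples in the batch.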