Welcome to Deep Learning from Scratch, a repository where I implement fundamental deep learning architectures using Python, NumPy, PyTorch, and TensorFlow. The project aims to give a deeper understanding of how neural networks work internally, rather than hiding the details behind high-level library calls.
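To give a flavour of what "from scratch" can mean in practice, here is a minimal NumPy sketch of a two-layer network with a hand-written backward pass. It is an illustration only and is not taken from the notebooks in this repository: the function names (`forward`, `loss_and_grads`, `train`), the toy data, the layer sizes, and the learning rate are all assumptions chosen for the example.

```python
import numpy as np


def forward(x, params):
    """Two-layer MLP: tanh hidden layer followed by a linear output layer."""
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    y_hat = h @ W2 + b2
    return h, y_hat


def loss_and_grads(x, y, params):
    """Mean squared error (with a 1/2 factor) and its gradients via backprop."""
    W1, b1, W2, b2 = params
    h, y_hat = forward(x, params)
    n = x.shape[0]
    diff = y_hat - y
    loss = 0.5 * np.sum(diff ** 2) / n

    # Backward pass: apply the chain rule layer by layer.
    d_y = diff / n                  # dL/dy_hat
    dW2 = h.T @ d_y
    db2 = d_y.sum(axis=0)
    d_h = d_y @ W2.T                # gradient flowing into the hidden layer
    d_z = d_h * (1.0 - h ** 2)      # derivative of tanh is 1 - tanh^2
    dW1 = x.T @ d_z
    db1 = d_z.sum(axis=0)
    return loss, [dW1, db1, dW2, db2]


def train(x, y, params, lr=0.05, steps=200):
    """Plain full-batch gradient descent; returns final params and loss history."""
    losses = []
    for _ in range(steps):
        loss, grads = loss_and_grads(x, y, params)
        losses.append(loss)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, losses


# Toy regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 3))
y = np.sin(x.sum(axis=1, keepdims=True))

params = [
    rng.normal(scale=0.5, size=(3, 8)), np.zeros(8),
    rng.normal(scale=0.5, size=(8, 1)), np.zeros(1),
]
params, losses = train(x, y, params)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Writing the backward pass by hand like this makes each gradient explicit, which is the kind of detail that automatic differentiation in PyTorch or TensorFlow normally hides.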
- DNN (1987): Learning Internal Representations by Error Propagation
- CNN (1989): Backpropagation Applied to Handwritten Zip Code Recognition
- LeNet (1998): Gradient-Based Learning Applied to Document Recognition
- AlexNet (2012): ImageNet Classification with Deep Convolutional Neural Networks
- U-Net (2015): Convolutional Networks for Biomedical Image Segmentation
- Weight Decay (1991): A Simple Weight Decay Can Improve Generalization
- ReLU (2011): Deep Sparse Rectifier Neural Networks
- Residuals (2015): Deep Residual Learning for Image Recognition
- Dropout (2014): Preventing Neural Networks from Overfitting
- BatchNorm (2015): Accelerating Deep Network Training
- LayerNorm (2016): Layer Normalization
- GELU (2016): Gaussian Error Linear Units
- Adam (2014): Stochastic Optimization Method
- RNN (1989): Continually Running Fully Recurrent Neural Networks
- LSTM (1997): Long Short-Term Memory
- Learning to Forget (2000): Continual Prediction with LSTM
- Word2Vec (2013): Word Representations in Vector Space
- Phrase2Vec (2013): Distributed Representations of Words and Phrases
- Encoder-Decoder (2014): RNN Encoder-Decoder for Machine Translation
- Seq2Seq (2014): Sequence to Sequence Learning
- Attention (2014): Neural Machine Translation with Alignment
- Mixture of Experts (2017): Sparsely-Gated Neural Networks
- Transformer (2017): Attention Is All You Need
- BERT (2018): Bidirectional Transformers for Language Understanding
- RoBERTa (2019): Robustly Optimized BERT Pretraining
- T5 (2019): Unified Text-to-Text Transformer
- GPT Series:
- LoRA (2021): Low-Rank Adaptation of Large Language Models
- RLHF (2019): Fine-Tuning from Human Preferences
- InstructGPT (2022): Following Instructions with Human Feedback
- Vision Transformer (2020): Image Recognition with Transformers
- ELECTRA (2020): Discriminative Pre-training
- GAN (2014): Generative Adversarial Networks
- VAE (2013): Auto-Encoding Variational Bayes
- VQ-VAE (2017): Neural Discrete Representation Learning
- Diffusion Models:
- CLIP (2021): Visual Models from Natural Language Supervision
- DALL-E (2021-2022): Text-to-Image Generation
- SimCLR (2020): Contrastive Learning of Visual Representations
- Deep Reinforcement Learning (2017): Mastering Chess and Shogi
- Deep Q-Learning (2013): Playing Atari Games
- AlphaGo (2016): Mastering the Game of Go
- AlphaFold (2021): Protein Structure Prediction
- Deep Learning Survey (2015): By LeCun, Bengio, and Hinton
- BigGAN (2018): Large Scale GAN Training
- WaveNet (2016): Generative Model for Raw Audio
- BERTology (2020): Survey of BERT Use Cases
- Scaling Laws for Neural Language Models (2020): Predicting Model Performance
- Chinchilla (2022): Training Compute-Optimal Large Language Models
- Gopher (2022): Scaling Language Models with Massive Compute
- P-Tuning (2021): Prompt Tuning with Soft Prompts
- Prefix-Tuning (2021): Optimizing Continuous Prompts
- AdaLoRA (2023): Adaptive Low-Rank Adaptation
- QLoRA (2023): Efficient Fine-Tuning of Quantized Models
- FlashAttention (2022): Fast and Memory-Efficient Attention
- FlashAttention-2 (2023): Faster Attention Mechanism
- Direct Preference Optimization (DPO) (2023): Aligning Language Models with Human Preferences
- Mixture of Experts (MoE) (2022): Scaling Language Models with Sparse Experts
- GLaM (2021): Efficient Scaling with Mixture of Experts
- Switch Transformers (2022): Scaling to Trillion Parameter Models
- Chain of Thought Prompting (2022): Reasoning with Language Models
- Self-Consistency (2022): Improving Language Model Reasoning
- Tree of Thoughts (2023): Deliberate Problem Solving
- DistilBERT (2019): Distilled Version of BERT
- Knowledge Distillation (2022): Comprehensive Survey
- Pruning and Quantization Techniques (2022): Model Compression Survey
- Clone the repository:
git clone https://github.com/Ishaan-Ansari/Deep-Learning-from-scratch.git
- Navigate to a specific model:
cd Deep-Learning-from-scratch/[Folder_Name]
- Run Jupyter notebooks:
jupyter notebook
- Follow the instructions within each notebook.
This project is a work in progress! If you have suggestions, feel free to fork the repo, submit issues, or create pull requests.
⭐ If you find this helpful, star this repository and stay tuned for more updates!