
Ishaan-Ansari/Deep-Learning-from-scratch


Deep Learning from Scratch 🧠

Welcome to Deep Learning from Scratch, a repository where I implement fundamental deep learning architectures from scratch using Python, NumPy, PyTorch, and TensorFlow. The project aims to give a deeper understanding of how neural networks work internally by building each model from its basic components rather than relying on high-level library abstractions.
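To give a flavour of what "from scratch" can mean in practice, here is a minimal NumPy sketch of a two-layer network trained on XOR with hand-written backpropagation. It is an illustration only and is not taken from the notebooks in this repository; the function name `train_xor` and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np


def train_xor(steps=2000, lr=0.5, hidden=8, seed=0):
    """Train a tiny 2-layer MLP on XOR using manual backpropagation.

    Hypothetical example: names and hyperparameters are illustrative only.
    Returns the per-step loss history and the final predictions.
    """
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Parameters: input -> hidden (tanh) -> output (sigmoid)
    W1 = rng.normal(0.0, 1.0, (2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, (hidden, 1))
    b2 = np.zeros(1)

    losses = []
    for _ in range(steps):
        # Forward pass
        h = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

        # Binary cross-entropy, clipped to avoid log(0)
        p_safe = np.clip(p, 1e-12, 1.0 - 1e-12)
        loss = -np.mean(y * np.log(p_safe) + (1.0 - y) * np.log(1.0 - p_safe))
        losses.append(loss)

        # Backward pass (chain rule written out by hand)
        dz2 = (p - y) / len(X)          # gradient at the output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0)
        dh = dz2 @ W2.T
        dz1 = dh * (1.0 - h ** 2)       # derivative of tanh
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0)

        # Plain gradient-descent update
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    return losses, p


losses, preds = train_xor()
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The same forward, backward, and update structure underlies the larger architectures listed below; frameworks such as PyTorch and TensorFlow automate the backward pass.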

Contents

1. Foundational Deep Neural Networks

Papers

  • DNN (1987): Learning Internal Representations by Error Propagation pdf
  • CNN (1989): Backpropagation Applied to Handwritten Zip Code Recognition pdf
  • LeNet (1998): Gradient-Based Learning Applied to Document Recognition pdf
  • AlexNet (2012): ImageNet Classification with Deep Convolutional Networks pdf
  • U-Net (2015): Convolutional Networks for Biomedical Image Segmentation pdf

2. Optimization and Regularization Techniques

Papers

  • Weight Decay (1991): A Simple Weight Decay Can Improve Generalization pdf
  • ReLU (2011): Deep Sparse Rectifier Neural Networks pdf
  • Residuals (2015): Deep Residual Learning for Image Recognition pdf
  • Dropout (2014): Preventing Neural Networks from Overfitting pdf
  • BatchNorm (2015): Accelerating Deep Network Training pdf
  • LayerNorm (2016): Layer Normalization pdf
  • GELU (2016): Gaussian Error Linear Units pdf
  • Adam (2014): Stochastic Optimization Method pdf

3. Sequence Modeling

Papers

  • RNN (1989): Continually Running Fully Recurrent Neural Networks pdf
  • LSTM (1997): Long Short-Term Memory pdf
  • Learning to Forget (2000): Continual Prediction with LSTM pdf
  • Word2Vec (2013): Word Representations in Vector Space pdf
  • Phrase2Vec (2013): Distributed Representations of Words and Phrases pdf
  • Encoder-Decoder (2014): RNN Encoder-Decoder for Machine Translation pdf
  • Seq2Seq (2014): Sequence to Sequence Learning pdf
  • Attention (2014): Neural Machine Translation with Alignment pdf
  • Mixture of Experts (2017): Sparsely-Gated Neural Networks pdf

4. Language Modeling

Papers

  • Transformer (2017): Attention Is All You Need pdf
  • BERT (2018): Bidirectional Transformers for Language Understanding pdf
  • RoBERTa (2019): Robustly Optimized BERT Pretraining pdf
  • T5 (2019): Unified Text-to-Text Transformer pdf
  • GPT Series:
    • GPT (2018): Generative Pre-Training pdf
    • GPT-2 (2019): Unsupervised Multitask Learning pdf
    • GPT-3 (2020): Few-Shot Learning pdf
    • GPT-4 (2023): Advanced Language Model pdf
  • LoRA (2021): Low-Rank Adaptation of Large Language Models pdf
  • RLHF (2019): Fine-Tuning from Human Preferences pdf
  • InstructGPT (2022): Following Instructions with Human Feedback pdf
  • Vision Transformer (2020): Image Recognition with Transformers pdf
  • ELECTRA (2020): Discriminative Pre-training pdf

5. Image Generative Modeling

Papers

  • GAN (2014): Generative Adversarial Networks pdf
  • VAE (2013): Auto-Encoding Variational Bayes pdf
  • VQ-VAE (2017): Neural Discrete Representation Learning pdf
  • Diffusion Models:
    • Initial Diffusion (2015): Nonequilibrium Thermodynamics pdf
    • Denoising Diffusion (2020): Probabilistic Models pdf
    • Improved Denoising Diffusion (2021) pdf
  • CLIP (2021): Visual Models from Natural Language Supervision pdf
  • DALL-E (2021-2022): Text-to-Image Generation pdf
  • SimCLR (2020): Contrastive Learning of Visual Representations pdf

6. Deep Reinforcement Learning

Papers

  • Deep Reinforcement Learning (2017): Mastering Chess and Shogi pdf
  • Deep Q-Learning (2013): Playing Atari Games pdf
  • AlphaGo (2016): Mastering the Game of Go pdf
  • AlphaFold (2021): Protein Structure Prediction pdf

7. Additional Influential Papers

  • Deep Learning Survey (2015): By LeCun, Bengio, and Hinton pdf
  • BigGAN (2018): Large Scale GAN Training pdf
  • WaveNet (2016): Generative Model for Raw Audio pdf
  • BERTology (2020): Survey of BERT Use Cases pdf

Scaling and Model Optimization

  • Scaling Laws for Neural Language Models (2020): Predicting Model Performance pdf
  • Chinchilla (2022): Training Compute-Optimal Large Language Models pdf
  • Gopher (2022): Scaling Language Models with Massive Compute pdf

Fine-tuning and Adaptation

  • P-Tuning (2021): Prompt Tuning with Soft Prompts pdf
  • Prefix-Tuning (2021): Optimizing Continuous Prompts pdf
  • AdaLoRA (2023): Adaptive Low-Rank Adaptation pdf
  • QLoRA (2023): Efficient Fine-Tuning of Quantized Models pdf

Inference and Optimization Techniques

  • FlashAttention (2022): Fast and Memory-Efficient Attention pdf
  • FlashAttention-2 (2023): Faster Attention Mechanism pdf
  • Direct Preference Optimization (DPO) (2023): Aligning Language Models with Human Preferences pdf

Pre-training and Model Architecture

  • Mixture of Experts (MoE) (2022): Scaling Language Models with Sparse Experts pdf
  • GLaM (2021): Efficient Scaling with Mixture of Experts pdf
  • Switch Transformers (2022): Scaling to Trillion Parameter Models pdf

Reasoning and Capabilities

  • Chain of Thought Prompting (2022): Reasoning with Language Models pdf
  • Self-Consistency (2022): Improving Language Model Reasoning pdf
  • Tree of Thoughts (2023): Deliberate Problem Solving pdf

Efficiency and Compression

  • DistilBERT (2019): Distilled Version of BERT pdf
  • Knowledge Distillation (2022): Comprehensive Survey pdf
  • Pruning and Quantization Techniques (2022): Model Compression Survey pdf

๐Ÿ› ๏ธ How to Use

  1. Clone the repository:
    git clone https://github.com/Ishaan-Ansari/Deep-Learning-from-scratch.git
  2. Navigate to a specific model:
    cd Deep-Learning-from-scratch/[Folder_Name]
  3. Run Jupyter notebooks:
    jupyter notebook
  4. Follow the instructions within each notebook.

📌 Contributions & Feedback

This project is a work in progress! If you have suggestions, feel free to fork the repo, submit issues, or create pull requests.

โญ If you find this helpful, star this repository and stay tuned for more updates!

