NEURAL_NETWORKS

Core Machine Learning Concepts in the Neural Networks Practical

This document explains the primary machine learning concepts demonstrated in the Neural Networks practical notebook. The practical builds a simple neural network from scratch for binary classification, illustrating both forward and backward propagation techniques.

1. Network Initialization and Activation Functions

  • Weight and Bias Initialization:
    • Random weights (w1, w2) and biases (b1, b2) are initialized for the input-to-hidden and hidden-to-output layers.
  • Activation Functions:
    • Sigmoid: Applied to the output layer to generate predictions.
    • ReLU: Used in the hidden layer to introduce non-linearity.
  • Derivative Functions:
    • The derivatives of the sigmoid and ReLU functions are also implemented so that gradients can be computed during backpropagation (see the sketch below).
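
A minimal sketch of these building blocks, assuming NumPy and a 2-2-1 layout (two inputs, two hidden units, one output); the notebook's exact layer sizes and initialization scheme may differ:

```python
import numpy as np

# Illustrative 2-2-1 architecture; sizes are assumptions, not the notebook's exact values.
rng = np.random.default_rng(0)
w1 = rng.standard_normal((2, 2))   # input -> hidden weights
b1 = np.zeros((1, 2))              # hidden-layer bias
w2 = rng.standard_normal((2, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))              # output-layer bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    # Expects the activation a = sigmoid(z), not the pre-activation z.
    return a * (1.0 - a)

def relu(z):
    return np.maximum(0.0, z)

def relu_derivative(z):
    # 1 where the pre-activation is positive, 0 elsewhere.
    return (z > 0).astype(float)
```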

2. Forward Propagation

  • Hidden Layer Computation:
    • Compute pre-activations: z1 = np.dot(x, w1) + b1.
    • Apply the ReLU activation function to obtain the hidden layer activation (a1).
  • Output Layer Computation:
    • Compute pre-activations: z2 = np.dot(a1, w2) + b2.
    • Apply the Sigmoid activation to produce the final output (a2).
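
Combining the two computations, a forward pass built on the helpers sketched above could look like this; the intermediate values are returned because backpropagation reuses them:

```python
def forward(x, w1, b1, w2, b2):
    # Hidden layer: linear combination followed by ReLU.
    z1 = np.dot(x, w1) + b1
    a1 = relu(z1)
    # Output layer: linear combination followed by sigmoid.
    z2 = np.dot(a1, w2) + b2
    a2 = sigmoid(z2)
    return z1, a1, z2, a2
```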

3. Backward Propagation and Gradient Descent

  • Loss Calculation:
    • The network uses Mean Squared Error (MSE) to compute the loss between predictions and true values.
  • Gradient Computation:
    • Gradients are computed for each layer by applying the chain rule, using the derivatives of the activation functions.
  • Parameter Updates:
    • Weights and biases are updated using gradient descent with a specified learning rate over multiple epochs to reduce the loss.
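
One way to write the backward pass for this two-layer network, assuming the MSE loss and the helper functions sketched above (the notebook's exact scaling of the loss gradient may differ):

```python
def backward(x, y, z1, a1, a2, w2):
    n = x.shape[0]

    # MSE loss: L = mean((a2 - y) ** 2), so dL/da2 = 2 * (a2 - y) / n.
    # Chain rule through the sigmoid gives the error at the output pre-activation.
    dz2 = (2.0 / n) * (a2 - y) * sigmoid_derivative(a2)

    # Gradients for the hidden-to-output weights and bias.
    dw2 = np.dot(a1.T, dz2)
    db2 = np.sum(dz2, axis=0, keepdims=True)

    # Propagate the error back through the hidden layer (chain rule through ReLU).
    dz1 = np.dot(dz2, w2.T) * relu_derivative(z1)
    dw1 = np.dot(x.T, dz1)
    db1 = np.sum(dz1, axis=0, keepdims=True)

    return dw1, db1, dw2, db2
```

The gradient descent update itself then subtracts each gradient, scaled by the learning rate, from the corresponding parameter, as shown inside the training loop in the next section.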

4. Training Process and Evaluation

  • Epochs and Learning Rate:
    • The model is trained for a set number of epochs (e.g., 1000) with a fixed learning rate.
  • Monitoring Convergence:
    • The training loop prints the loss every 100 epochs to track the network's performance.
  • Final Predictions:
    • After training, the model computes predictions for the input data, and the outputs are compared to the true values to assess performance (see the training sketch below).
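
A training loop along these lines, using the forward and backward sketches above; the epoch count, learning rate, and reporting interval are the example values mentioned in this section, not necessarily the notebook's:

```python
def train(x, y, w1, b1, w2, b2, epochs=1000, learning_rate=0.1):
    for epoch in range(epochs):
        # Forward pass.
        z1, a1, z2, a2 = forward(x, w1, b1, w2, b2)

        # Backward pass and gradient descent update.
        dw1, db1, dw2, db2 = backward(x, y, z1, a1, a2, w2)
        w1 -= learning_rate * dw1
        b1 -= learning_rate * db1
        w2 -= learning_rate * dw2
        b2 -= learning_rate * db2

        # Report the MSE loss every 100 epochs to monitor convergence.
        if epoch % 100 == 0:
            loss = np.mean((a2 - y) ** 2)
            print(f"epoch {epoch}: loss {loss:.4f}")
    return w1, b1, w2, b2
```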

The example feeds the network the four possible pairs of binary inputs and trains it to learn the XOR function, a classic problem for neural networks because it is not linearly separable and therefore requires a hidden layer.
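
For the XOR task, the dataset and an end-to-end run with the sketches above might look like the following (hyperparameters are illustrative):

```python
# The four XOR input pairs and their targets.
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

w1, b1, w2, b2 = train(x, y, w1, b1, w2, b2, epochs=1000, learning_rate=0.1)

# Round the sigmoid outputs to get binary predictions and compare against y.
_, _, _, a2 = forward(x, w1, b1, w2, b2)
print(np.round(a2))   # ideally [[0], [1], [1], [0]]
```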

This practical notebook serves as a comprehensive example of constructing and training a simple neural network, demonstrating key aspects of forward propagation, backpropagation, and optimization using gradient descent.
