NEURAL_NETWORKS
This document explains the primary machine learning concepts demonstrated in the Neural Networks practical notebook. The practical builds a simple neural network from scratch for binary classification, illustrating both forward and backward propagation techniques.
- Weight and Bias Initialization:
  - Random weights (`w1`, `w2`) and biases (`b1`, `b2`) are initialized for the input-to-hidden and hidden-to-output layers.
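
  A minimal sketch of how this step might look, assuming a 2-2-1 architecture (two inputs, two hidden units, one output) to match the XOR example described at the end of this page; the hidden-layer size, the seed, and the use of random biases are assumptions, not values taken from the notebook:

  ```python
  import numpy as np

  np.random.seed(0)  # assumed seed, only to make this sketch reproducible

  n_input, n_hidden, n_output = 2, 2, 1  # assumed 2-2-1 layout for XOR

  # Input-to-hidden parameters
  w1 = np.random.randn(n_input, n_hidden)
  b1 = np.random.randn(1, n_hidden)   # biases drawn randomly here; zeros are also common

  # Hidden-to-output parameters
  w2 = np.random.randn(n_hidden, n_output)
  b2 = np.random.randn(1, n_output)
  ```
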
- Activation Functions:
  - Sigmoid: Applied to the output layer to generate predictions.
  - ReLU: Used in the hidden layer to introduce non-linearity.
- Derivative Functions:
  - Derivatives of the sigmoid and ReLU functions are implemented so that gradients can be calculated during backpropagation.
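
  These four helpers might be written roughly as follows (a sketch; the function names are assumptions):

  ```python
  import numpy as np

  def sigmoid(z):
      # Squashes values into (0, 1); used on the output layer
      return 1.0 / (1.0 + np.exp(-z))

  def relu(z):
      # Zeroes out negative values; used in the hidden layer
      return np.maximum(0, z)

  def sigmoid_derivative(a):
      # Derivative written in terms of the sigmoid output a = sigmoid(z)
      return a * (1.0 - a)

  def relu_derivative(z):
      # 1 where the pre-activation was positive, 0 elsewhere
      return (z > 0).astype(float)
  ```
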
- Hidden Layer Computation:
  - Compute pre-activations: `z1 = np.dot(x, w1) + b1`.
  - Apply the ReLU activation function to obtain the hidden layer activation (`a1`); both layer computations are combined in the forward-pass sketch below.
- Output Layer Computation:
  - Compute pre-activations: `z2 = np.dot(a1, w2) + b2`.
  - Apply the Sigmoid activation to produce the final output (`a2`).
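
  Putting the two layer computations together, the forward pass might look like this (a sketch that reuses the `relu` and `sigmoid` helpers above; the `forward` name is an assumption):

  ```python
  def forward(x, w1, b1, w2, b2):
      # Hidden layer: linear pre-activation followed by ReLU
      z1 = np.dot(x, w1) + b1
      a1 = relu(z1)

      # Output layer: linear pre-activation followed by sigmoid
      z2 = np.dot(a1, w2) + b2
      a2 = sigmoid(z2)

      # Intermediate values are returned because backpropagation needs them
      return z1, a1, z2, a2
  ```
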
- Loss Calculation:
  - The network uses Mean Squared Error (MSE) to compute the loss between predictions and true values.
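
  One way the loss could be written (a sketch; `mse_loss` is an assumed name, with `y` holding the true labels and `a2` the network output):

  ```python
  def mse_loss(y, a2):
      # Mean of the squared differences between targets and predictions
      return np.mean((y - a2) ** 2)
  ```
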
- Gradient Computation:
  - Gradients are computed for each layer by applying the chain rule, using the derivatives of the activation functions.
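
  A sketch of how the chain rule might be applied, assuming the MSE loss and the forward-pass variables above (the `backward` name and the explicit 2/n factor from the MSE derivative are assumptions; that factor is sometimes folded into the learning rate):

  ```python
  def backward(x, y, z1, a1, a2, w2):
      n = x.shape[0]

      # Output layer: derivative of MSE, then back through the sigmoid
      d_a2 = 2.0 * (a2 - y) / n
      d_z2 = d_a2 * sigmoid_derivative(a2)
      d_w2 = np.dot(a1.T, d_z2)
      d_b2 = np.sum(d_z2, axis=0, keepdims=True)

      # Hidden layer: propagate the error through w2, then back through ReLU
      d_a1 = np.dot(d_z2, w2.T)
      d_z1 = d_a1 * relu_derivative(z1)
      d_w1 = np.dot(x.T, d_z1)
      d_b1 = np.sum(d_z1, axis=0, keepdims=True)

      return d_w1, d_b1, d_w2, d_b2
  ```
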
- Parameter Updates:
  - Weights and biases are updated using gradient descent with a specified learning rate over multiple epochs to reduce the loss (the update step appears in the training-loop sketch below).
- Epochs and Learning Rate:
  - The model is trained for a set number of epochs (e.g., 1000) with a fixed learning rate.
- Monitoring Convergence:
  - The training loop prints the loss every 100 epochs to track the network's performance.
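
  The training loop ties the pieces together; a sketch under the assumptions above, with `x` and `y` being the XOR inputs and targets shown at the end of this page and 0.1 an assumed learning rate:

  ```python
  epochs = 1000        # the page mentions e.g. 1000 epochs
  learning_rate = 0.1  # assumed value; the notebook may use a different one

  for epoch in range(epochs):
      # Forward pass
      z1, a1, z2, a2 = forward(x, w1, b1, w2, b2)

      # Backward pass, then a plain gradient-descent step on every parameter
      d_w1, d_b1, d_w2, d_b2 = backward(x, y, z1, a1, a2, w2)
      w1 -= learning_rate * d_w1
      b1 -= learning_rate * d_b1
      w2 -= learning_rate * d_w2
      b2 -= learning_rate * d_b2

      # Print the loss every 100 epochs to monitor convergence
      if epoch % 100 == 0:
          print(f"epoch {epoch}: loss = {mse_loss(y, a2):.4f}")
  ```
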
- Final Predictions:
  - After training, the model computes predictions for the input data. The outputs are compared to the actual values to validate the performance.
The example feeds four pairs of binary inputs to the network and trains it to produce the XOR output. XOR is a classic problem for neural networks because it is not linearly separable, which is why a hidden layer is needed.
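For the XOR example, the training data and the final prediction check might look like this (the 0.5 threshold is an assumption):

```python
# XOR truth table: four input pairs and their targets
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# After training, run one more forward pass and threshold the sigmoid outputs
_, _, _, a2 = forward(x, w1, b1, w2, b2)
predictions = (a2 > 0.5).astype(int)
print(predictions.ravel())  # ideally [0 1 1 0] once the network has converged
```
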
This practical notebook serves as a comprehensive example of constructing and training a simple neural network, demonstrating key aspects of forward propagation, backpropagation, and optimization using gradient descent.