This project implements reinforcement learning for maze solving. The maze is randomly generated on each run, and a Q-learning agent is trained for 10 minutes to solve it.
The following methods are implemented:
- Deep Q-network
- Dueling Q-network
- Target network
- Double Q-learning
- Experience replay buffer
- Prioritised experience replay
- Decaying epsilon greedy exploration
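
As a rough illustration of how several of the methods listed above fit together, here is a minimal, framework-free sketch in plain Python. It is not the project's actual code: every function name, parameter, and default value (`decayed_epsilon`, `double_q_target`, `gamma=0.9`, `alpha=0.6`, and so on) is a hypothetical choice made for this example, and the Q-values are passed in as plain lists rather than produced by a neural network.

```python
import random


def decayed_epsilon(step, eps_start=1.0, eps_end=0.05, decay=0.999):
    """Decay epsilon exponentially from eps_start, never going below eps_end.

    All defaults are illustrative assumptions, not the project's settings.
    """
    return max(eps_end, eps_start * decay ** step)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.9):
    """Double Q-learning target.

    The online network's Q-values select the next action; the target
    network's Q-values evaluate it.
    """
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


def sample_prioritised(priorities, batch_size, alpha=0.6, rng=random):
    """Sample buffer indices with probability proportional to priority**alpha."""
    weights = [p ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=batch_size)
```

In a full agent, `priorities` would typically be derived from each transition's temporal-difference error, and the target network's weights would be copied from the online network at intervals; both details are left out of this sketch.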