**Prerequisites:** Q-Learning

Q-Learning builds an exact matrix of Q-values (the Q-table), with one entry per state-action pair, which the agent can "refer to" in order to maximize its reward in the long run. This approach is not wrong in itself, but it is practical only for very small environments and quickly loses its feasibility as the number of states and actions in the environment increases.
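To make the scaling problem concrete, the sketch below stores Q-values in a plain table and applies the standard tabular update to one transition. The names `make_q_table` and `q_update`, the grid sizes, and the values of `alpha` and `gamma` are illustrative assumptions, not taken from any library.

```python
def make_q_table(num_states, num_actions):
    """One row per state, one column per action, all values start at 0."""
    return [[0.0] * num_actions for _ in range(num_states)]


def q_update(table, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """Standard tabular Q-Learning update for one observed transition."""
    best_next = max(table[s_next])
    table[s][a] += alpha * (reward + gamma * best_next - table[s][a])


# A 4x4 grid world with 4 moves needs only 16 * 4 = 64 entries.
small = make_q_table(16, 4)
q_update(small, s=0, a=1, reward=1.0, s_next=1)

# The table grows with (states x actions): a 100x100 grid already
# needs 40,000 entries, and image-like states are far larger.
entries_small = 16 * 4
entries_large = 100 * 100 * 4
```

Every entry has to be visited and updated separately, which is why the table stops being practical long before the environment becomes interesting.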

The solution comes from the realization that the values in the matrix have only relative importance, i.e., each value matters only with respect to the other values, because the agent only needs to know which action has the highest value in a given state. This leads to **Deep Q-Learning**, which uses a deep neural network to approximate the values. The approximation does no harm as long as the relative importance is preserved.
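A small sketch of why relative importance is enough: the agent acts greedily, so an approximation that keeps the ordering of the actions leads to the same choice. The two value lists and the helper `greedy_action` are made up for illustration.

```python
def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


exact_q = [1.0, 2.5, 0.3]      # hypothetical exact values for one state
approx_q = [0.8, 2.1, 0.6]     # hypothetical noisy approximation

# The numbers differ, but the ordering of actions is the same,
# so the agent still chooses action 1 in this state.
same_choice = greedy_action(exact_q) == greedy_action(approx_q)
```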

The basic working step of Deep Q-Learning is that the current state is fed into the neural network, which returns the Q-values of all possible actions as its output.
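The step above can be sketched as a forward pass through a tiny network written in plain Python. This is only an illustration: the layer sizes, the hand-picked weights in `params`, and the helper names `dense` and `q_network` are assumptions, and a real implementation would normally use a deep learning framework.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def dense(inputs, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def q_network(state, params):
    """Map a state vector to one Q-value per action."""
    hidden = [relu(h) for h in dense(state, params["w1"], params["b1"])]
    return dense(hidden, params["w2"], params["b2"])


# Hypothetical hand-picked weights: 2 state features, 3 hidden units, 2 actions.
params = {
    "w1": [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],
    "b1": [0.0, 0.0, 0.0],
    "w2": [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
    "b2": [0.0, 0.5],
}

state = [2.0, 1.0]
q_values = q_network(state, params)          # one entry per action
best_action = q_values.index(max(q_values))  # greedy choice
```

A single forward pass yields the Q-values of every action at once, so choosing the greedy action needs no per-action lookup.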

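The difference between Q-Learning and Deep Q-Learning is this: Q-Learning looks up the Q-value of a given state-action pair in a table, whereas Deep Q-Learning passes the state through a neural network that outputs a Q-value for every possible action.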

**Pseudo Code:**

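    Initialize Q_0(s,a) for all pairs (s,a)
    s = initial state
    k = 0
    while(convergence is not achieved)
    {
        simulate action a and reach state s'
        if(s' is a terminal state)
        {
            target = R(s,a,s')
        }
        else
        {
            target = R(s,a,s') + γ max_a' Q_k(s',a')
        }
        update the network parameters by one gradient step on (Q_k(s,a) - target)^2
        k = k + 1
        s = s'
    }

Here γ is the discount factor and Q_k is the network's estimate of the Q-values after k updates.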
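A minimal runnable sketch of this loop is given below. To stay dependency-free it makes several assumptions that are not part of the original: a linear approximator over one-hot features stands in for the deep network, the environment is a toy five-state chain, actions are chosen uniformly at random, and the values of `GAMMA` and `ALPHA` are arbitrary.

```python
import random

N_STATES, N_ACTIONS = 5, 2        # chain 0..4; actions: 0 = left, 1 = right
GAMMA, ALPHA = 0.9, 0.1           # discount factor and step size (assumed values)


def features(s):
    """One-hot encoding of the state (a stand-in for real observations)."""
    return [1.0 if i == s else 0.0 for i in range(N_STATES)]


def step(s, a):
    """Toy environment: reward 1 for reaching the last state, else 0."""
    s_next = max(0, s - 1) if a == 0 else s + 1
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done


def q_value(theta, s, a):
    """Linear approximator: Q(s, a) = theta[a] . features(s)."""
    return sum(w * x for w, x in zip(theta[a], features(s)))


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * N_STATES for _ in range(N_ACTIONS)]   # initialize Q_0
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.randrange(N_ACTIONS)                   # simulate action a
            s_next, reward, done = step(s, a)
            if done:
                target = reward
            else:
                target = reward + GAMMA * max(
                    q_value(theta, s_next, b) for b in range(N_ACTIONS))
            # Gradient step on 0.5 * (Q(s, a) - target)^2 with target held fixed.
            error = q_value(theta, s, a) - target
            phi = features(s)
            for i in range(N_STATES):
                theta[a][i] -= ALPHA * error * phi[i]
            s = s_next
    return theta


theta = train()
```

After training, `q_value(theta, 3, 1)` is close to 1, the reward for stepping from state 3 into the terminal state. Note that the target is computed from the same parameters `theta` that the update then changes.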

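Observe that in the equation $\text{target} = R(s,a,s') + \gamma \max_{a'} Q_k(s',a')$, where $\gamma$ is the discount factor, the term $\gamma \max_{a'} Q_k(s',a')$ is a variable term, because it is computed by the same network that is being trained. Therefore, in this process the target for the neural network keeps changing, unlike typical Deep Learning processes where the target is stationary.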

This problem is overcome by using two neural networks instead of one. The primary network is the one whose parameters are adjusted during training. The target network is used to compute the target; it has the same architecture as the primary network, but its parameters are frozen. After every x iterations of the primary network, its parameters are copied to the target network.
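The copying schedule can be sketched as follows. The class `TwoNetworkAgent`, its attribute names, and the dictionary of parameters are hypothetical stand-ins for two real networks; `sync_every` plays the role of x.

```python
import copy


class TwoNetworkAgent:
    """Keeps primary parameters and a frozen copy used for targets."""

    def __init__(self, params, sync_every):
        self.online = params                     # primary network, updated every step
        self.target = copy.deepcopy(params)      # frozen between syncs
        self.sync_every = sync_every             # the "x" iterations in the text
        self.steps = 0

    def after_update(self):
        """Call once per training step; copies parameters every `sync_every` steps."""
        self.steps += 1
        if self.steps % self.sync_every == 0:
            self.target = copy.deepcopy(self.online)


agent = TwoNetworkAgent(params={"w": [0.0, 0.0]}, sync_every=3)
history = []
for i in range(1, 7):
    agent.online["w"][0] += 1.0                  # stand-in for a gradient update
    agent.after_update()
    history.append(agent.target["w"][0])
```

The deep copy matters here: if the target merely referenced the primary parameters, every update would leak into the target immediately and it would no longer be frozen.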
