SARSA and Q-learning are reinforcement learning algorithms that use temporal-difference (TD) updates to improve the agent's behaviour. Expected SARSA is an alternative technique for improving the agent's policy. It is very similar to SARSA and Q-learning, and differs only in the target it uses when updating the action-value function.
We know that SARSA is an on-policy technique and Q-learning is an off-policy technique, but Expected SARSA can be used either on-policy or off-policy. This makes Expected SARSA more flexible than either of the other two algorithms.
Let's compare the action-value updates of all three algorithms and find out what is different in Expected SARSA.
- Expected SARSA:
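In standard textbook notation (an assumption here, with step size $\alpha$, discount factor $\gamma$ and policy $\pi$), the Expected SARSA update can be written as below, with the SARSA and Q-learning updates alongside for comparison:

```latex
\begin{align*}
\text{Expected SARSA:}\quad Q(S_t, A_t) &\leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \sum_{a} \pi(a \mid S_{t+1})\, Q(S_{t+1}, a) - Q(S_t, A_t) \Big] \\
\text{SARSA:}\quad Q(S_t, A_t) &\leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma\, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \Big] \\
\text{Q-learning:}\quad Q(S_t, A_t) &\leftarrow Q(S_t, A_t) + \alpha \Big[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \Big]
\end{align*}
```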
We see that Expected SARSA takes the weighted sum of the action values of all possible next actions, weighting each by the probability of taking that action under the policy. If the target policy is greedy with respect to the action values, this update reduces to Q-learning. Otherwise Expected SARSA is on-policy and computes the expected value over all next actions, rather than relying on the single sampled next action as SARSA does.
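A small numeric sketch may make the weighted sum concrete. The action values and probabilities below are made-up illustrative numbers, and `expected_value` is a hypothetical helper name, not something from the original article.

```python
def expected_value(q_values, probs):
    """Return sum over a of pi(a | s') * Q(s', a) for one next state."""
    return sum(p * q for p, q in zip(probs, q_values))


# Hypothetical action values for one next state with three actions.
q_next = [1.0, 2.0, 4.0]

# A stochastic policy: the expectation blends all three values.
stochastic = expected_value(q_next, [0.25, 0.25, 0.5])

# A greedy policy puts all probability on the best action,
# so the expectation equals max(q_next), the Q-learning target.
greedy = expected_value(q_next, [0.0, 0.0, 1.0])

print(stochastic, greedy, max(q_next))
```

With the greedy probabilities the expectation collapses to the maximum, which is why the greedy case coincides with Q-learning.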
Keeping the theory and the formulae in mind, let us compare all three algorithms with an experiment. We use the Cliff Walking environment provided by the gym library.
Code: Python code to create the class Agent which will be inherited by the other agents to avoid duplicate code.
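One possible shape for such a base class is sketched below. It assumes a tabular Q-function and an epsilon-greedy behaviour policy; the constructor parameters and method names are illustrative choices rather than the article's exact code, and plain Python lists are used so the snippet needs no extra packages.

```python
import random


class Agent:
    """Base class holding the Q-table and an epsilon-greedy behaviour policy."""

    def __init__(self, num_states, num_actions,
                 epsilon=0.1, alpha=0.5, gamma=1.0, seed=None):
        self.num_actions = num_actions
        self.epsilon = epsilon  # exploration rate
        self.alpha = alpha      # step size
        self.gamma = gamma      # discount factor
        self.rng = random.Random(seed)
        # Q[state][action], initialised to zero.
        self.Q = [[0.0] * num_actions for _ in range(num_states)]

    def choose_action(self, state):
        """Pick a random action with probability epsilon, else a greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        row = self.Q[state]
        return row.index(max(row))  # first maximal action on ties

    def update(self, state, action, reward, next_state, next_action, done):
        """Subclasses implement their own TD update here."""
        raise NotImplementedError


agent = Agent(num_states=2, num_actions=3, epsilon=0.0)
agent.Q[0] = [0.0, 5.0, 1.0]
print(agent.choose_action(0))  # greedy choice, since epsilon is 0
```

Each concrete agent would then override only `update`, which is the one place the three algorithms differ.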
Code: Python code to create the SARSA Agent.
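A minimal sketch of the SARSA update follows. It is written as a standalone function over a list-of-lists Q-table (rather than as a subclass) so that it runs by itself; the function name, argument names and default values are assumptions for illustration.

```python
def sarsa_update(Q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=1.0, done=False):
    """Apply one SARSA update to Q in place and return the TD error."""
    # The target bootstraps from the action the agent will actually take next.
    bootstrap = 0.0 if done else Q[next_state][next_action]
    td_error = reward + gamma * bootstrap - Q[state][action]
    Q[state][action] += alpha * td_error
    return td_error


# Toy table: 2 states, 2 actions (made-up values).
Q = [[0.0, 0.0], [2.0, 4.0]]
sarsa_update(Q, state=0, action=1, reward=-1.0, next_state=1, next_action=0)
print(Q[0][1])  # 0.5 * (-1 + 2 - 0) = 0.5
```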
Code: Python code to create the Q-Learning Agent.
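The Q-learning update can be sketched in the same standalone style. Again the function name, argument names and defaults are illustrative assumptions; the only substantive difference from SARSA is that the target uses the maximum next-state value instead of the value of the sampled next action.

```python
def q_learning_update(Q, state, action, reward, next_state,
                      alpha=0.5, gamma=1.0, done=False):
    """Apply one Q-learning update to Q in place and return the TD error."""
    # The target bootstraps from the best next action, whatever is taken next.
    bootstrap = 0.0 if done else max(Q[next_state])
    td_error = reward + gamma * bootstrap - Q[state][action]
    Q[state][action] += alpha * td_error
    return td_error


# Toy table: 2 states, 2 actions (made-up values).
Q = [[0.0, 0.0], [2.0, 4.0]]
q_learning_update(Q, state=0, action=1, reward=-1.0, next_state=1)
print(Q[0][1])  # 0.5 * (-1 + 4 - 0) = 1.5
```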
Code: Python code to create the Expected SARSA Agent. In this experiment we are using the following equation for the policy.
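The sketch below assumes an epsilon-greedy policy: every action receives probability epsilon / |A|, and the remaining 1 - epsilon is shared equally among the actions with the highest value. That policy form, and all names and defaults in the code, are assumptions for illustration, not necessarily the exact equation used in the experiment.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return pi(a | s) for an epsilon-greedy policy, splitting ties evenly."""
    best = max(q_values)
    n_greedy = sum(1 for q in q_values if q == best)
    base = epsilon / len(q_values)
    return [base + (1.0 - epsilon) / n_greedy if q == best else base
            for q in q_values]


def expected_sarsa_update(Q, state, action, reward, next_state,
                          alpha=0.5, gamma=1.0, epsilon=0.1, done=False):
    """Apply one Expected SARSA update to Q in place and return the TD error."""
    if done:
        expected_q = 0.0
    else:
        probs = epsilon_greedy_probs(Q[next_state], epsilon)
        # Weighted sum of next-state action values under the policy.
        expected_q = sum(p * q for p, q in zip(probs, Q[next_state]))
    td_error = reward + gamma * expected_q - Q[state][action]
    Q[state][action] += alpha * td_error
    return td_error


# Toy table: 2 states, 2 actions (made-up values).
Q = [[0.0, 0.0], [2.0, 4.0]]
expected_sarsa_update(Q, state=0, action=1, reward=-1.0, next_state=1, epsilon=0.5)
print(Q[0][1])  # expected value 0.25*2 + 0.75*4 = 3.5, so 0.5 * (-1 + 3.5) = 1.25
```

Setting `epsilon=0.0` makes the policy greedy, and the update then matches the Q-learning one.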
Code: Python code to create the environment and test all three algorithms.
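The following is a self-contained sketch of such an experiment. To keep it runnable without extra packages it does not call gym; instead `step` is a hand-written stand-in for the usual 4x12 cliff grid (reward -1 per move, -100 and a return to the start for stepping into the cliff). The hyperparameters, episode count and step cap are arbitrary illustrative choices.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(pos, action):
    """Stand-in for the cliff grid: returns (next_pos, reward, done)."""
    r = min(max(pos[0] + MOVES[action][0], 0), ROWS - 1)
    c = min(max(pos[1] + MOVES[action][1], 0), COLS - 1)
    if r == 3 and 1 <= c <= 10:  # stepped into the cliff
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL


def policy_probs(q_row, epsilon):
    """Epsilon-greedy action probabilities, splitting ties evenly."""
    best = max(q_row)
    base = epsilon / len(q_row)
    return [base + (1 - epsilon) / q_row.count(best) if q == best else base
            for q in q_row]


def run(method, episodes=200, alpha=0.5, gamma=1.0, epsilon=0.1,
        seed=0, max_steps=1000):
    """Train one method and return its average return over the last 50 episodes."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}

    def act(s):
        return rng.choices(range(4), weights=policy_probs(Q[s], epsilon))[0]

    totals = []
    for _ in range(episodes):
        s, total = START, 0
        a = act(s)
        for _ in range(max_steps):
            s2, reward, done = step(s, a)
            # The next action is sampled before the update for all three
            # methods so that one loop can be shared; only SARSA uses it
            # in its target.
            a2 = act(s2)
            if done:
                target = reward
            elif method == "sarsa":
                target = reward + gamma * Q[s2][a2]
            elif method == "q_learning":
                target = reward + gamma * max(Q[s2])
            else:  # expected_sarsa
                probs = policy_probs(Q[s2], epsilon)
                target = reward + gamma * sum(p * q for p, q in zip(probs, Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            total += reward
            s, a = s2, a2
            if done:
                break
        totals.append(total)
    return sum(totals[-50:]) / 50


results = {m: run(m) for m in ("sarsa", "q_learning", "expected_sarsa")}
for name, avg in results.items():
    print(f"{name}: average return over last 50 episodes = {avg:.1f}")
```

The printed averages depend on the seed and hyperparameters, so they are best read as a rough comparison rather than a benchmark.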
We have seen that Expected SARSA performs reasonably well in certain problems. Its update averages over all possible next actions, weighted by the policy, instead of relying on a single sampled next action. The fact that Expected SARSA can be used either on-policy or off-policy is what makes this algorithm so flexible.