Prerequisites: Q-Learning technique
The SARSA algorithm is a slight variation of the popular Q-Learning algorithm. A learning agent in a Reinforcement Learning algorithm can learn in one of two ways:
- On-Policy: The learning agent learns the value function of the same policy that it is currently using to select its actions.
- Off-Policy: The learning agent learns the value function of a policy other than the one it is using to select its actions.
Q-Learning is an Off-Policy technique: it updates the Q-value using the greedy (highest-valued) action in the next state, regardless of which action the agent actually takes there. SARSA, on the other hand, is an On-Policy technique: it updates the Q-value using the action that the current policy actually selects in the next state.
This difference shows up in the update rule of each technique.
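In the standard textbook form, with learning rate α and discount factor γ (symbols assumed here, since the article does not define them), the two update rules can be written as:

```latex
\begin{align*}
\text{Q-Learning:}\quad Q(s,a) &\leftarrow Q(s,a) + \alpha\left[\, r + \gamma \max_{a'} Q(s',a') - Q(s,a) \,\right] \\
\text{SARSA:}\quad Q(s,a) &\leftarrow Q(s,a) + \alpha\left[\, r + \gamma\, Q(s',a') - Q(s,a) \,\right]
\end{align*}
```

The only difference is the bootstrap term: Q-Learning takes the maximum over all actions in s', while SARSA uses the single action a' that the policy actually selects in s'.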
The update equation for SARSA depends on the current state, the current action, the reward obtained, the next state, and the next action. This observation led to the name SARSA, which stands for State Action Reward State Action and symbolizes the tuple (s, a, r, s’, a’).
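To make the contrast concrete, here is a minimal sketch of both updates on a Q-table stored as a plain list of lists. The function names, the default values of `alpha` and `gamma`, and the toy table are all illustrative assumptions, not taken from the article.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the best action available in s_next."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action a_next actually chosen in s_next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


# Two states, two actions; identical starting tables for a fair comparison.
Q1 = [[0.0, 0.0], [1.0, 5.0]]
Q2 = [[0.0, 0.0], [1.0, 5.0]]

q_learning_update(Q1, s=0, a=0, r=1.0, s_next=1)       # target uses max(1.0, 5.0)
sarsa_update(Q2, s=0, a=0, r=1.0, s_next=1, a_next=0)  # target uses Q[1][0] = 1.0
print(Q1[0][0], Q2[0][0])
```

With the same transition, the Q-Learning estimate moves further because it assumes the best next action, whereas SARSA follows the (here, worse) action that was actually picked.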
The following Python code demonstrates how to implement the SARSA algorithm, using OpenAI’s gym module to load the environment.
Step 1: Importing the required libraries
Step 2: Building the environment
Here, we will be using the ‘FrozenLake-v0’ environment, which comes preloaded with gym. A description of the environment is available in the gym documentation.
Step 3: Initializing different parameters
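A minimal sketch of this step, assuming a 16-state, 4-action table (the size of a 4x4 grid world such as the default FrozenLake map). The hyperparameter values below are placeholders chosen for illustration and should be tuned for your own runs.

```python
# Illustrative hyperparameters (assumed values, not prescribed by the article).
epsilon = 0.1          # exploration probability for epsilon-greedy selection
alpha = 0.5            # learning rate
gamma = 0.95           # discount factor
total_episodes = 5000  # number of training episodes
max_steps = 100        # cap on steps per episode

# Assumed table size: 16 states x 4 actions, as in a 4x4 grid world.
n_states, n_actions = 16, 4

# Q-table initialised to zero, indexed as Q[state][action].
# A list comprehension is used so that every row is a separate list.
Q = [[0.0] * n_actions for _ in range(n_states)]
```

With gym, the table dimensions would normally be read from the environment's observation and action spaces rather than hard-coded.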
Step 4: Defining utility functions to be used in the learning process
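One utility that SARSA needs is an action selector. A common choice is epsilon-greedy, sketched below under the assumption that the Q-table is a list of per-state action-value lists. The name `choose_action` and its signature are illustrative.

```python
import random


def choose_action(Q, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    n_actions = len(Q[state])
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(n_actions)
    # Exploit: pick the highest-valued action (ties go to the lowest index).
    return max(range(n_actions), key=lambda a: Q[state][a])


Q = [[0.0, 2.0, 1.0]]
print(choose_action(Q, 0, epsilon=0.0))  # always greedy, so this prints 1
```

Because SARSA is on-policy, the same selector is used both to act in the environment and to pick the next action that appears in the update target.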
Step 5: Training the learning agent
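The training loop can be sketched as follows. To keep the example dependency-free, it uses a small hypothetical corridor environment (`step`) instead of gym's FrozenLake, and all hyperparameter values are assumptions. The structure of the loop is what matters: the next action is chosen before the update, and then carried over to the next step.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical corridor: action 1 moves right, action 0 moves left.
    Reaching the last cell yields reward 1 and ends the episode."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))
    best = max(Q[state])
    return rng.choice([a for a, q in enumerate(Q[state]) if q == best])


def train_sarsa(episodes=2000, max_steps=100, alpha=0.1, gamma=0.9,
                epsilon=0.1, n_states=5, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        action = choose_action(Q, state, epsilon, rng)
        for _ in range(max_steps):
            next_state, reward, done = step(state, action, n_states)
            # Choose a' with the SAME policy before updating: this makes it SARSA.
            next_action = choose_action(Q, next_state, epsilon, rng)
            # No bootstrapping from a terminal state.
            future = 0.0 if done else gamma * Q[next_state][next_action]
            Q[state][action] += alpha * (reward + future - Q[state][action])
            # The action used in the update is the one actually taken next.
            state, action = next_state, next_action
            if done:
                break
    return Q


Q = train_sarsa()
for s, row in enumerate(Q):
    print(s, [round(q, 3) for q in row])
```

Swapping `step` for calls to a gym environment's reset and step methods should give the FrozenLake version, although the exact return values of those methods differ between gym releases.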
In the rendered environment output, the red mark shows the current position of the agent in the environment, while the direction given in brackets is the move that the agent will make next. Note that the agent stays in its position if a move would take it out of bounds.
Step 6: Evaluating the performance
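One simple performance measure is the average reward per episode when the agent acts greedily on the learned Q-table. The sketch below assumes a step function returning `(next_state, reward, done)`. The `evaluate` helper, the three-cell corridor, and the hand-written table are all hypothetical stand-ins for the trained agent and the gym environment.

```python
def evaluate(Q, step_fn, start_state, episodes=100, max_steps=100):
    """Average reward per episode when acting greedily with respect to Q."""
    total_reward = 0.0
    for _ in range(episodes):
        state = start_state
        for _ in range(max_steps):
            # Greedy action: no exploration during evaluation.
            action = max(range(len(Q[state])), key=lambda a: Q[state][a])
            state, reward, done = step_fn(state, action)
            total_reward += reward
            if done:
                break
    return total_reward / episodes


# Hypothetical 3-cell corridor: action 1 moves right; reaching cell 2 pays 1.
def corridor_step(state, action):
    next_state = min(state + 1, 2) if action == 1 else max(state - 1, 0)
    done = next_state == 2
    return next_state, (1.0 if done else 0.0), done


Q = [[0.0, 0.5], [0.0, 1.0], [0.0, 0.0]]  # hand-written table favouring "right"
print("Performance:", evaluate(Q, corridor_step, start_state=0))  # Performance: 1.0
```

In a stochastic environment such as FrozenLake with slippery ice, the same measure would come out below 1 even for a good policy, so it is best compared across runs instead of read as an absolute score.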