Markov Chain

Last Updated : 03 Dec, 2021

A Markov chain, named after Andrey Markov, is a stochastic model that describes a sequence of possible events in which the probability of the next state depends only on the current state, not on the states before it. In simple words, the probability that step n+1 will be in state x depends only on the state at step n, not on the complete sequence of steps that came before n. This property is known as the Markov property, or memorylessness. Let us explore a Markov chain with the help of a diagram.

The diagram represents a two-state (here, E and A) Markov process. The arrows originate from the current state and point to the future state, and the number associated with each arrow indicates the probability of the Markov process changing from one state to the other. For instance, if the Markov process is in state E, then the probability that it changes to state A is 0.7, while the probability that it remains in the same state is 0.3. Similarly, for a process in state A, the probability of changing to state E is 0.4 and the probability of remaining in the same state is 0.6.
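The memorylessness described earlier can also be stated as a formula. Writing X_n for the state at step n (a notation assumed here for illustration, not used elsewhere in this article), the Markov property reads:

```latex
P(X_{n+1} = x \mid X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) = P(X_{n+1} = x \mid X_n = x_n)
```

In this notation, the probabilities in the diagram are P(X_{n+1} = A | X_n = E) = 0.7 and P(X_{n+1} = E | X_n = A) = 0.4.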

How to Represent Markov Chain?

From the diagram of the two-state Markov process, we can see that a Markov chain can be viewed as a weighted directed graph, so we can represent it with the help of an adjacency matrix.

      +-----+-----+
      |  A  |  E  |
+-----+-----+-----+
|  A  | 0.6 | 0.4 |
+-----+-----+-----+
|  E  | 0.7 | 0.3 |
+-----+-----+-----+

Each element denotes the probability weight of the edge connecting the two corresponding vertices (the row is the current state, the column is the next state).

Row A: 0.4 is the probability for state A to go to state E, and 0.6 is the probability to remain in the same state.

Row E: 0.7 is the probability for state E to go to state A, and 0.3 is the probability to remain in the same state.

This matrix is also called the transition matrix. If the Markov chain has N possible states, the matrix is an N x N matrix, and each row must sum to 1 because a row lists the probabilities of all possible next states. In addition, a Markov chain has an initial state vector of order N x 1 that gives the probability of starting in each state. These two entities are required to represent a Markov chain.
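As a minimal sketch of these two entities, the snippet below stores the transition matrix as a NumPy array, checks that every row sums to 1, and advances an initial state vector by one step. The names `P`, `pi0`, and `pi1` and the choice of starting in state A are assumptions made for illustration; the vector is written as a row so that it can be multiplied from the left.

```python
import numpy as np

# Transition matrix from the table above; rows and columns are ordered A, E.
P = np.array([[0.6, 0.4],
              [0.7, 0.3]])

# Every row is a probability distribution over next states, so it must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# Hypothetical initial state vector: the chain starts in state A with certainty.
pi0 = np.array([1.0, 0.0])

# One step of the chain: multiply the row vector by the transition matrix.
pi1 = pi0 @ P
print(pi1)  # [0.6 0.4]
```

Starting from state A, the distribution after one step is simply row A of the matrix.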

N-step Transition Matrix: Now let us look at higher-order transition matrices, which give the probability of a transition occurring over multiple steps. In simple words, what is the probability of moving from state A to state E in exactly N steps? There is a very simple way to calculate it: it is the value of entry (A, E) of the matrix obtained by raising the transition matrix to the power of N.
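The calculation can be sketched with `np.linalg.matrix_power`. The helper name `n_step_matrix` and the index constants `A` and `E` are hypothetical, chosen only for this example.

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.7, 0.3]])
A, E = 0, 1  # row/column indices of the two states


def n_step_matrix(P, n):
    """Return the n-step transition matrix P^n."""
    return np.linalg.matrix_power(P, n)


P2 = n_step_matrix(P, 2)
# A -> E in two steps: A->A->E (0.6 * 0.4) plus A->E->E (0.4 * 0.3) = 0.36
print(round(P2[A, E], 4))  # 0.36
```

The two-step value matches the hand calculation in the comment, which sums over both possible intermediate states.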

Types of Markov Chain :

discrete-time Markov chains : Here the index set T (the set of times t at which the state of the process is observed) is countable, so changes of state occur at discrete time steps. Generally, the term “Markov chain” refers to a DTMC.

continuous-time Markov chains: Here the index set T (the set of times t at which the state of the process is observed) is a continuum, which means a change of state can occur at any instant in a CTMC.
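To make the contrast concrete, here is a rough sketch of simulating a two-state continuous-time chain, in which the time spent in each state is exponentially distributed. The exit rates, the function name `simulate_ctmc`, and the seed are all assumptions for illustration and do not come from the diagram above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical exit rates per unit time: index 0 is state A, index 1 is state E.
exit_rate = np.array([0.4, 0.7])


def simulate_ctmc(start, t_end):
    """Return a list of (jump_time, state) pairs up to time t_end."""
    t, s = 0.0, start
    path = [(t, s)]
    while True:
        # Holding time in the current state is exponential with mean 1 / rate.
        t += rng.exponential(1.0 / exit_rate[s])
        if t >= t_end:
            return path
        s = 1 - s  # with two states, every jump goes to the other state
        path.append((t, s))


path = simulate_ctmc(start=0, t_end=20.0)
for jump_time, s in path:
    print(f"t = {jump_time:6.2f}  state = {'AE'[s]}")
```

Unlike the discrete-time case, the jump times here are real numbers rather than step counts.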

Properties of Markov Chain :

A Markov chain is said to be Irreducible if we can go from any state to any other state in one or more steps.
A state in a Markov chain is said to be Periodic if the chain can return to it only in a number of steps that is a multiple of some integer larger than 1. The period of a state is the greatest common divisor of all possible return path lengths.
A state in a Markov chain is said to be Transient if there is a non-zero probability that the chain will never return to it; otherwise, it is Recurrent.
A state in a Markov chain is called Absorbing if there is no possible way to leave that state. An absorbing state has no outgoing transitions to other states, so it transitions to itself with probability 1.
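Two of these properties are easy to check numerically for a small chain. The sketch below is one possible approach, not a standard library routine: `is_irreducible` and `absorbing_states` are hypothetical helper names, and the second matrix `P_abs` is an invented example containing an absorbing state.

```python
import numpy as np


def is_irreducible(P):
    """True if every state can reach every other state in a finite number of steps."""
    n = P.shape[0]
    # (I + A)^(n-1) has a positive entry (i, j) exactly when j is reachable from i.
    adjacency = (P > 0).astype(int)
    reach = np.linalg.matrix_power(np.eye(n, dtype=int) + adjacency, n - 1)
    return bool((reach > 0).all())


def absorbing_states(P):
    """Indices of states whose self-transition probability is 1."""
    return [i for i in range(P.shape[0]) if np.isclose(P[i, i], 1.0)]


# The two-state chain used in this article (rows and columns ordered A, E).
P = np.array([[0.6, 0.4],
              [0.7, 0.3]])
print(is_irreducible(P), absorbing_states(P))  # True []

# Hypothetical chain in which state 1 is absorbing.
P_abs = np.array([[0.5, 0.5],
                  [0.0, 1.0]])
print(is_irreducible(P_abs), absorbing_states(P_abs))  # False [1]
```

The reachability check uses integer matrix powers, so it is only meant for chains with a small number of states.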

Markov Chain in Python :

Python3

# let's import our library
import scipy.linalg
import numpy as np
 
 
# Encoding the states as numbers (their list
# indices), as it is easier to deal with
# numbers instead of words.
state = ["A", "E"]
 
# Assigning the transition matrix to a variable,
# i.e. a numpy 2d array with rows ordered A, E.
MyMatrix = np.array([[0.6, 0.4], [0.7, 0.3]])
 
# Simulating a random walk on our Markov chain 
# with 20 steps. Random walk simply means that
# we start with an arbitrary state and then we
# move along our markov chain.
n = 20
 
# decide which state to start with
StartingState = 0
CurrentState = StartingState
 
# printing the starting state using the state
# list
print(state[CurrentState], "--->", end=" ")
 
for _ in range(n - 1):
    # Deciding the next state using the np.random.choice()
    # function, which takes the list of states and the
    # probabilities of moving to each one from our current state
    CurrentState = np.random.choice([0, 1], p=MyMatrix[CurrentState])
     
    # printing the path of random walk
    print(state[CurrentState], "--->", end=" ")
print("stop")
 
# Let us find the stationary distribution of our 
# Markov chain by finding the left eigenvectors.
# We only need the left eigenvectors.
MyValues, left = scipy.linalg.eig(MyMatrix, right=False, left=True)
 
print("left eigen vectors = \n", left, "\n")
print("eigen values = \n", MyValues)
 
# The stationary distribution is the left eigenvector
# whose eigenvalue is 1, so we pick that column rather
# than assuming it comes first.
index = np.argmin(np.abs(MyValues - 1))
 
# Pi is a probability distribution, so its entries should
# sum to 1. The raw eigenvector may have negative entries,
# so we normalize it by dividing by its sum.
pi = left[:, index]
pi_normalized = [(x/np.sum(pi)).real for x in pi]
print("stationary distribution = ", pi_normalized)
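As a cross-check on the eigenvector approach, the stationary distribution can also be approximated by applying the transition matrix repeatedly to any starting distribution. This is only a sketch: the starting vector and the iteration count of 100 are arbitrary choices. Solving pi P = pi by hand for this chain gives 0.4 pi_A = 0.7 pi_E, so pi = (7/11, 4/11).

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.7, 0.3]])

# Start from any distribution and apply the transition matrix repeatedly.
pi = np.array([1.0, 0.0])
for _ in range(100):
    pi = pi @ P

print(pi)  # approximately [0.6364 0.3636], i.e. [7/11, 4/11]

# Stationary means that one more step changes nothing.
assert np.allclose(pi @ P, pi)
```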

Application of Markov Chain :

Markov chains make the study of many real-world processes simpler and easier to understand. Using a Markov chain, we can derive useful results such as the stationary distribution.

MCMC (Markov Chain Monte Carlo), which makes it possible to sample from a distribution whose normalization factor is hard to compute, is based on Markov chains.
Markov chains are used in information theory, search engines, speech recognition, etc.
Markov chains are widely used in data science, so they are worth learning well for readers who want to work in that field.
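To illustrate the MCMC point above, here is a minimal random-walk Metropolis sketch. It is one simple member of the MCMC family, chosen for brevity. The target density, step size, sample count, and function names are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def unnormalized_density(x):
    """Target known only up to a constant: proportional to a standard normal."""
    return np.exp(-0.5 * x * x)


def metropolis(n_samples, step=1.0):
    """Generate a Markov chain whose stationary distribution is the target."""
    x = 0.0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = x + rng.normal(0.0, step)
        # The acceptance test uses only a ratio of densities, so the
        # normalization factor cancels and never has to be computed.
        if rng.random() < unnormalized_density(proposal) / unnormalized_density(x):
            x = proposal
        samples[i] = x
    return samples


samples = metropolis(20000)
print(samples.mean(), samples.var())  # should be near 0 and 1
```

Each new sample depends only on the current one, which is exactly the Markov property.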

Assumptions for Markov Chain :

The statistical system contains a finite number of states.
The states are mutually exclusive and collectively exhaustive.
The transition probability from one state to another state is constant over time.

Markov processes are fairly common in real-life problems, and Markov chains are easy to implement because of their memorylessness property. When the memoryless assumption is a reasonable fit, using a Markov chain can simplify a problem with little loss of accuracy.

Let us take an example to understand the advantage of this tool. Suppose my friend suggests having a meal. I may say that I do not want a pizza because I had one an hour ago. But is it appropriate to say that I do not want a pizza because I had one two months ago? In this case, my probability of picking a meal depends entirely on my immediately preceding meal. This is where the Markov chain is effective.
