A Markov chain, named after Andrey Markov, is a stochastic model that describes a sequence of possible events in which the probability of the next state depends solely on the current state, not on the states before it. In simple words, the probability that step n+1 will be x depends only on step n, not on the complete sequence of steps that came before n. This property is known as the Markov property, or memorylessness. Let us explore the Markov chain with the help of a diagram.
A diagram representing a two-state (here, E and A) Markov process. The arrows originate from the current state and point to the future state, and the number associated with each arrow indicates the probability of the Markov process changing from one state to the other. For instance, if the Markov process is in state E, then the probability that it changes to state A is 0.7, while the probability that it remains in the same state is 0.3. Similarly, for a process in state A, the probability of changing to state E is 0.4 and the probability of remaining in the same state is 0.6.
How to Represent Markov Chain?
From the diagram of the two-state Markov process, we can see that a Markov chain is a weighted directed graph: the states are the vertices and the transition probabilities are the edge weights. So we can represent it with the help of an adjacency matrix.
+---+-----+-----+
|   |  A  |  E  |
+---+-----+-----+
| A | 0.6 | 0.4 |
+---+-----+-----+
| E | 0.7 | 0.3 |
+---+-----+-----+
Each element denotes the probability weight of the edge connecting the two corresponding vertices. In row A, 0.4 is the probability for state A to go to state E and 0.6 is the probability to remain in the same state. In row E, 0.7 is the probability for state E to go to state A and 0.3 is the probability to remain in the same state.
This matrix is also called the transition matrix. If the Markov chain has N possible states, the matrix will be an N x N matrix. Each row of this matrix must sum to 1, because from any state the chain has to move to some state (possibly the same one). In addition to this, a Markov chain also has an initial state vector of order N x 1, which gives the probability of starting in each of the N states. Both are needed to represent a Markov chain.
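As a minimal sketch of this representation, the two-state example can be written with plain Python lists. The names `states`, `P`, `initial`, `is_stochastic`, and `step` are illustrative choices, and the initial vector (starting in A with certainty) is an assumption made only for the example. The vector is treated here as a row that multiplies the matrix from the left.

```python
import math

# Transition matrix of the two-state example (rows: current state, columns: next state).
states = ["A", "E"]
P = [
    [0.6, 0.4],  # from A: stay in A, move to E
    [0.7, 0.3],  # from E: move to A, stay in E
]

# Assumed initial state vector: the chain starts in state A with certainty.
initial = [1.0, 0.0]


def is_stochastic(matrix):
    """Return True if the matrix is square, non-negative, and every row sums to 1."""
    n = len(matrix)
    return all(
        len(row) == n
        and all(p >= 0 for p in row)
        and math.isclose(sum(row), 1.0)
        for row in matrix
    )


def step(dist, matrix):
    """Advance a distribution over states by one step (row vector times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


print(is_stochastic(P))   # True
print(step(initial, P))   # [0.6, 0.4]
```

Starting in A, the distribution after one step is simply row A of the matrix, which matches the probabilities read off the diagram.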
N-step Transition Matrix: Now let us look at the N-step transition matrix, which gives the probability of a transition occurring over multiple steps. In simple words, what is the probability of moving from state A to state E in exactly N steps? There is a very simple way to calculate it: it is the entry (A, E) of the matrix obtained by raising the transition matrix to the power of N.
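The calculation can be sketched with a small hand-written matrix power. The helpers `mat_mul` and `mat_pow` are illustrative names, not part of any library; with NumPy installed, `numpy.linalg.matrix_power` would do the same job.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    inner = len(Y)
    cols = len(Y[0])
    return [
        [sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
        for row in X
    ]


def mat_pow(matrix, n):
    """Raise a square matrix to the non-negative integer power n."""
    size = len(matrix)
    # Start from the identity matrix, which is the 0-step transition matrix.
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


# Two-state example: index 0 is state A, index 1 is state E.
P = [
    [0.6, 0.4],
    [0.7, 0.3],
]

P2 = mat_pow(P, 2)
# Entry (A, E) of P^2: probability of being in E exactly two steps after starting in A.
print(round(P2[0][1], 4))  # 0.36
```

The value 0.36 is the sum over both two-step paths from A to E: A, A, E (0.6 x 0.4) and A, E, E (0.4 x 0.3).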
Types of Markov Chain :
Discrete-time Markov chains (DTMC): Here the index set T (the set of times t at which the state of the process is observed) is countable, or in other words, changes occur at specific time steps. Generally, the term “Markov chain” is used for DTMC.
Continuous-time Markov chains (CTMC): Here the index set T is a continuum, which means changes can occur at any point in time.
Properties of Markov Chain :
- A Markov chain is said to be Irreducible if every state can be reached from every other state in one or more steps.
- A state in a Markov chain is said to be Periodic if every return to it takes a number of steps that is a multiple of some integer larger than 1. The greatest common divisor of all the possible return path lengths is the period of that state.
- A state in a Markov chain is said to be Transient if, starting from it, there is a non-zero probability that the chain will never return to it; otherwise, it is Recurrent.
- A state in a Markov chain is called Absorbing if there is no possible way to leave that state. An absorbing state has no outgoing transitions to other states; its only transition is to itself, with probability 1.
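Two of these properties, irreducibility and absorbing states, can be checked directly from the transition matrix. The sketch below assumes a finite chain stored as a list of rows; the function names and the second matrix `Q` are hypothetical and exist only for illustration.

```python
import math


def absorbing_states(matrix):
    """Indices of states that return to themselves with probability 1."""
    return [i for i, row in enumerate(matrix) if math.isclose(row[i], 1.0)]


def reachable(matrix, start):
    """Set of state indices reachable from `start` in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        i = stack.pop()
        for j, p in enumerate(matrix[i]):
            if p > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def is_irreducible(matrix):
    """True if every state can be reached from every other state."""
    n = len(matrix)
    return all(len(reachable(matrix, i)) == n for i in range(n))


# The two-state example (index 0 is A, index 1 is E).
P = [[0.6, 0.4], [0.7, 0.3]]
# A made-up chain whose second state is absorbing.
Q = [[0.5, 0.5], [0.0, 1.0]]

print(is_irreducible(P), absorbing_states(P))  # True []
print(is_irreducible(Q), absorbing_states(Q))  # False [1]
```

The reachability search only looks at which transitions have non-zero probability, which is all that irreducibility depends on.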
Markov Chain in Python :
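A minimal sketch of simulating the two-state chain described above, using only the standard library. The dictionary layout, the function name `simulate`, the seed, and the number of steps are assumptions chosen for illustration.

```python
import random

# Transition probabilities of the two-state example, keyed by current state.
P = {
    "A": {"A": 0.6, "E": 0.4},
    "E": {"A": 0.7, "E": 0.3},
}


def simulate(transitions, start, n_steps, rng):
    """Generate a path of n_steps transitions beginning at `start`."""
    path = [start]
    current = start
    for _ in range(n_steps):
        next_states = list(transitions[current])
        weights = list(transitions[current].values())
        # The next state depends only on the current one (Markov property).
        current = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(current)
    return path


rng = random.Random(0)  # fixed seed so that repeated runs give the same path
path = simulate(P, "A", 10, rng)
print(" -> ".join(path))
```

Over a long run, the fraction of time the simulated path spends in each state can be compared with the probabilities in the transition matrix as a sanity check.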
Application of Markov Chain :
Markov chains make the study of many real-world processes much simpler and easier to understand. Using a Markov chain, we can derive useful results such as the stationary distribution.
- MCMC (Markov Chain Monte Carlo), which makes it possible to sample from a distribution whose normalization factor is intractable, is based on Markov chains.
- Markov chains are used in information theory, search engines, speech recognition, etc.
- Markov chains are widely used in data science, so readers interested in the field will benefit from learning them well.
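The stationary distribution mentioned above is a distribution over states that does not change when the chain takes one more step. As a sketch, it can be approximated for the two-state example by repeatedly applying the transition matrix until the distribution stops changing. The function name, tolerance, and iteration cap are illustrative assumptions, and this simple iteration is only expected to converge for chains that are irreducible and aperiodic, like this one.

```python
def stationary_distribution(matrix, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution by repeated one-step updates."""
    n = len(matrix)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    raise RuntimeError("distribution did not converge")


# Two-state example: index 0 is state A, index 1 is state E.
P = [[0.6, 0.4], [0.7, 0.3]]

pi = stationary_distribution(P)
print([round(p, 4) for p in pi])  # [0.6364, 0.3636]
```

For this chain the exact answer is 7/11 for A and 4/11 for E, which follows from solving pi = pi P together with the requirement that the entries sum to 1.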
Assumptions for Markov Chain :
- The statistical system contains a finite number of states.
- The states are mutually exclusive and collectively exhaustive.
- The transition probability from one state to another state is constant over time.
Markov processes are fairly common in real-life problems, and Markov chains are easy to implement because of their memorylessness property. Using a Markov chain can simplify a problem considerably, with little loss of accuracy when the process being modeled is close to memoryless.
Let us take an example to understand the advantage of this tool. Suppose my friend suggests having a meal. I may say that I do not want a pizza because I had one an hour ago. But would it be appropriate to say that I do not want a pizza because I had one two months ago? In this case, my choice of meal depends entirely on my immediately preceding meal. This is where the Markov chain is effective.