Biopython – Pairwise Alignment
Pairwise sequence alignment compares two sequences at a time and reports the best possible alignment between them. It is computed with a dynamic programming algorithm. Biopython has a dedicated module, Bio.pairwise2, which finds alignments using the pairwise method. The module supports both global and local alignments.
Let’s take two simple, hypothetical sequences as an example of using the pairwise2 module.
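The listing for this step is not present in the text, so the following is a minimal sketch of what such an example could look like. The two sequences are made-up values chosen for illustration, and the snippet assumes a Biopython installation that still ships Bio.pairwise2.

```python
from Bio import pairwise2

# Hypothetical example sequences (illustrative values, not from the article).
seq1 = "ACCGGT"
seq2 = "ACGT"

# globalxx: global alignment, identical characters score 1 (otherwise 0),
# and gaps are not penalized.
alignments = pairwise2.align.globalxx(seq1, seq2)

# Each result is a named tuple with the fields seqA, seqB, score, start, end.
for alignment in alignments:
    print(alignment)
```

Several alignments are usually returned because more than one gap placement can reach the same best score.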
Here, the globalxx method does the main work. Its name follows the convention <alignment type>XX, where the alignment type is global or local and XX is a two-character code indicating the parameters the method takes. The first character indicates how matches and mismatches are scored, while the second indicates how gap penalties are specified.
Match parameters:
Code character    Description
x                 No parameters. Identical characters have a score of 1, else 0.
m                 Match score for identical characters, else mismatch score.
d                 A dictionary returning the score of any pair of characters.
c                 A callback function that returns scores.
Gap penalty parameters:
Code character    Description
x                 No gap penalties.
s                 Both sequences have the same open and extend gap penalties.
d                 The sequences have different open and extend gap penalties.
c                 A callback function that returns the gap penalties.
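To see how the two code characters combine, here is a small sketch that uses globalms (match/mismatch scores plus one shared pair of gap penalties) and localxx (local alignment with the default scoring). The sequences and the numeric scores are illustrative assumptions, not values from the article.

```python
from Bio import pairwise2

# Hypothetical example sequences and scoring values (assumptions for illustration).
seq1 = "ACCGGT"
seq2 = "ACGT"

# 'm': match = 2, mismatch = -1.
# 's': gap open = -0.5, gap extend = -0.1, identical for both sequences.
scored = pairwise2.align.globalms(seq1, seq2, 2, -1, -0.5, -0.1)

# Local alignment, 'x' + 'x': identity scoring and no gap penalties.
local = pairwise2.align.localxx(seq1, seq2)

print("globalms best score:", scored[0].score)
print("localxx best score:", local[0].score)
```

With gap penalties in place, an alignment that groups the two gap positions into one run scores higher than one that opens two separate gaps, which is the usual reason for choosing separate open and extend values.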
For a nicer printout, Bio.pairwise2 provides the format_alignment() function:
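A minimal sketch of this, reusing the same hypothetical sequences as before: each alignment tuple is unpacked into format_alignment(), which returns a printable string showing both sequences, the match line and the score.

```python
from Bio import pairwise2
from Bio.pairwise2 import format_alignment

# Hypothetical example sequences (illustrative values).
alignments = pairwise2.align.globalxx("ACCGGT", "ACGT")

# format_alignment takes the tuple fields (seqA, seqB, score, start, end),
# so each alignment is unpacked with *.
formatted = [format_alignment(*alignment) for alignment in alignments]

for text in formatted:
    print(text)
```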
Biopython provides another module for pairwise sequence alignment: Bio.Align, whose PairwiseAligner class serves this purpose. Recent Biopython releases deprecate Bio.pairwise2 in favor of this class. It exposes attributes for setting parameters such as the mode, match score and gap penalties, and it selects the alignment algorithm based on those settings. Below is a simple implementation of the method:
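The original listing is not present here, so this is a sketch under stated assumptions: the sequences are hypothetical, and the scores are set explicitly to mirror globalxx (match 1, mismatch 0, no gap penalties) rather than relying on the library defaults, which may differ between Biopython versions.

```python
from Bio import Align

# Configure the aligner explicitly instead of depending on default scores.
aligner = Align.PairwiseAligner()
aligner.mode = "global"          # use "local" for local alignment
aligner.match_score = 1.0
aligner.mismatch_score = 0.0
aligner.open_gap_score = 0.0
aligner.extend_gap_score = 0.0

# Hypothetical example sequences (illustrative values).
alignments = aligner.align("ACCGGT", "ACGT")
best = alignments[0]

print("Algorithm:", aligner.algorithm)  # chosen by the aligner from the settings
print("Score:", best.score)
print(best)
```

When only the score is needed, aligner.score(seqA, seqB) returns it without building the alignments.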