Biopython – Entrez Database Connection
NCBI provides an online search system named Entrez. It gives access to a wide range of molecular biology databases and offers an integrated global query system that supports Boolean operators and field searches. A global query returns results from all databases, including information such as the number of hits and links to the originating database for each one.
Biopython has an Entrez-specific module named Bio.Entrez for this purpose. The module parses the XML returned by the Entrez search system and presents it as Python dictionaries and lists. The steps to connect to the database are listed below:
- Import the required modules.
- Set the email parameter so that NCBI can identify who is connecting.
- Set the Entrez tool parameter; it is "biopython" by default.
- Call the einfo() function to request information about the available databases.
- einfo() returns a handle to the response rather than the data itself.
- The response is in XML format, so pass the handle to the read() function to convert it into Python objects.
- The resulting record behaves like a dictionary with a single key, DbList.
- Accessing the DbList key returns the list of database names.
The resulting program should look like the code given below:
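A minimal sketch of these steps, assuming Biopython is installed and the machine can reach NCBI. The function names `fetch_einfo_record` and `database_names` are illustrative choices, not part of Biopython, and the email address in the usage note is a placeholder.

```python
def fetch_einfo_record(email, tool="biopython"):
    """Query NCBI's einfo service and return the parsed record.

    Requires Biopython and network access.
    """
    # Imported inside the function so that database_names() below
    # can still be used on a machine without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # tells NCBI who is connecting
    Entrez.tool = tool    # name of the client software making the request

    handle = Entrez.einfo()  # handle to the XML response
    try:
        # Parse the XML into dictionary- and list-like Python objects.
        return Entrez.read(handle)
    finally:
        handle.close()


def database_names(record):
    """Return the database names stored under the record's DbList key."""
    return list(record["DbList"])
```

Calling `record = fetch_einfo_record("your.name@example.com")` with your own address and then `database_names(record)` should give the names of the databases Entrez currently exposes; the exact list depends on what NCBI offers at the time of the request.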