Biopython – Entrez Database Search Operation

  • Last Updated : 03 Jan, 2021

NCBI provides an online search system named Entrez. It gives access to a wide range of molecular biology databases through an integrated global query system that supports boolean operators and field-restricted searches. A global query returns results from every database, including the number of hits in each database and links back to the originating database.
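As a minimal sketch of the boolean operators and field search mentioned above, the snippet below assembles a search term from field-tagged clauses. The helper names field(), all_of() and any_of() are hypothetical and not part of Biopython; the [Organism] and [Gene] tags are used only as illustrative Entrez field tags, and the fields actually available depend on the database being searched.

```python
def field(value, tag):
    """Restrict a value to one Entrez search field, e.g. BRCA1[Gene]."""
    # Quote multi-word values so they are searched as a phrase.
    text = f'"{value}"' if " " in value else value
    return f"{text}[{tag}]"


def all_of(*clauses):
    """Join clauses with the boolean AND operator."""
    return " AND ".join(clauses)


def any_of(*clauses):
    """Join clauses with the boolean OR operator, grouped in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


# Hypothetical query: human records mentioning either of two genes.
term = all_of(
    field("Homo sapiens", "Organism"),
    any_of(field("BRCA1", "Gene"), field("BRCA2", "Gene")),
)
print(term)
```

The resulting string can be passed as the term argument of the search methods described in the next section.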

Functions used

Biopython's Bio.Entrez module provides two methods for searching these databases:
  • Biopython has an Entrez-specific method named esearch() to search any one of the Entrez databases. It accepts two parameters: db, the name of the database, and term, the query to search for. If an invalid database name is passed, an error is raised.

Syntax:

Bio.Entrez.esearch(db, term)

  • To search for a query across all the databases, the egquery() method is used. It is similar to Entrez.esearch() except that it takes only the term parameter and no database parameter.

Syntax:

Bio.Entrez.egquery(term)
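Because esearch() needs a valid database name, it can help to check the name first. The sketch below uses Entrez.einfo(), which reports the available databases under the "DbList" key when called without arguments. The function names, the placeholder email address and the entrez parameter (a hook for substituting a stand-in object during testing) are assumptions made for this illustration.

```python
def list_databases(entrez=None, email="you@example.org"):
    """Return the database names reported by Entrez.einfo()."""
    if entrez is None:
        # Imported lazily so the helper can also run against a stand-in object.
        from Bio import Entrez as entrez
    entrez.email = email  # placeholder address; replace with your own
    handle = entrez.einfo()
    try:
        record = entrez.read(handle)
    finally:
        handle.close()
    return list(record["DbList"])


def check_database(db, known):
    """Raise ValueError early if db is not among the known database names."""
    if db not in known:
        raise ValueError(f"unknown Entrez database: {db!r}")
    return db
```

A call such as check_database("nucleotide", list_databases()) performs one network request and returns the name unchanged if it is valid.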

Approach

  • Import the required modules.
  • Set Entrez.email so that NCBI can identify who is making the requests.
  • Optionally set the Entrez.tool parameter; it defaults to "biopython".
  • Call one of the methods described above with the appropriate parameters.
  • The data is returned in XML format; use the Entrez.read() method to parse it into a Python object.
  • Read the information from the parsed record.
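The steps above can be sketched as one small function. This is an illustration under stated assumptions rather than a definitive implementation: search_ids() is a hypothetical name, the email address is a placeholder, retmax is passed through to limit how many IDs are returned, and the entrez parameter exists only so a stand-in object can replace Bio.Entrez in tests.

```python
def search_ids(term, db="nucleotide", retmax=5,
               email="you@example.org", entrez=None):
    """Run esearch and return (total hit count, list of returned IDs)."""
    if entrez is None:
        # Imported lazily; requires Biopython and network access when used.
        from Bio import Entrez as entrez
    entrez.email = email  # placeholder address; replace with your own
    handle = entrez.esearch(db=db, term=term, retmax=retmax)
    try:
        record = entrez.read(handle)  # parse the XML response
    finally:
        handle.close()  # always release the connection
    return int(record["Count"]), list(record["IdList"])
```

For example, count, ids = search_ids("genome") queries the nucleotide database and keeps at most five IDs.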

Implementation using both methods is given below:

Example 1: Using esearch()

Python3




# Import libraries
from Bio import Entrez
  
# Setting email
Entrez.email = 'jeetesh1@yopmail.com'
  
# Setting Entrez tool parameter
Entrez.tool = 'Demoscript'
  
# Search the nucleotide database for the term "genome"
info = Entrez.esearch(db="nucleotide", term="genome")
  
# Parse the XML response into a Python object, then close the handle
record = Entrez.read(info)
info.close()
  
# Showing records
print(record)

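Output:

The printed record is a dictionary-like object whose keys include Count, RetMax, RetStart, IdList and QueryTranslation.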

Example 2: Using egquery()

Python3




# Import libraries
from Bio import Entrez
  
# Setting email
Entrez.email = 'jeetesh1@yopmail.com'
  
# Setting Entrez tool parameter
Entrez.tool = 'Demoscript'
  
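# Search for the term "genome" across all Entrez databases
info = Entrez.egquery(term="genome")
  
# Parse the XML response into a Python object, then close the handle
record = Entrez.read(info)
info.close()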
for row in record["eGQueryResult"]:
    print(row["DbName"], row["Count"])

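Output:

One line is printed per Entrez database, giving the database name followed by its hit count for the term.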
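As a possible follow-up to Example 2, the rows of record["eGQueryResult"] can be ranked by hit count. The sketch below is hypothetical: rank_databases() is not a Biopython function, it relies only on the "DbName" and "Count" keys used in the example, it skips rows whose count is not a plain number, and the sample rows are made-up values standing in for a real response.

```python
def rank_databases(rows, top=5):
    """Return up to `top` (database, count) pairs, largest count first."""
    hits = []
    for row in rows:
        count = str(row["Count"])
        # Skip rows where the count is not numeric (for example an error marker).
        if count.isdigit():
            hits.append((row["DbName"], int(count)))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top]


# Made-up rows for illustration only; real values come from Entrez.read().
sample = [
    {"DbName": "pubmed", "Count": "120"},
    {"DbName": "nuccore", "Count": "4500"},
    {"DbName": "example", "Count": "Error"},
]

for name, count in rank_databases(sample):
    print(name, count)
```

With a live result, rank_databases(record["eGQueryResult"]) would be called in place of the sample list.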



