Skip to content
Related Articles

Related Articles

Scraping Covid-19 statistics using BeautifulSoup
  • Difficulty Level : Easy
  • Last Updated : 29 Dec, 2020

Coronavirus, one of the biggest pandemic has brought all the world to Danger. Along with this, it is one of the trending News, everyone has this day. In this article, we will be scraping data and printing Covid-19 statistics in human-readable form. The data will be scraped from this website
Prerequisites:
 

  • The libraries ‘requests’, ‘bs4’, and ‘texttable’ have to be installed –
     
pip install bs4
pip install requests
pip install texttable

Project : Let’s head over to code, create a file called run.py. 

Python3




# importing modules
import requests
from bs4 import BeautifulSoup
 
# URL for scrapping data
 
# get URL html
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
 
data = []
 
# soup.find_all('td') will scrape every
# element in the url's table
data_iterator = iter(soup.find_all('td'))
 
# data_iterator is the iterator of the table
# This loop will keep repeating till there is
# data available in the iterator
while True:
    try:
        country = next(data_iterator).text
        confirmed = next(data_iterator).text
        deaths = next(data_iterator).text
        continent = next(data_iterator).text
 
        # For 'confirmed' and 'deaths',
        # make sure to remove the commas
        # and convert to int
        data.append((
            country,
            int(confirmed.replace(', ', '')),
            int(deaths.replace(', ', '')),
            continent
        ))
 
    # StopIteration error is raised when
    # there are no more elements left to
    # iterate through
    except StopIteration:
        break
 
# Sort the data by the number of confirmed cases
data.sort(key = lambda row: row[1], reverse = True)

To print the data in human-readable format, we will use the library ‘texttable
 

Python3






# create texttable object
import texttable as tt
table = tt.Texttable()
 
# Add an empty row at the beginning for the headers
table.add_rows([(None, None, None, None)] + data)
 
# 'l' denotes left, 'c' denotes center,
# and 'r' denotes right
table.set_cols_align(('c', 'c', 'c', 'c')) 
table.header((' Country ', ' Number of cases ', ' Deaths ', ' Continent '))
 
print(table.draw())

Output:
 

+---------------------------+-------------------+----------+-------------------+
|          Country          |  Number of cases  |  Deaths  |     Continent     |
+===========================+===================+==========+===================+
|       United States       |      644348       |  28554   |   North America   |
+---------------------------+-------------------+----------+-------------------+
|           Spain           |      180659       |  18812   |      Europe       |
+---------------------------+-------------------+----------+-------------------+
|           Italy           |      165155       |  21645   |      Europe       |
+---------------------------+-------------------+----------+-------------------+
...


NOTE: The output depends on the current statistics
Stay home, stay safe!
 

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning – Basic Level Course

My Personal Notes arrow_drop_up
Recommended Articles
Page :