Scraping Television Rating Point using Python
In this article, we are going to write Python scripts to scrape TRP (Television Rating Point) data from BARC. TRP represents how many people watched which channels, and for how long, during a particular period. It is used to judge which television programs are viewed the most.
Modules needed:
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in with Python. To install it, type the command below in the terminal.
pip install bs4
- requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built-in with Python. To install it, type the command below in the terminal.
pip install requests
Let’s see the stepwise execution of the script.
Step 1: Import all dependencies
Python3
# import modules
import requests
from bs4 import BeautifulSoup
Step 2: Create a function to fetch a URL
Python3
# user-defined function
# fetch the page and return its HTML as text
def getdata(url):
    r = requests.get(url)
    return r.text
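A plain requests.get() call has no timeout by default, and it returns error pages (such as a 404 or 500 response) as if they were data. The following is a minimal sketch of a more defensive variant. The name getdata_safe, the 10-second timeout, and the choice to return None on failure are illustrative assumptions, not part of the original script.

```python
import requests


def getdata_safe(url, timeout=10):
    """Return the page body as text, or None if the request fails.

    Hypothetical helper: the name, the timeout value and the
    None-on-failure behaviour are assumptions for illustration.
    """
    try:
        r = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx responses into exceptions instead of returning
        # an error page as if it were data.
        r.raise_for_status()
    except requests.exceptions.RequestException:
        # Covers invalid URLs, connection errors, timeouts and HTTP errors.
        return None
    return r.text
```

A caller can then check the result for None before handing it to BeautifulSoup.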
Step 3: Now pass the URL into the getdata() function and parse the returned HTML with BeautifulSoup
Python3
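# url must hold the address of the BARC page to scrape
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')

# collect the text of every table body on the page
data = ''
for i in soup.find_all('tbody'):
    data = data + i.get_text()
print(data)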
Note: This script gives you only the raw data as a single string; you have to format and print it according to your needs.
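Calling get_text() on a whole tbody merges every cell into one string, so the row and column structure is lost. One way to keep it is to walk the tr and td elements individually, as in the sketch below. The sample HTML, channel names and figures are made up for illustration and do not reflect the real markup of the BARC page; table_rows is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the real page layout may differ.
sample_html = """
<table>
  <tbody>
    <tr><td>Channel A</td><td>1200</td></tr>
    <tr><td>Channel B</td><td>950</td></tr>
  </tbody>
</table>
"""


def table_rows(html):
    """Return each table-body row as a list of stripped cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tbody in soup.find_all("tbody"):
        for tr in tbody.find_all("tr"):
            # get_text(strip=True) trims whitespace around each cell's text.
            cells = [td.get_text(strip=True) for td in tr.find_all(["td", "th"])]
            if cells:
                rows.append(cells)
    return rows


rows = table_rows(sample_html)
for row in rows:
    print(row)
```

Each row is then an ordinary Python list, which is easier to print in a custom layout or write to a CSV file than one long string.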
Step 4: Now traverse the data and remove the tab characters.
Python3
# remove tab characters from the scraped text
data = ''.join(filter(lambda ch: ch != '\t', data))
print(data)
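Scraped table text usually contains blank lines and runs of spaces as well as tabs. The sketch below shows a more thorough cleanup using only built-in string methods. The function name clean_lines and the sample string are assumptions made for illustration.

```python
def clean_lines(raw):
    """Collapse whitespace inside each line and drop empty lines."""
    cleaned = []
    for line in raw.splitlines():
        # str.split() with no arguments splits on any run of whitespace,
        # so tabs and repeated spaces collapse to single spaces.
        text = " ".join(line.split())
        if text:
            cleaned.append(text)
    return cleaned


# Made-up sample standing in for the scraped string.
sample = "\n\tChannel A \t 1200\n\n  Channel B\t950  \n"
print(clean_lines(sample))
```

Applied to the scraped string, clean_lines(data) would return one entry per non-empty line of table text.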