Scraping Television Rating Point using Python
Last Updated: 29 Dec, 2020
In this article, we will write Python scripts to scrape TRP (Television Rating Point) data from BARC. TRP represents how many people watched which channels, and for how long, during a particular period. It is used to judge which television programs are viewed most.
Modules needed:
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It does not come built in with Python. To install it, run the command below in the terminal.
pip install bs4
- requests: Requests lets you send HTTP/1.1 requests extremely easily. It also does not come built in with Python. To install it, run the command below in the terminal.
pip install requests
Let's walk through the script step by step.
Step 1: Import all dependencies
Python3
import requests
from bs4 import BeautifulSoup
Step 2: Create a function that fetches a URL
Python3
def getdata(url):
    # Download the page and return its body as text
    r = requests.get(url)
    return r.text
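The getdata() function above returns whatever the server sends back, even an error page. As a rough sketch of a more defensive variant, the version below adds a timeout and a status check. The name getdata_checked and the 10-second default are assumptions made for illustration, not part of the original script.

```python
import requests


def getdata_checked(url, timeout=10):
    """Hypothetical variant of getdata(): return the page text, or None on failure."""
    try:
        # timeout stops the call from waiting forever on a slow server
        r = requests.get(url, timeout=timeout)
        # raise_for_status() raises an HTTPError for 4xx and 5xx responses
        r.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # RequestException is the base class for errors raised by requests
        print(f"Request failed: {exc}")
        return None
    return r.text
```

Returning None keeps the calling code simple, but the caller then has to check for it before passing the result to BeautifulSoup. Letting the exception propagate is an equally reasonable choice.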
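Step 3: Now pass the URL into the getdata() function and parse the returned HTML
Python3
# url must hold the address of the BARC page that lists the ratings
htmldata = getdata(url)
soup = BeautifulSoup(htmldata, 'html.parser')
data = ''
# Collect the text of every table body on the page
for i in soup.find_all('tbody'):
    data = data + i.get_text()
print(data)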
Note: This script returns only raw data as a single string; you will need to format and print it to suit your needs.
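One possible way to get something more structured than the raw string from Step 3 is to walk the table rows and cells instead of calling get_text() on the whole tbody. The sketch below runs on a small made-up HTML sample, so the channel names and numbers are placeholders and the real page layout may differ. The helper name table_rows is also an assumption.

```python
from bs4 import BeautifulSoup

# Made-up sample markup standing in for the downloaded page (not real BARC data)
sample_html = """
<table>
  <tbody>
    <tr><td>1</td><td>Channel A</td><td>1000</td></tr>
    <tr><td>2</td><td>Channel B</td><td>900</td></tr>
  </tbody>
</table>
"""


def table_rows(html):
    """Return each table row as a list of cell strings."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for tbody in soup.find_all('tbody'):
        for tr in tbody.find_all('tr'):
            # strip=True trims the whitespace around each cell's text
            cells = [td.get_text(strip=True) for td in tr.find_all('td')]
            if cells:
                rows.append(cells)
    return rows


rows = table_rows(sample_html)
for row in rows:
    print(row)
```

Each row becomes a list of strings, which is easier to print in columns or write to a CSV file than one long string.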
Step 4: Now clean up the data.
Python3
# Remove the tab characters left over from the table markup
data = ''.join(filter(lambda ch: ch != '\t', data))
print(data)
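Text scraped from a table usually carries blank lines and stray indentation as well as tabs. As an alternative way to tidy the string in Step 4, the sketch below strips every line and drops the empty ones. The function name clean_text and the sample string are assumptions for illustration.

```python
def clean_text(raw):
    """Strip surrounding whitespace from each line and drop blank lines."""
    lines = (line.strip() for line in raw.splitlines())
    return '\n'.join(line for line in lines if line)


# Made-up sample resembling text pulled out of a table
raw = "\n\t1\n\tChannel A\n\n\t1000\t\n"
print(clean_text(raw))
```

This leaves one value per line, which can then be grouped into rows if the number of columns is known.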