In this article, we will discuss how to scrape video URLs from a web page using Python. For web scraping, we will use the requests and BeautifulSoup modules. The requests library is a widely used third-party package (it is not part of the standard library) for making HTTP requests to a specified URL. Whether you work with REST APIs or web scraping, requests is worth learning before going further with these technologies. When you make a request to a URI, it returns a response, and requests provides built-in functionality for managing both the request and the response.
pip install requests
Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping.
pip install bs4
Let's understand the step-by-step implementation:
- Import the required modules.
Python3
import requests
from bs4 import BeautifulSoup
|
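- Request the web page and parse the returned HTML.
Python3
# Replace the placeholder with the URL of the page to scrape.
Web_url = "Enter WEB URL"
r = requests.get(Web_url)
# html.parser ships with Python, so no extra parser install is needed.
soup = BeautifulSoup(r.content, 'html.parser')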
|
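A request can hang or come back with an error page, in which case the parser would silently receive the wrong HTML. The sketch below is one way to guard against that before parsing. The helper name fetch_soup and the 10-second timeout are assumptions made for illustration, not part of the original steps.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Download `url` and return the parsed page (hypothetical helper)."""
    # Without a timeout, requests.get() may wait indefinitely.
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses instead of
    # letting an error page reach the parser.
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")


# Example call (the URL is a placeholder):
# soup = fetch_soup("https://example.com/page-with-video")
```

The returned object can be used exactly like the soup variable in the step above.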
- Count how many videos there are on the web page. In HTML, the video tag is used to display video.
Python3
# find_all is the current name; findAll is a deprecated alias.
video_tags = soup.find_all('video')
print("Total", len(video_tags), "videos found")
|
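To see the counting step without a network call, the same lookup can be run on an inline HTML string. The markup and file names below are made up for illustration. They show the two common ways a video tag carries its URL: a src attribute on the tag itself, or a nested source tag.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; the file names are placeholders.
sample_html = """
<h1>Lecture recordings</h1>
<video src="lecture1.mp4" controls></video>
<video controls>
  <source src="lecture2.webm" type="video/webm">
</video>
<img src="thumbnail.png">
"""

soup = BeautifulSoup(sample_html, "html.parser")
# find_all returns a list-like ResultSet, so len() gives the count.
video_tags = soup.find_all("video")
video_count = len(video_tags)
print(f"Total {video_count} videos found")  # Total 2 videos found
```

Only the two video tags are counted; the img tag is ignored because it does not match the tag name.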
- Iterate through all the video tags and fetch each video URL.
Python3
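for video_tag in video_tags:
    # The URL is read from the <a> link nested inside the <video> tag.
    link = video_tag.find("a")
    # Skip tags without such a link instead of failing on None.
    if link is not None and link.has_attr('href'):
        video_url = link['href']
        print(video_url)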
|
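The loop above reads the URL from a link nested inside the video tag. On many pages the URL lives in a src attribute on the video tag or on a nested source tag instead, and it may be relative. The following is a rough sketch that checks all three places and resolves relative paths with urljoin from the standard library. The function name extract_video_urls, the sample markup, and the base URL are all assumptions made for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_video_urls(html, base_url):
    """Collect candidate video URLs from every <video> tag in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for video_tag in soup.find_all("video"):
        candidates = []
        # Case 1: the URL sits directly on the <video> tag.
        if video_tag.get("src"):
            candidates.append(video_tag["src"])
        # Case 2: one or more <source> children carry the URL.
        for source_tag in video_tag.find_all("source"):
            if source_tag.get("src"):
                candidates.append(source_tag["src"])
        # Case 3: a fallback <a href="..."> link inside the tag,
        # which is the case the article's loop handles.
        for link_tag in video_tag.find_all("a", href=True):
            candidates.append(link_tag["href"])
        for candidate in candidates:
            # Turn relative paths into absolute URLs and skip duplicates.
            absolute = urljoin(base_url, candidate)
            if absolute not in urls:
                urls.append(absolute)
    return urls


# Hypothetical sample markup and base URL, used only for illustration.
sample_html = """
<video src="/media/intro.mp4" controls></video>
<video controls>
  <source src="clips/demo.webm" type="video/webm">
  <a href="https://example.com/clips/demo.webm">Download</a>
</video>
"""
for url in extract_video_urls(sample_html, "https://example.com/page/"):
    print(url)
```

Note that this only sees video tags present in the downloaded HTML. Videos that a page injects later with JavaScript will not appear in the result.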
Below is the complete implementation:
Python3
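import requests
from bs4 import BeautifulSoup

# Replace the placeholder with the URL of the page to scrape.
Web_url = "Enter WEB URL"
r = requests.get(Web_url)
soup = BeautifulSoup(r.content, 'html.parser')

video_tags = soup.find_all('video')
print("Total", len(video_tags), "videos found")

if len(video_tags) != 0:
    for video_tag in video_tags:
        # The URL is read from the <a> link nested inside the <video> tag.
        link = video_tag.find("a")
        # Skip tags without such a link instead of failing on None.
        if link is not None and link.has_attr('href'):
            video_url = link['href']
            print(video_url)
else:
    print("no videos found")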
|
Output:
Total 1 videos found
https://media.geeksforgeeks.org/wp-content/uploads/15.webm