Web scraping is a technique for extracting data from websites. Many websites do not offer a way to save their data for personal use, and copying it manually is both tedious and time-consuming. Web scraping automates this extraction process. In this article, we will discuss how to download all images from a web page using Python.
The following modules are needed:
- bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library and must be installed separately.
- requests: Requests lets you send HTTP/1.1 requests with very little code. It is also not part of the standard library and must be installed separately.
- os: The os module provides a portable way of using operating-system-dependent functionality, such as creating folders. It is part of Python's standard library.
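Because bs4 and requests are third-party packages, it can help to confirm they are available before running a scraper. The snippet below is a minimal sketch of one way to check, using the standard library's `importlib.util.find_spec`; the helper name `missing_modules` is hypothetical and not part of any library. The two packages are usually installed with `pip install beautifulsoup4 requests`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["bs4", "requests", "os"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```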
The approach is as follows:
- Import the required modules
- Get the HTML code of the web page
- Get the list of img tags from the HTML code using the find_all method in Beautiful Soup (findAll is an older alias of the same method).
images = soup.find_all('img')
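As an illustration of this step, the sketch below parses a small HTML string and collects its img tags. The HTML content is made up for the example and stands in for a page that would normally be downloaded first. It also reads each tag's `src` attribute and skips tags that lack one.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
html = """
<html><body>
  <img src="/static/logo.png" alt="Logo">
  <p>Some text</p>
  <img src="https://example.com/photo.jpg">
  <img alt="no source attribute">
</body></html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# find_all returns every <img> tag in document order.
images = soup.find_all("img")

# Tag.get returns None when the attribute is missing, so those tags are skipped.
sources = [img.get("src") for img in images if img.get("src")]

print(len(images))
print(sources)
```

Using `img.get("src")` rather than `img["src"]` avoids a `KeyError` on tags without a `src` attribute.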
- Create a separate folder for the downloaded images using the mkdir method in os.
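A possible sketch of the folder step is shown below. Note that `os.mkdir` raises `FileExistsError` when the folder already exists, so this example swaps in `os.makedirs` with `exist_ok=True` to make repeated runs safe. The function name `make_image_folder` and the `parent` parameter are assumptions made for illustration.

```python
import os


def make_image_folder(folder_name, parent="."):
    """Create folder_name inside parent (if needed) and return its path."""
    path = os.path.join(parent, folder_name)
    # exist_ok=True avoids FileExistsError if the folder is already there.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `make_image_folder("images")` would create an `images` folder in the current working directory.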
- Iterate through all the images and get the source URL of each one.
- After getting the source URL, the last step is to download the image:
- Fetch the content of the image
r = requests.get(source_url).content
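In practice, the `src` value of an img tag is often a relative path, which `requests.get` cannot fetch directly. The sketch below shows one way to resolve it against the page URL with `urllib.parse.urljoin` before fetching the bytes. The function names, the 10-second timeout, and the example URLs are assumptions chosen for illustration.

```python
from urllib.parse import urljoin

import requests


def resolve_image_url(page_url, src):
    """Turn a possibly relative src value into an absolute URL."""
    return urljoin(page_url, src)


def fetch_image_bytes(image_url, timeout=10):
    """Download the image and return its raw bytes."""
    response = requests.get(image_url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()
    return response.content


if __name__ == "__main__":
    # Hypothetical page and src values, for illustration only.
    page_url = "https://example.com/gallery/index.html"
    print(resolve_image_url(page_url, "/static/logo.png"))
    print(resolve_image_url(page_url, "thumbs/cat.jpg"))
```

Checking the status with `raise_for_status()` is a design choice here: without it, the body of an error response would be written to disk as if it were an image.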
- Save the image using file handling
# Enter the file name with an extension such as .jpg or .png.
with open("File Name", "wb+") as f:
    f.write(r)
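Rather than typing a file name by hand for every image, the name can be derived from the image URL. The following is a rough sketch under a few assumptions: the helper names are hypothetical, and the fallback name `image_<index>.jpg` (including the `.jpg` extension) is an arbitrary choice for URLs whose path has no usable file name.

```python
import os
from urllib.parse import urlparse


def filename_from_url(image_url, index):
    """Pick a file name from the URL path, falling back to a numbered name."""
    # urlparse(...).path drops the query string, e.g. "?size=2".
    name = os.path.basename(urlparse(image_url).path)
    # Assumed fallback when the path has no file name or no extension.
    if not name or "." not in name:
        name = f"image_{index}.jpg"
    return name


def save_image(content, folder, image_url, index):
    """Write the image bytes into folder and return the saved path."""
    path = os.path.join(folder, filename_from_url(image_url, index))
    # "wb" opens the file for binary writing, matching the bytes content.
    with open(path, "wb") as f:
        f.write(content)
    return path
```

Here `content` is expected to be the bytes obtained from the response's `.content` attribute, and `index` is the position of the image in the loop.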