
Image Scraping with Python


Scraping is an essential skill for collecting data from websites. In this article, we will see how to scrape images from websites using Python, trying two different approaches.

Method 1: Using BeautifulSoup and Requests

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built in with Python. To install it, type the command below in the terminal.
pip install beautifulsoup4
  • requests: Requests allows you to send HTTP/1.1 requests extremely easily. This module also does not come built in with Python. To install it, type the command below in the terminal.
pip install requests

Approach:

  • Import the modules
  • Send a GET request to the URL with requests and read the response text
  • Pass the HTML into the BeautifulSoup() constructor
  • Use find_all() to find every ‘img’ tag and read its ‘src’ attribute

Implementation:

Python3




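import requests
from bs4 import BeautifulSoup


def getdata(url):
    # Fetch the page and return its HTML as text
    r = requests.get(url)
    return r.text


htmldata = getdata("https://www.geeksforgeeks.org/")
soup = BeautifulSoup(htmldata, 'html.parser')

# Print the src of every <img> tag that actually has a src attribute
for item in soup.find_all('img', src=True):
    print(item['src'])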

Output:

https://media.geeksforgeeks.org/wp-content/cdn-uploads/20201018234700/GFG-RT-DSA-Creative.png
https://media.geeksforgeeks.org/wp-content/cdn-uploads/logo-new-2.svg
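
The src values printed above happen to be absolute URLs, but on many pages they are relative paths or missing altogether. The following is a minimal sketch of one way to clean them up, assuming you have already collected the src values into a list. The function name absolute_image_urls and the example.com URLs are hypothetical and only for illustration; urljoin comes from the standard library.

```python
from urllib.parse import urljoin


def absolute_image_urls(src_values, page_url):
    """Resolve src values against the page URL and drop unusable ones."""
    urls = []
    for src in src_values:
        # Skip missing or empty src attributes
        if not src:
            continue
        # Skip inline images embedded as data: URIs
        if src.startswith("data:"):
            continue
        # urljoin leaves absolute URLs unchanged and resolves relative ones
        urls.append(urljoin(page_url, src))
    return urls


# Hypothetical sample input, not taken from a real page
sample = ["/logo.png", None, "img/photo.jpg", "https://cdn.example.com/a.svg"]
resolved = absolute_image_urls(sample, "https://example.com/blog/post")
```

With BeautifulSoup, the input list could be built with something like [item.get('src') for item in soup.find_all('img')], since get() returns None when the attribute is absent.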

Method 2: Using urllib and BeautifulSoup

urllib: This is a package in the Python standard library for opening and reading URLs. It ships with Python, so no installation is required.

Approach:

  • Import the modules
  • Read the URL with urlopen()
  • Pass the response into the BeautifulSoup() constructor
  • Use find_all() to find every ‘img’ tag and read its ‘src’ attribute

Implementation:

Python3




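from urllib.request import urlopen
from bs4 import BeautifulSoup

# Open the page; BeautifulSoup can read directly from the response object
htmldata = urlopen('https://www.geeksforgeeks.org/')
soup = BeautifulSoup(htmldata, 'html.parser')

# Collect every <img> tag that has a src attribute
images = soup.find_all('img', src=True)

for item in images:
    print(item['src'])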

Output:

https://media.geeksforgeeks.org/wp-content/cdn-uploads/20201018234700/GFG-RT-DSA-Creative.png
https://media.geeksforgeeks.org/wp-content/cdn-uploads/logo-new-2.svg
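
Both methods above stop at printing the image URLs. If you also want to save the files, one possible approach is sketched below using only the standard library. The helper names filename_from_url and download_image, the default folder name "images", and the fallback name "image" are assumptions made for this example, not part of the original article.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url, default="image"):
    """Take the last path segment of the URL as a file name."""
    # urlparse separates the path from the query string and fragment
    path = urlparse(url).path
    name = posixpath.basename(path)
    # Fall back to a default name when the path ends in a slash
    return name or default


def download_image(url, folder="images"):
    """Download one image into the folder and return the saved path."""
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, filename_from_url(url))
    # Read the raw bytes and write them out unchanged
    with urlopen(url) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

Calling download_image() on each URL printed in the output would save the files into the chosen folder. Note that two URLs ending in the same file name would overwrite each other in this sketch.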


Last Updated : 08 Sep, 2021