How to extract images from PDF in Python?

  • Difficulty Level : Easy
  • Last Updated : 03 Jan, 2021

In this article, we extract images from a PDF file in Python and save them, using the PyMuPDF library. First, install the PyMuPDF and Pillow libraries using pip.

pip install PyMuPDF Pillow

PyMuPDF is used to access PDF files. To extract images from a PDF file, follow the steps below:

  • Import the necessary libraries
  • Specify the path of the file from which you want to extract images and open it
  • Iterate through all the pages of the PDF and get the image objects present on each page
  • Use the get_images() method (named getImageList() in older PyMuPDF releases) to get all image objects on a page as a list of tuples
  • Use extract_image() (named extractImage() in older PyMuPDF releases) to get the image bytes along with additional information about the image
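
Each tuple in the per-page list starts with the image's XREF number, and an image that is reused on several pages is listed once for every page it appears on. As a minimal sketch of how repeated XREFs could be skipped so that each image is extracted only once, the helper below works on plain tuples. The name unique_xrefs and the sample data are hypothetical and only illustrate the idea; real tuples carry more fields.

```python
def unique_xrefs(image_lists):
    """Return image XREFs in first-seen order, skipping repeats.

    image_lists: one list per page, each holding tuples whose first
    item is the image XREF (as in the per-page image list).
    """
    seen = set()
    ordered = []
    for page_images in image_lists:
        for img in page_images:
            xref = img[0]
            if xref not in seen:
                seen.add(xref)
                ordered.append(xref)
    return ordered


# Hypothetical per-page lists, shortened for illustration
sample_pages = [
    [(12, 0, 640, 480), (15, 0, 100, 100)],
    [(12, 0, 640, 480)],  # same image reused on the second page
    [],                   # page without images
]
print(unique_xrefs(sample_pages))
```

With the real library, the per-page lists would come from the page objects instead of the hard-coded sample.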

Note: To download the PDF file click here.

Below is the implementation.

Python3




# STEP 1
# import libraries
import fitz  # PyMuPDF
import io
from PIL import Image

# STEP 2
# file path you want to extract images from
file = "/content/pdf_file.pdf"

# open the file
pdf_file = fitz.open(file)

# STEP 3
# iterate over PDF pages
for page_index in range(len(pdf_file)):

    # get the page itself
    page = pdf_file[page_index]

    # list of image tuples on this page
    # (older PyMuPDF releases name this method getImageList())
    image_list = page.get_images()

    # print the number of images found on this page
    if image_list:
        print(f"[+] Found a total of {len(image_list)} images in page {page_index}")
    else:
        print("[!] No images found on page", page_index)

    for image_index, img in enumerate(image_list, start=1):

        # get the XREF of the image
        xref = img[0]

        # extract the image bytes
        # (older PyMuPDF releases name this method extractImage())
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]

        # get the image extension
        image_ext = base_image["ext"]

        # load the bytes with Pillow and save the image to disk
        # (the file name pattern is only illustrative)
        image = Image.open(io.BytesIO(image_bytes))
        image.save(f"image{page_index + 1}_{image_index}.{image_ext}")
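
The dictionary returned for each image also describes the image itself. As a minimal sketch, assuming it exposes "width" and "height" keys (an assumption here, since the code above reads only "image" and "ext"), very small images such as icons could be skipped before they are processed further. The helper name is_large_enough and the 100-pixel thresholds are hypothetical.

```python
def is_large_enough(base_image, min_width=100, min_height=100):
    """Return True if the image meets both minimum dimensions.

    base_image: a dictionary of image details; the "width" and
    "height" keys are assumed, and a missing key counts as 0.
    """
    width = base_image.get("width", 0)
    height = base_image.get("height", 0)
    return width >= min_width and height >= min_height


# Hypothetical dictionaries standing in for extracted image details
photo = {"ext": "jpeg", "width": 640, "height": 480}
icon = {"ext": "png", "width": 16, "height": 16}

for candidate in (photo, icon):
    if is_large_enough(candidate):
        print("keep", candidate["ext"], candidate["width"], candidate["height"])
    else:
        print("skip", candidate["ext"], candidate["width"], candidate["height"])
```

Inside the extraction loop, such a check would sit right after the image details are fetched, followed by continue when it returns False.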

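Output: for each page, the script prints either the number of images found or a message that the page has no images.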

