Let us see how to read a PDF aloud, that is, how to convert the text of a PDF file into audio.
Packages Used:
- pyttsx3: A Python library for text-to-speech conversion. It lets the machine speak text aloud to us.
- PyPDF2: A pure-Python library built as a PDF toolkit. It will help us extract the text from the PDF. It can also extract document information, split documents page by page, merge documents page by page, etc.
Both of these modules need to be installed:
pip install pyttsx3
pip install PyPDF2
You also need to know about the open() function, which will help us open the PDF in binary read mode ('rb'). Knowledge of OOP concepts is also recommended.
Here is the link to the PDF that is read in the example: https://drive.google.com/file/d/1zhf7-_v6CVUtgd_XMK562mg6ciewi1QR/view?usp=sharing
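As a small illustration of opening a file in binary read mode, the sketch below checks whether a file starts with the `%PDF-` signature that PDF files normally begin with. The helper name `looks_like_pdf` is made up for this example and is not part of PyPDF2 or pyttsx3.

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    # 'rb' opens the file for reading in binary mode, which is what
    # PDF libraries expect; the with-block closes the file afterwards.
    with open(path, "rb") as pdf_file:
        header = pdf_file.read(5)
    return header == b"%PDF-"
```

A check like this is optional, but it can catch a wrong file path or a non-PDF file before the PDF library raises a less obvious error.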
Approach:
- Import the PyPDF2 and pyttsx3 modules.
- Open the PDF file.
- Use PdfFileReader() to read the PDF. We just have to pass the opened file object as the argument.
- Use the getPage() method to select the page to be read. Page numbers start at 0.
- Extract the text from the page using extractText().
- Instantiate a pyttsx3 engine with init().
- Use the say() and runAndWait() methods to speak out the text.
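Text extracted from a PDF often keeps the line breaks of the page layout, which can make the spoken result sound choppy. As an optional step between extracting the text and calling say(), a minimal sketch of a cleanup helper might look like this. The function name `clean_for_speech` and the cleanup rules are assumptions for illustration, not something either library provides.

```python
import re


def clean_for_speech(text):
    """Tidy text extracted from a PDF before passing it to a speech engine."""
    # Rejoin words that were hyphenated across a line break:
    # "exam-\nple" becomes "example".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining line breaks and runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Note that the first rule also merges genuinely hyphenated words that happen to break across lines, so it is a rough heuristic rather than a complete solution.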
Here is the code:
Python3
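import PyPDF2
import pyttsx3

# Note: PdfFileReader, getPage() and extractText() are the PyPDF2 1.x names;
# PyPDF2 3.0 replaced them with PdfReader, reader.pages[i] and extract_text().

# Open the PDF in binary read mode; the with-block closes it afterwards.
with open('file.pdf', 'rb') as pdf_file:
    pdfReader = PyPDF2.PdfFileReader(pdf_file)
    # Page indices start at 0, so 24 selects the 25th page.
    from_page = pdfReader.getPage(24)
    text = from_page.extractText()

# Create the speech engine, queue the text, and speak it.
speak = pyttsx3.init()
speak.say(text)
speak.runAndWait()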
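PyPDF2 3.0 removed PdfFileReader(), getPage() and extractText() in favour of PdfReader(), the pages list and extract_text(). For readers on a newer release, the same approach could be sketched as follows. The function names `to_zero_based` and `read_page_aloud` are hypothetical helpers introduced for this example, and taking a 1-based page number is a design choice of the sketch, not of the library.

```python
def to_zero_based(page_number, num_pages):
    """Convert a 1-based page number to the 0-based index PDF readers use."""
    if not 1 <= page_number <= num_pages:
        raise ValueError(f"page {page_number} is outside 1..{num_pages}")
    return page_number - 1


def read_page_aloud(pdf_path, page_number):
    """Speak one page of a PDF (hypothetical helper, PyPDF2 3.x API)."""
    # Imported here so to_zero_based() can be used without these packages.
    import PyPDF2
    import pyttsx3

    with open(pdf_path, "rb") as pdf_file:
        reader = PyPDF2.PdfReader(pdf_file)        # replaces PdfFileReader
        index = to_zero_based(page_number, len(reader.pages))
        text = reader.pages[index].extract_text()  # replaces getPage/extractText

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

With this sketch, calling `read_page_aloud('file.pdf', 25)` would select the same page as `getPage(24)` in the code above, and an out-of-range page number raises a ValueError before any audio is produced.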
Output: The text of the selected page is spoken aloud.
Last Updated: 15 Jul, 2021