An audiobook is simply a book recorded in an audio format; in other words, a book that is read aloud. Audiobooks help improve vocabulary, comprehension, and pronunciation. Perhaps some of you love books but are too lazy to read them yourselves. If so, it's time to create your own audiobook with a few lines of Python. That way you can enjoy audiobooks without the subscription fees charged by platforms like Audible and Scribd.
Prerequisite
Python has a huge number of modules containing reusable code that performs the desired functions when invoked. A module must be installed on your system before you can use it. Here we are going to use just two of the many Python modules available.
- Install PyPDF2 module, using
pip install PyPDF2
- Install pyttsx3 module, using
pip install pyttsx3
PyPDF2 module
It is a pure-Python library that runs on any Python platform without external dependencies. It can work entirely on in-memory StringIO objects rather than file streams, which allows PDF files to be manipulated in memory. The PyPDF2 module performs various operations on PDF files, including the following tasks:
- Fetching document information like title, author, etc.
- Splitting and merging documents page by page.
- Merging multiple pages into a single page.
- Cropping a page to the required ratio.
- Encryption and decryption of PDF files.
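As a rough illustration of the first task in the list above, the sketch below collects the title, author, and page count from a reader object. It assumes the PyPDF2 3.x names (PdfReader, metadata, pages); the helper name describe_pdf and the file name 'book.pdf' are hypothetical.

```python
def describe_pdf(reader):
    """Return basic document information from a PdfReader-like object."""
    info = reader.metadata  # may be None when the PDF carries no metadata
    return {
        "title": info.title if info is not None else None,
        "author": info.author if info is not None else None,
        "pages": len(reader.pages),
    }


def main():
    # Imported here so describe_pdf can be reused without PyPDF2 installed.
    from PyPDF2 import PdfReader

    with open("book.pdf", "rb") as pdf_file:  # hypothetical file name
        print(describe_pdf(PdfReader(pdf_file)))
```

Calling main() prints a small dictionary for the chosen file; the title and author may be None if the PDF does not store them.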
pyttsx3 module
It is a popular Python module that takes text as input and produces speech as output. The pyttsx3 module also works offline, which makes it more convenient than modules that need an internet connection. It is compatible with both Python 2 and Python 3.
Importing modules
These modules must be imported before their functionality can be used in the program. Use an import statement to import a module, like so:
import PyPDF2
import pyttsx3
Reading a PDF file
A PDF file must be opened before its contents can be manipulated. Python can perform various operations on a PDF file: reading and writing it, embedding an attachment, adding a bookmark, and so on. We can also retrieve useful information from a PDF file, such as the number of pages and the page layout in use, and we can fetch a single page by its page number. All of these operations are carried out by the PyPDF2 module.
To read a PDF file, use a command like
variable_name = PyPDF2.PdfReader(open('file_name','rb'))
where variable_name ---> the name of the variable.
PyPDF2 ---> module name.
PdfReader( ) ---> class under the PyPDF2 module (its older name, PdfFileReader, no longer works in PyPDF2 3.0 and later).
open( ) ---> function used for opening a file.
file_name ---> name of the file that needs to be opened.
rb ---> mode of the file (opens the file in binary format for reading).
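The opening step can also be wrapped in a with block so that the file is closed automatically. The sketch below is one possible way to do that, not the only one; count_pages and its reader_factory parameter are hypothetical names, and the default assumes the PyPDF2 3.x class name PdfReader.

```python
def count_pages(path, reader_factory=None):
    """Open a PDF in binary read mode and return how many pages it has."""
    if reader_factory is None:
        # Default to PyPDF2's reader, imported lazily (PyPDF2 3.x name).
        from PyPDF2 import PdfReader
        reader_factory = PdfReader
    with open(path, "rb") as pdf_file:
        reader = reader_factory(pdf_file)
        # Read what we need while the file is still open.
        return len(reader.pages)
```

For example, count_pages('file_name') would return the number of pages in that file.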
Initializing speaker
The speech engine must be initialized next so that we can convert text to audio using the pyttsx3 module. Use the command
speaker=pyttsx3.init()
to initialize the speaker.
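Once the speaker exists, its speed and loudness can be adjusted through pyttsx3's setProperty( ) method. The sketch below shows one way to do it; the helper name configure_speaker and the default values are only illustrative assumptions.

```python
def configure_speaker(engine, rate=150, volume=1.0):
    """Set the speaking rate (words per minute) and volume (0.0 to 1.0)."""
    engine.setProperty("rate", rate)
    engine.setProperty("volume", volume)
    return engine


# Typical use (requires pyttsx3):
#   import pyttsx3
#   speaker = configure_speaker(pyttsx3.init(), rate=170)
```

A lower rate usually makes long passages easier to follow.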
Extracting text
The first step is to get a page by its page number. Pages are numbered from 0, and the page number of the required page is used as an index into the reader's list of pages.
The command can be written as
text=readpdf.pages[pagenumber].extract_text()
where pages[pagenumber] ---> retrieves a page by its page number (older PyPDF2 versions used getPage(pagenumber)).
extract_text() ---> extracts the text from the specified page (older PyPDF2 versions used extractText()).
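A whole book has many pages, so the single-page command is usually repeated in a loop. The sketch below gathers the text of every page into one string. It assumes the PyPDF2 3.x names pages and extract_text(); the helper name extract_all_text is hypothetical.

```python
def extract_all_text(reader):
    """Join the text of every page of a PdfReader-like object."""
    parts = []
    for page in reader.pages:
        # Scanned or image-only pages may yield no text at all.
        page_text = (page.extract_text() or "").strip()
        if page_text:
            parts.append(page_text)
    return "\n".join(parts)
```

Pages without extractable text are skipped, so the speaker is not handed empty strings.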
Text to Audio
Now it's time to pass the extracted text as an argument to the say( ) method of the pyttsx3 speaker, which queues the text to be spoken. The command is as follows,
speaker.say(text)
where the text extracted by the previous command is passed as an argument to the say( ) method. The queued speech is actually played when speaker.runAndWait() is called afterwards.
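Note that say( ) only queues the text; pyttsx3 speaks it once runAndWait( ) is called. A minimal sketch of the two calls together follows, with the hypothetical helper name read_aloud.

```python
def read_aloud(engine, text):
    """Queue the text and block until the engine has finished speaking."""
    if not text.strip():
        return False  # nothing to say
    engine.say(text)
    engine.runAndWait()  # processes the queue, i.e. actually speaks
    return True
```

Here engine is the object returned by pyttsx3.init(), called speaker elsewhere in this post.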
Saving voice to a file
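The voice generated from the text can also be saved to an audio file. If only a file name is given, the file is saved in the current working directory, which is usually the folder the script is run from. Saving the audio file lets users listen to it again later. Note that the actual audio encoding depends on the speech driver of your platform, so the .mp3 extension alone does not guarantee MP3 data.
Command to save voice to a file is as follows,
speaker.save_to_file(text,'filename.mp3')
where speaker ---> the variable that is initialized already.
save_to_file( text, 'filename.mp3') ---> method used for saving an audio file; the file is written when speaker.runAndWait() is called afterwards.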
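Like say( ), save_to_file( ) only queues the work, so the file appears after runAndWait( ) has run. The sketch below pairs the two calls; the helper name save_audio is hypothetical, and the resulting audio format is assumed to depend on the platform's speech driver.

```python
def save_audio(engine, text, filename):
    """Queue a save request and wait until the engine has processed it."""
    engine.save_to_file(text, filename)
    engine.runAndWait()  # the file is written while the queue is processed
    return filename
```

For example, save_audio(speaker, text, 'filename.mp3') would write the audio next to the script when it is run from its own folder.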
Any bookaholics here? Comment your name and the book you would like to turn into an audiobook. It's time to build an audiobook of your own!
Code Snippet
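import PyPDF2
import pyttsx3
# PdfReader was named PdfFileReader before PyPDF2 3.0
readpdf=PyPDF2.PdfReader(open('19MCS053 - PYTHON RECORD.pdf','rb'))
speaker=pyttsx3.init()
text=readpdf.pages[0].extract_text()  # first page, chosen only as an example
speaker.say(text)
speaker.runAndWait()  # plays the queued speech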