You can use the PyMuPDF module with Python to extract images from a PDF
file. You can install PyMuPDF using the pip package manager with the command
pip install PyMuPDF. You can determine if it is already installed with the
command pip list | grep PyMuPDF or pip freeze | grep PyMuPDF.
# pip list | grep PyMuPDF
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
PyMuPDF 1.14.13
# pip freeze | grep PyMuPDF
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
PyMuPDF==1.14.13
#
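If you would rather check from within Python than through pip and grep, something like the following minimal sketch may work. It assumes Python 3.8 or later (for importlib.metadata), which is newer than the Python 2.7 shown in the output above, and the helper name installed_version is only illustrative.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("PyMuPDF")
    if version is None:
        print("PyMuPDF is not installed; try: pip install PyMuPDF")
    else:
        print("PyMuPDF", version)
```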
The code is in extract-PDF-image.py. Provide the PDF file from which images
are to be extracted on the command line, e.g.,
./extract-PDF-image.py somefile.pdf. If any images are found within the file,
they will be extracted as PNG files with names of the form
img0-11_150x109.png, where the last part of the name indicates the dimensions
of the image in pixels, e.g., 150 pixels wide by 109 pixels high. As an
example of a PDF file with multiple images within it, you can use
bpb13187.pdf.
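For readers who want a feel for how such an extraction can work, here is a rough sketch. It is not the contents of extract-PDF-image.py. It assumes a recent PyMuPDF release (1.18 or later, where the methods are named get_page_images and save; the 1.14.x release shown earlier used older camelCase names). It also assumes that the two numbers after "img" in the file name are the zero-based page index and the image's xref number, which the description above does not confirm. The function names image_filename and extract_images are hypothetical.

```python
def image_filename(page_index, xref, width, height):
    """Build a name such as img0-11_150x109.png.

    Assumption: the first number is the page index and the second is the
    image's xref; only the trailing WIDTHxHEIGHT part is documented.
    """
    return f"img{page_index}-{xref}_{width}x{height}.png"


def extract_images(pdf_path):
    """Save every image found in the PDF as a PNG; return the file names."""
    # Imported here so image_filename() stays usable without PyMuPDF.
    import fitz  # PyMuPDF

    saved = []
    doc = fitz.open(pdf_path)
    try:
        for page_index in range(len(doc)):
            # Each entry describes one image on the page; item 0 is its xref.
            for img in doc.get_page_images(page_index):
                xref = img[0]
                pix = fitz.Pixmap(doc, xref)
                # PNG cannot store CMYK, so convert such images to RGB first.
                if pix.n - pix.alpha >= 4:
                    pix = fitz.Pixmap(fitz.csRGB, pix)
                name = image_filename(page_index, xref, pix.width, pix.height)
                pix.save(name)
                saved.append(name)
    finally:
        doc.close()
    return saved
```

Calling extract_images("somefile.pdf") would write the PNG files to the current directory and return their names. An image that appears on several pages would be saved once per page under this naming scheme.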
Related articles: