Extract Images from PDF via Python

Python library for extracting images from PDF documents through its own API.

Extract Images from PDF Document Using Python Library

To extract images from a PDF, we'll use Aspose.PDF for Python via .NET, a feature-rich, powerful, and easy-to-use document manipulation API for the python-net platform. Install the package from PyPI by running the following pip command.

Install via pip

pip install aspose-pdf
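
To confirm that the install succeeded, the installed distribution can be looked up from Python. This is a minimal sketch using only the standard library `importlib.metadata` module (available from Python 3.8); the helper name `installed_version` is hypothetical and not part of Aspose.PDF.

```python
from importlib import metadata


def installed_version(distribution="aspose-pdf"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("aspose-pdf is not installed; run: pip install aspose-pdf")
else:
    print("aspose-pdf version:", version)
```

The lookup reads package metadata only, so it does not import the library itself.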

Extract Image from PDF via Python


You need the Aspose.PDF for Python via .NET library to try the code in your environment.

  1. Open PDF document.
  2. Extract a particular image.
  3. Save output image.
  4. Save updated PDF file.

System Requirements


Aspose.PDF for Python is supported on all major operating systems. Just make sure that you have the following prerequisites.

  • Aspose.PDF for Python via .NET runs on any 64-bit or 32-bit operating system with a Python version above 3.5 and below 3.12 installed.
  • If you develop software for Linux, see the additional requirements in the product documentation.
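
The version requirement above can be checked before installing. The sketch below assumes that ">3.5 and <3.12" means Python 3.6 through 3.11 inclusive; that reading, and the helper name `is_supported_python`, are assumptions rather than something the product documentation states.

```python
import sys


def is_supported_python(version=None):
    """Return True if the (major, minor) version lies strictly between 3.5 and 3.12.

    Assumption: the requirement is compared on major.minor only,
    so 3.6 through 3.11 are accepted.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 5) < major_minor < (3, 12)


print("Supported interpreter:", is_supported_python())
```

Passing an explicit tuple such as `(3, 11, 4)` checks a version other than the running interpreter.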

Extract Images from PDF File - Python

This sample code shows how to extract an image from a PDF document in Python.

    import io

    import aspose.pdf as ap

    # Placeholder directories; point these at your own input and output folders
    DIR_INPUT = "./"
    DIR_OUTPUT = "./"

    input_file = DIR_INPUT + "sample_with_image.pdf"
    output_image = DIR_OUTPUT + "extract_image.jpg"

    # Open document
    document = ap.Document(input_file)

    # Extract a particular image (page and image indexes are 1-based)
    x_image = document.pages[2].resources.images[1]

    # Save output image; the stream is closed when the block exits
    with io.FileIO(output_image, "w") as image_stream:
        x_image.save(image_stream)

    # Save updated PDF file
    document.save(DIR_OUTPUT + "output.pdf")
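
After the sample runs, it can be useful to confirm that the saved file really holds the format its extension claims. The following is a minimal sketch that inspects the leading signature bytes of the file; it uses only the standard library, covers just JPEG and PNG, and the helper names `detect_image_format` and `detect_file_format` are hypothetical, not part of Aspose.PDF.

```python
from pathlib import Path

# Leading "magic" bytes of two common image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def detect_image_format(data):
    """Return 'jpeg', 'png', or None based on the first bytes of the data."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def detect_file_format(path):
    """Read the first 8 bytes of a file and detect its image format."""
    with Path(path).open("rb") as stream:
        return detect_image_format(stream.read(8))
```

For example, calling `detect_file_format` on the `extract_image.jpg` path from the sample returns `"jpeg"` when the file starts with a JPEG signature, and `None` for anything this sketch does not recognize.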

About Aspose.PDF for Python API

A PDF processing library for creating cross-platform applications that generate, modify, convert, render, secure, and print documents without using Adobe Acrobat. It supports converting various file formats, including HTML, into PDF and converting PDF documents into various output formats. Developers can easily render all HTML content in a single-page PDF as well as convert HTML files with SVG graphic tags to Tagged PDF files. The API offers compression, table creation, graph and image functions, hyperlinks, stamp and watermarking tasks, extended security controls, and custom font handling.