Extract images from PDF in Java

Parse images from PDF documents. Use Aspose.PDF for Java to process PDF files programmatically.

How to extract images from PDF using a Java library

Do you need to extract images from PDF files? Programmatic processing of PDF documents is a common part of modern digital workflows. With Java libraries like Aspose.PDF, developers can extract images from PDF documents in code. These libraries are stand-alone solutions that don’t rely on other software and are ready for commercial use. Typical extraction tasks include:

  • Extract text from PDF
  • Extract images from PDF
  • Extract fonts from PDF
  • Extract data from forms
  • Extract text from stamps
  • Extract data from tables

To extract images from a PDF file, we’ll use the Aspose.PDF for Java API, a feature-rich document manipulation API for the Java platform. You can get its latest version through Maven by adding the following repository and dependency configuration to your project’s pom.xml.

Repository

<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://releases.aspose.com/java/repo/</url>
</repository>

Dependency

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>version of aspose-pdf API</version>
</dependency>
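
Once the repository and dependency entries are in place, it can help to confirm that the library actually resolves before running the sample. The following is a minimal sketch using only the standard library; the class name AsposeClasspathCheck and the helper isClassAvailable are hypothetical, and the only Aspose-specific assumption is the class name com.aspose.pdf.Document, taken from the sample code in this article.

```java
final class AsposeClasspathCheck {
    // Fully qualified name of the Document class used in this article's sample code.
    static final String DOCUMENT_CLASS = "com.aspose.pdf.Document";

    /** Returns true if the named class can be located on the classpath (without initializing it). */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, AsposeClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not loadable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isClassAvailable(DOCUMENT_CLASS)
                ? "Aspose.PDF for Java found on the classpath"
                : "Aspose.PDF for Java not found; check the repository and dependency entries in pom.xml");
    }
}
```

Run the class from within the Maven project so that it sees the same classpath as the sample code.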

Extract images from PDF in Java


You need Aspose.PDF for Java to try the code in your environment.

  1. Load the PDF with an instance of Document.
  2. Get the XImage object for the image you want from the page’s resources.
  3. Save the output image to a JPEG file.
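
The steps above write one image to one file. If they are repeated for several pages or images, each output needs its own file name. The helper below is a hypothetical sketch, not part of Aspose.PDF: the class name, the method name, and the page-N-image-M.jpg pattern are all assumptions chosen for illustration. It uses 1-based numbers to match the get_Item(1) calls in the sample code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ImageOutputNames {
    /**
     * Builds an output path such as "out/page-1-image-2.jpg".
     * Page and image numbers are 1-based, matching the sample's get_Item(1) calls.
     */
    static Path outputPath(String outputDir, int pageNumber, int imageNumber) {
        if (pageNumber < 1 || imageNumber < 1) {
            throw new IllegalArgumentException("page and image numbers start at 1");
        }
        // Plain concatenation avoids locale-dependent digit formatting.
        String fileName = "page-" + pageNumber + "-image-" + imageNumber + ".jpg";
        return Paths.get(outputDir, fileName);
    }
}
```

The resulting path can be converted with toFile() and passed to a FileOutputStream in place of a fixed file name.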

Extract images from PDF - Java

This sample code shows how to extract an image from a PDF document.

    public static void Extract_Images() {
        // The path to the documents directory.
        String dataDir = "/home/admin1/pdf-examples/Samples/";
        String filePath = dataDir + "ExtractImages.pdf";

        // Load the PDF document
        com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(filePath);

        // Get the image collection from the resources of the first page (indexes start at 1)
        com.aspose.pdf.Page page = pdfDocument.getPages().get_Item(1);
        com.aspose.pdf.XImageCollection xImageCollection = page.getResources().getImages();

        // Extract a particular image, here the first one on the page
        com.aspose.pdf.XImage xImage = xImageCollection.get_Item(1);

        // try-with-resources closes the stream even if save() throws
        try (java.io.FileOutputStream outputImage = new java.io.FileOutputStream(dataDir + "output.jpg")) {
            // Save the output image
            xImage.save(outputImage);
        } catch (java.io.IOException e) {
            // Also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
    }
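
The sample writes to a file named output.jpg, but the extension alone does not show what was written. One simple sanity check is to test whether the file begins with the JPEG start-of-image marker followed by the first byte of the next marker (bytes FF D8 FF). The sketch below uses only the standard library; the class and method names are hypothetical, and it inspects only the signature rather than validating the whole image.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class JpegSignature {
    /** Returns true if the given bytes start with FF D8 FF. */
    static boolean looksLikeJpeg(byte[] header) {
        return header != null
                && header.length >= 3
                && (header[0] & 0xFF) == 0xFF
                && (header[1] & 0xFF) == 0xD8
                && (header[2] & 0xFF) == 0xFF;
    }

    /** Reads the first three bytes of the file and checks them against the JPEG signature. */
    static boolean fileLooksLikeJpeg(Path file) throws IOException {
        byte[] header = new byte[3];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or end of file.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        return total == header.length && looksLikeJpeg(header);
    }
}
```

For example, calling JpegSignature.fileLooksLikeJpeg with the path of the saved output.jpg after the sample has run indicates whether the file starts like a JPEG.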

About Aspose.PDF for Java API

Aspose.PDF for Java API is a library that enables developers to add PDF processing capabilities to their applications. It can be used to build 32-bit and 64-bit applications that generate, read, convert, and manipulate PDF files without the use of Adobe Acrobat. Aspose.PDF for Java allows developers to insert tables, graphs, images, hyperlinks, custom fonts, and more into PDF documents. It can also compress PDF files, and it provides security features for creating secure PDF files.

You can find more information about the Aspose.PDF for Java API in its documentation, along with examples of how to use the API. Key features of the Aspose.PDF for Java API include support for various file formats (HTML, XFA, TXT, PCL, XML, XPS, and image file formats), support for various PDF versions, and extensive hyperlink functionality.