Parse PPTX Formats in Java
Native, high-performance PPTX document parsing using server-side Aspose.Slides for Java APIs, without requiring any Microsoft or Adobe software.
Parse PPTX File Using Java
To parse a PPTX file, we'll use the Aspose.Slides for Java API, a feature-rich, powerful, and easy-to-use parsing API for the Java platform. You can download its latest version directly from Maven and install it within your Maven-based project by adding the following configurations to the pom.xml.
Repository
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://releases.aspose.com/java/repo/</url>
</repository>
Dependency
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-slides</artifactId>
    <version>version of aspose-slides API</version>
    <classifier>jdk17</classifier>
</dependency>
How to Parse PPTX Files in Java
Basic document parsing with Aspose.Slides for Java APIs can be done with just a few lines of code.
Load the PPTX file by instantiating the Presentation class.
Get the text frames from the first slide.
Loop through the portions of each paragraph.
Get the required output, such as text, font height, and font name.
System Requirements
Aspose.Slides for Java is supported on all major platforms and operating systems. Please make sure that you have the following prerequisites.
- Microsoft Windows or a compatible OS with Java Runtime Environment for JSP/JSF applications and desktop applications.
- Get the latest version of Aspose.Slides for Java directly from Maven.
Parse PPTX Files - Java
import com.aspose.slides.IParagraph;
import com.aspose.slides.IPortion;
import com.aspose.slides.ITextFrame;
import com.aspose.slides.Presentation;
import com.aspose.slides.SlideUtil;

// Load the PPTX file
Presentation pptxPresentation = new Presentation("demo.pptx");
try {
    // Get an array of ITextFrame objects from the first slide
    ITextFrame[] textFramesSlideOne = SlideUtil.getAllTextBoxes(pptxPresentation.getSlides().get_Item(0));
    // Loop through the array of text frames
    for (int i = 0; i < textFramesSlideOne.length; i++) {
        // Loop through the paragraphs in the current text frame (index i, not 0)
        for (IParagraph para : textFramesSlideOne[i].getParagraphs()) {
            // Loop through the portions in the current paragraph
            for (IPortion port : para.getPortions()) {
                // Display the text in the current portion
                System.out.println(port.getText());
                // Display the font height of the text
                System.out.println(port.getPortionFormat().getFontHeight());
                // Display the font name of the text; the Latin font may be unset on the
                // portion itself (inherited formatting), so guard against null
                if (port.getPortionFormat().getLatinFont() != null) {
                    System.out.println(port.getPortionFormat().getLatinFont().getFontName());
                }
            }
        }
    }
} finally {
    // Release the resources held by the presentation
    pptxPresentation.dispose();
}
// Similarly, to extract text from the whole presentation,
// call SlideUtil.getAllTextFrames(pptxPresentation, true) and iterate through the returned array
About Aspose.Slides for Java API
Aspose.Slides API can be used to read, write, manipulate, and convert Microsoft PowerPoint documents to PDF, XPS, HTML, TIFF, ODP, and various other formats. You can create new files from scratch and save them in any of the supported formats. Aspose.Slides is a standalone API for creating, parsing, or manipulating presentations, slides, and their elements, and it does not depend on any software such as Microsoft PowerPoint or OpenOffice.
What is PPTX File Format?
Files with the PPTX extension are presentation files created with the popular Microsoft PowerPoint application. Unlike the earlier PPT presentation file format, which was binary, the PPTX format is based on the Microsoft PowerPoint Open XML presentation file format. A presentation file is a collection of slides, where each slide can comprise text, images, formatting, animations, and other media. These slides are presented to an audience in the form of slideshows with custom presentation settings.
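To make the "based on Open XML" point concrete, here is a minimal sketch that uses only the Java standard library rather than Aspose.Slides. It relies on the fact that a PPTX file is a ZIP package whose slides are stored as XML parts (typically named ppt/slides/slideN.xml) and whose text runs sit in DrawingML <a:t> elements. The class name PptxPackageSketch and the method readSlideText are hypothetical names chosen for this illustration, and the sketch ignores formatting, notes, masters, and everything else a full parsing API handles.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration only; not part of Aspose.Slides.
public class PptxPackageSketch {

    // DrawingML namespace: text runs on a slide are stored in <a:t> elements.
    private static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    /** Maps each slide part name in the package to the text runs found in it. */
    public static Map<String, List<String>> readSlideText(InputStream pptx)
            throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match elements by namespace
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        Map<String, List<String>> textBySlide = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(pptx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Skip everything that is not a slide part (media, layouts, and so on)
                if (!name.matches("ppt/slides/slide\\d+\\.xml")) {
                    continue;
                }
                // Copy the current entry so the XML parser cannot close the ZIP stream
                byte[] xml = zip.readAllBytes();
                NodeList runs = builder.parse(new ByteArrayInputStream(xml))
                        .getElementsByTagNameNS(DRAWINGML_NS, "t");
                List<String> texts = new ArrayList<>();
                for (int i = 0; i < runs.getLength(); i++) {
                    texts.add(runs.item(i).getTextContent());
                }
                textBySlide.put(name, texts);
            }
        }
        return textBySlide;
    }
}
```

Calling PptxPackageSketch.readSlideText with an input stream opened on a file such as demo.pptx returns one map entry per slide part. Entries appear in the order they are stored in the ZIP package, which need not match the slide order shown in PowerPoint.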