Aspose.Slides for Python via .NET

Extract Text and Images from ODP Presentation using Python

Build your own Python apps for extracting text, images, video, and audio from presentation files using server-side APIs.

Extract Text from ODP Presentation via Python

To scan the text from the whole presentation, use the static get_all_text_frames method exposed by the SlideUtil class (GetAllTextFrames in the .NET API). The code below reads the text and formatting information from a presentation, including the master slides.

Extracting Text from ODP Presentation using Python


import aspose.slides as slides

# Instantiate the Presentation class, which represents an ODP file
with slides.Presentation("pres.odp") as presentation:
    # Get the text frames from all slides in the ODP; True also includes master slides
    text_frames = slides.util.SlideUtil.get_all_text_frames(presentation, True)

    # Loop through the text frames
    for i in range(len(text_frames)):
        # Loop through the paragraphs in the current text frame
        for para in text_frames[i].paragraphs:
            # Loop through the portions in the current paragraph
            for port in para.portions:
                # Display the text in the current portion
                print(port.text)

                # Display the font height of the text
                print(port.portion_format.font_height)

                # Display the font name of the text, if a Latin font is set
                if port.portion_format.latin_font is not None:
                    print(port.portion_format.latin_font.font_name)

How to Extract Text from ODP via Python

These are the steps to parse ODP files.

  1. Load ODP with an instance of Presentation

  2. Get an Array of TextFrame objects from all slides in the ODP

  3. Loop through the Array of TextFrames

  4. Loop through paragraphs in current TextFrame

  5. Loop through portions in the current Paragraph

  6. Get text in the current portion
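
The steps above can also be sketched as a small helper that collects the results into a list instead of printing them. This is only an illustration: collect_portions and _portion are hypothetical names, not part of Aspose.Slides, and the helper assumes the text frame collection can be iterated directly. The demo data consists of stand-in objects that mimic the attributes used in the example above (paragraphs, portions, text, portion_format, font_height, latin_font, font_name), so the sketch runs without the library or an ODP file.

```python
from types import SimpleNamespace


def collect_portions(text_frames):
    """Flatten text frames into one record per portion (hypothetical helper)."""
    records = []
    for frame in text_frames:
        for para in frame.paragraphs:
            for port in para.portions:
                fmt = port.portion_format
                font = fmt.latin_font  # may be None when no Latin font is set
                records.append({
                    "text": port.text,
                    "font_height": fmt.font_height,
                    "font_name": font.font_name if font is not None else None,
                })
    return records


def _portion(text, height, font_name=None):
    """Build a stand-in portion so the helper can be tried without a file."""
    font = SimpleNamespace(font_name=font_name) if font_name else None
    fmt = SimpleNamespace(font_height=height, latin_font=font)
    return SimpleNamespace(text=text, portion_format=fmt)


# Stand-in data shaped like the objects used above (not real Aspose objects).
demo_frames = [
    SimpleNamespace(paragraphs=[
        SimpleNamespace(portions=[_portion("Title", 44.0, "Arial"),
                                  _portion(" text", 18.0)]),
    ]),
]
records = collect_portions(demo_frames)
print(records)
```

With the library installed, the same helper would presumably be called on the result of slides.util.SlideUtil.get_all_text_frames(presentation, True) inside the with block, while the presentation is still open.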

Other Supported Parse Formats

Using Python, you can also scan the following formats: