Extract text from PDF in Rust

How to extract text from PDF using Aspose.PDF for Rust via C++

Do you need to parse a PDF? Aspose.PDF for Rust via C++ extracts text from PDF documents. It is an easy and secure toolkit for working with PDFs. To try the example on this page, install Aspose.PDF for Rust via C++ first.

Steps to extract text from PDF in Rust


You need Aspose.PDF for Rust via C++ to try the code in your environment.

  1. Open the PDF with Document::open.
  2. Call extract_text on the document to get its contents as plain text.
  3. Print the extracted text, or create a writer, open a file, and write the text to it.

Extract text from PDF with Rust

This sample code shows how to extract text from a PDF document and print it.

use asposepdf::Document;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open a PDF-document with filename
    let pdf = Document::open("sample.pdf")?;

    // Return the PDF-document contents as plain text
    let txt = pdf.extract_text()?;

    // Print extracted text
    println!("Extracted text:\n{}", txt);

    Ok(())
}
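
The steps mention writing the extracted text to a file, which the sample above does not show. The following is a minimal sketch of that step using only the Rust standard library. The helper save_text and the file name extracted.txt are hypothetical names introduced for illustration and are not part of Aspose.PDF; the placeholder string stands in for the value returned by pdf.extract_text()? in the sample above.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

// Hypothetical helper (not part of Aspose.PDF): writes `text` to `path`,
// one output line per input line, through a buffered writer.
fn save_text(path: &Path, text: &str) -> io::Result<()> {
    // Create the file, truncating it if it already exists
    let file = File::create(path)?;
    let mut writer = BufWriter::new(file);

    for line in text.lines() {
        writeln!(writer, "{}", line)?;
    }

    // Flush explicitly so that write errors are reported instead of being
    // silently dropped when the writer goes out of scope
    writer.flush()
}

fn main() -> io::Result<()> {
    // Placeholder for the String returned by pdf.extract_text()?
    let txt = "First line\nSecond line";

    // "extracted.txt" is an assumed output file name
    save_text(Path::new("extracted.txt"), txt)?;

    Ok(())
}
```

For short texts, std::fs::write(path, txt) would do the same job in a single call; the buffered writer is used here only because it mirrors the "create a writer, open the file, write a line" wording of the steps.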

About Aspose.PDF for Rust via C++ API

Aspose.PDF for Rust via C++ is a library for working with PDF documents from Rust. This page covers only text extraction from PDF. For a complete list of supported formats, see the section Supported File Formats.