Aspose.PDF  for

PDF Document Extraction Solution

Extract images & text from PDF documents with free cross-platform Apps and APIs

How to Parse a PDF File Using the Aspose Library

Why parse PDF documents? Parsing a PDF document means extracting various kinds of information from the file, most commonly its text and images, for example to separate a PDF into text and images. To parse a PDF file, we’ll use the Aspose.PDF API, a feature-rich, powerful, and easy-to-use document manipulation API. Open the NuGet package manager, search for Aspose.PDF, and install it; you can also install the package from the Package Manager Console. The Aspose.PDF library lets you extract text from PDF pages and stamps, extract images and fonts, and extract data from tables and forms.

High Code APIs to Parse Document

Native APIs to parse PDF files using .NET, .NET Core, Xamarin, Java, C++ & Android


Parse PDF Files

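using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// dataDir is assumed to hold the path of the working folder, ending with a path separator

// Open document
Document pdfDocument = new Document(dataDir + "ExtractTextAll.pdf");

// Create TextAbsorber object to extract text
TextAbsorber textAbsorber = new TextAbsorber();
// Accept the absorber for all the pages
pdfDocument.Pages.Accept(textAbsorber);
// Get the extracted text
string extractedText = textAbsorber.Text;
// Create a writer and open the file
TextWriter tw = new StreamWriter(dataDir + "extracted-text.txt");
// Write a line of text to the file
tw.WriteLine(extractedText);
// Close the stream
tw.Close();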
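
Images can be pulled out in a similar way. The following is a minimal sketch rather than an official sample: it assumes Aspose.PDF for .NET is installed, and the input file name "input.pdf" and the output naming pattern "image-N.jpg" are hypothetical placeholders. It walks every page, reads the images stored in the page resources, and saves each one to its own file.

```csharp
using System.IO;
using Aspose.Pdf;

class ExtractImagesSketch
{
    static void Main()
    {
        // Hypothetical input path; replace with your own file
        Document pdfDocument = new Document("input.pdf");

        int imageIndex = 1;
        // Visit every page in the document
        foreach (Page page in pdfDocument.Pages)
        {
            // Each page keeps its embedded images in its resources
            foreach (XImage image in page.Resources.Images)
            {
                // Hypothetical output naming pattern: image-1.jpg, image-2.jpg, ...
                string outputPath = "image-" + imageIndex + ".jpg";
                using (FileStream output = new FileStream(outputPath, FileMode.Create))
                {
                    // Save(Stream) is expected to write the image data as JPEG
                    image.Save(output);
                }
                imageIndex++;
            }
        }
    }
}
```

Writing each image through a using block ensures the output file is closed even if saving fails partway through the document.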

About Aspose.PDF API

The Aspose.PDF API can be used for PDF document manipulation and parsing within applications. You can create, modify, compress, secure, and print PDF files, or save them to TXT, HTML, PCL, XFA, XML, XPS, EPUB, TEX, image, and other formats. Aspose.PDF is a standalone API that does not depend on any other software, including Adobe Acrobat.

Online PDF Parser Live Demos

Extract text and images from PDF documents right now by visiting our Live Demos website. The live demos have the following benefits:

  No need to download or setup anything
  No need to write any code
  Just upload your PDF file & edit document properties
  It will be parsed instantly