PDF Document Extraction Solution
Extract images & text from PDF documents with free cross-platform Apps and APIs
How to Parse PDF File Using Aspose Library
Why parse PDF documents? Parsing a PDF document means extracting various kinds of information from a PDF file, most commonly its text and images, so that they can be processed separately. To parse a PDF file, we'll use the Aspose.PDF API, a feature-rich, powerful, and easy-to-use document manipulation API. Open the NuGet package manager, search for Aspose.PDF, and install it; you may also install it from the Package Manager Console. The Aspose.PDF library lets you extract text from PDF pages and from stamps, extract images and fonts, and extract data from tables and forms.
High Code APIs to Parse Document
Native APIs to parse PDF files using .NET, .NET Core, Xamarin, Java, C++ & Android
Parse PDF Files
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// dataDir is assumed to hold the path of the folder that contains the input PDF
// Open document
Document pdfDocument = new Document(dataDir + "ExtractTextAll.pdf");
// Create TextAbsorber object to extract text
TextAbsorber textAbsorber = new TextAbsorber();
// Accept the absorber for all the pages
pdfDocument.Pages.Accept(textAbsorber);
// Get the extracted text
string extractedText = textAbsorber.Text;
// Create a writer and open the file
TextWriter tw = new StreamWriter(dataDir + "extracted-text.txt");
// Write a line of text to the file
tw.WriteLine(extractedText);
// Close the stream
tw.Close();
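The sample above covers text only. Image extraction, also mentioned on this page, might look like the following minimal sketch. It assumes the Aspose.PDF for .NET package is installed and that `Page.Resources.Images` and `XImage.Save(Stream)` behave as they do in recent releases of the library; the `dataDir` value, the input file name, and the output naming scheme are hypothetical and chosen only for illustration.

```csharp
using System.IO;
using Aspose.Pdf;

// Hypothetical working folder; adjust to your environment
string dataDir = "./";

// Open document (the input file name is illustrative)
Document pdfDocument = new Document(dataDir + "ExtractImages.pdf");

// Running counter used only to build unique output file names
int imageIndex = 1;

// Walk every page and every image resource referenced by that page
foreach (Page page in pdfDocument.Pages)
{
    foreach (XImage image in page.Resources.Images)
    {
        // Assumed naming scheme: image-<page number>-<counter>.jpg
        string path = dataDir + "image-" + page.Number + "-" + imageIndex + ".jpg";

        // Write the image data to disk; the stream is closed by the using block
        using (FileStream output = new FileStream(path, FileMode.Create))
        {
            image.Save(output);
        }
        imageIndex++;
    }
}
```

Wrapping the output stream in a using block closes each file even if saving fails partway through. Check the calls against the API reference for your installed version before relying on them.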
About Aspose.PDF API
Aspose.PDF API can be used for PDF document manipulation and parsing within applications. You can create, modify, compress, secure, and print PDF documents, or save them to TXT, HTML, PCL, XFA, XML, XPS, EPUB, TEX, image, and other formats. Aspose.PDF is a standalone API and does not depend on any other software, including Adobe Acrobat.
Online PDF Parser Live Demos
Extract text and images from PDF documents right now by visiting our Live Demos website. The live demo has the following benefits