processes scanned images or even smartphone photos in PDF format and creates PDF documents containing recognized text. To add it to your project, get Aspose.OCR from the
Aspose Maven Repository, or specify the Aspose Maven Repository configuration and install it within your Maven-based project by adding the following configuration to the pom.xml. For Gradle, Ivy, and Sbt examples, check out our repository.
Maven Dependency
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-ocr</artifactId>
<version>22.5</version>
</dependency>
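The dependency above resolves only if Maven knows where to find the Aspose repository. A minimal sketch of the matching pom.xml entry might look like the following. The id, name, and URL are assumptions that do not appear in the text, so verify them against the current Aspose documentation before use.

```xml
<!-- Assumed repository entry: id, name, and URL are not given in the text above -->
<repositories>
    <repository>
        <id>AsposeJavaAPI</id>
        <name>Aspose Java API</name>
        <url>https://releases.aspose.com/java/repo/</url>
    </repository>
</repositories>
```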
With Java OCR and just a few lines of code, you can create a full-featured application that converts a PDF image to an XLS document:
- Create an instance of the AsposeOCR class
- Call the AsposeOCR.RecognizePage method
- Pass the PDF file path as a parameter
- AsposeOCR.RecognizePage returns a String or a file of XLS type
System Requirements
Before running the example, make sure the following requirement is met:
- JDK 1.6 (Java 2 Platform, Standard Edition 6.0) or higher is installed on your system.
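import com.aspose.ocr.AsposeOCR;
import java.awt.Rectangle;
import java.util.ArrayList;

// Create API instance
AsposeOCR api = new AsposeOCR();
// Prepare rectangles (x, y, width, height) that bound the text regions
ArrayList<Rectangle> rectArray = new ArrayList<Rectangle>();
rectArray.add(new Rectangle(138, 352, 2033, 537));
rectArray.add(new Rectangle(147, 890, 2033, 1157));
// Recognize the text inside the given regions of the image
String result = api.RecognizePage("srcImage.png", rectArray);
System.out.println("Result with rect: " + result);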
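The regions in the example use java.awt.Rectangle, whose constructor takes x, y, width, and height rather than two corner points. The sketch below shows one way to build such regions from corner coordinates and clip them to the image size before passing them to recognition. The RegionHelper class, its method names, and the 2000 x 1500 image size are hypothetical and are not part of Aspose.OCR.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for preparing recognition regions; not part of Aspose.OCR.
class RegionHelper {

    // Builds a Rectangle from its top-left and bottom-right corners.
    static Rectangle fromCorners(int left, int top, int right, int bottom) {
        return new Rectangle(left, top, right - left, bottom - top);
    }

    // Clips every region to the image bounds and drops regions that fall outside the image.
    static ArrayList<Rectangle> clipToImage(List<Rectangle> regions, int imageWidth, int imageHeight) {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        ArrayList<Rectangle> clipped = new ArrayList<Rectangle>();
        for (Rectangle region : regions) {
            Rectangle visible = region.intersection(bounds);
            if (!visible.isEmpty()) {
                clipped.add(visible);
            }
        }
        return clipped;
    }

    public static void main(String[] args) {
        ArrayList<Rectangle> regions = new ArrayList<Rectangle>();
        // Same region as new Rectangle(138, 352, 2033, 537), given as corners.
        regions.add(fromCorners(138, 352, 2171, 889));
        // Assumed image size of 2000 x 1500 pixels, for illustration only.
        for (Rectangle r : clipToImage(regions, 2000, 1500)) {
            System.out.println(r);
        }
    }
}
```

The list returned by clipToImage is an ArrayList of Rectangle objects, so it could be used wherever rectArray is used in the example above.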
What is PDF File Format
Portable Document Format (PDF) is a type of document created by Adobe back in the 1990s. The purpose of this file format was to introduce a standard for representing documents and other reference material in a format that is independent of application software, hardware, and operating system. A PDF file can contain text, images, hyperlinks, form fields, rich media, digital signatures, attachments, metadata, geospatial features, and 3D objects as part of the source document.
What is XLS File Format
Files with the XLS extension use the Excel Binary File Format. Such files can be created by Microsoft Excel as well as other similar spreadsheet programs such as OpenOffice Calc or Apple Numbers. A file saved by Excel is known as a workbook, and each workbook can have one or more worksheets. Data is stored and displayed to users in table format in a worksheet and can include numeric values, text data, formulas, external data connections, images, and charts. Applications like Microsoft Excel let you export workbook data to several different formats, including PDF, CSV, XLSX, TXT, HTML, and XPS. The XLS file format was replaced with a more open and structured format, XLSX, with the release of Microsoft Excel 2007. The latest versions still support creating and reading XLS files, though XLSX is now the preferred format.