Why Choose Aspose OCR library for Java?

Unlock powerful OCR with the Aspose OCR library for Java, an efficient, user-friendly, and cost-effective API. In just five lines of Java code, you can add OCR functionality to your applications without needing to understand neural networks or other technical details.

Our OCR engine provides unmatched speed and accuracy, supporting over 140 languages, including English, Cyrillic, Arabic, Persian, Chinese, Japanese, Korean, Hindi, Tamil and many more. Whether you work with scans, smartphone photos, screenshots, or PDFs, our OCR extracts text and generates results in all popular formats.
Image preprocessing automatically corrects rotated, blurry, inverted, and noisy images to ensure the highest recognition accuracy under any conditions.

Swift and precise OCR

Achieve high-speed and accurate OCR results with our advanced Java technology.

Multilingual support

Recognize text in 140+ languages, including English, French, Cyrillic, Arabic, Persian, Indic, Chinese, Japanese, Korean, Tamil and other scripts.

All images

Process images from various sources, such as scanners, cameras, and smartphones.

Mixed language detection

Recognize documents written in mixed languages, such as Chinese/English, Arabic/French, Hindi/English, and Cyrillic/English.

Any font, style and format

Accurately preserve text layout, detect table structure, and recognize text regardless of font or style.

Live code sample

Experience the simplicity: convert an image to text in a few lines of Java code

Convert image to text

import com.aspose.ocr.*;
import java.util.ArrayList;

// Create instance of OCR API
AsposeOCR api = new AsposeOCR();
// Add images to the recognition batch
OcrInput images = new OcrInput(InputType.SingleImage);
images.add("image1.png");
images.add("image2.png");
// Recognition language
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setLanguage(Language.Eng);
// Recognize images
ArrayList<RecognitionResult> results = api.Recognize(images, recognitionSettings);
// Print the text recognized in each image
results.forEach((result) -> {
  System.out.println(result.recognitionText);
});

Cross-platform

Aspose Java OCR code seamlessly operates on any platform supporting Java SE 6.0 or above – be it a local machine, web server, or the cloud.

Microsoft Windows
Linux
macOS
GitHub
Microsoft Azure
Amazon Web Services
Docker

Supported file formats

Aspose.OCR for Java can work with any file you can get from a scanner or camera. Recognition results can be saved, imported to a database, or analyzed in real time.

Images

  • PDF
  • JPEG
  • PNG
  • TIFF
  • GIF
  • Bitmap

Batch OCR

  • Multi-page PDF
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML

Easy installation

Aspose.OCR for Java is distributed as a lightweight, downloadable Java Archive (JAR) file with minimal dependencies. Simply add it to your project, and you’re all set to recognize text in any supported language and save recognition results in various formats.

Request a trial license to kickstart the development of a fully functional OCR application without limitations.

Works everywhere

Our Java library fully supports Java SE 6 or above, enabling your applications to run seamlessly on any platform – desktop Windows, Windows Server, macOS, Linux, and the cloud.

140+ Recognition Languages

Our Java OCR library is a universal solution for document processing, data extraction, and content digitization on a global scale. With support for a vast array of European, Middle Eastern, and Asian scripts, it is well suited to any country and business.

Aspose OCR for Java recognizes text in multilingual documents, such as Chinese/English, Arabic/French, or Cyrillic/English. The following languages are supported:

  • Extended Latin: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more;
  • Cyrillic alphabet: Russian, Ukrainian, Kazakh, Bulgarian, including mixed Cyrillic/English texts;
  • Arabic, Persian, Urdu, including texts mixed with English;
  • Chinese, Korean, Japanese, Devanagari, and Dravidian languages, including Hindi, Tamil, Marathi, and others.

Features and capabilities

Explore the advanced features and capabilities of Aspose.OCR for Java.

Photo OCR

Extract text from smartphone photos with scan-level accuracy.

Searchable PDF

Convert any scan into a searchable and editable document.

URL recognition

Recognize an image from a URL without downloading it locally.

Bulk recognition

Read all images from multi-page documents, folders and archives.

Any font and style

Identify and recognize text in all popular typefaces and styles.

Fine-tune recognition

Adjust every OCR parameter for best recognition results.

Spell checker

Improve results by automatically correcting misspelled words.

Find text in images

Search for text or a regular expression within a set of images.

Compare image texts

Compare texts on two images, regardless of the case and layout.

Worldwide

Extract text in any language with automatic language detection.

Key detail extraction

Automatically extract important details from ID cards.

Full integration with Aspose solutions

Integrate OCR seamlessly with other Aspose products for a comprehensive and efficient Java solution.

Code samples

Explore the code samples to learn how to integrate the OCR API into your Java applications.

Installation

Aspose OCR for Java is distributed as a Java Archive (JAR) file with minimal dependencies, or through a Maven repository. You can add it to your project directly from your preferred Java Integrated Development Environment (IDE). Once it is installed, the complete range of OCR capabilities is available, including saving recognition results in any of the supported formats.

After installation, you can start using Aspose.OCR for Java right away, although with certain limitations. A temporary license lifts all trial version restrictions for 30 days. Use this period to start developing a fully functional OCR application, so that you can make an informed decision about purchasing Aspose.OCR for Java later.

Recognize text on scanned images in Java

Scanners are not always at hand, which can limit where OCR applications are useful. Our API includes powerful built-in image pre-processing filters that handle rotated, skewed, and noisy images. Combined with support for all image formats, this ensures reliable recognition even from smartphone photos. Most pre-processing and image correction is automated and requires your intervention only in challenging cases.

Apply Automatic Image Corrections - Java

// Create instance of OCR API
AsposeOCR api = new AsposeOCR();

// Path to the source image (placeholder file name)
String imagePath = "image.png";

// Define pre-processing filters
PreprocessingFilter filters = new PreprocessingFilter();
filters.add(PreprocessingFilter.ToGrayscale());
filters.add(PreprocessingFilter.Rotate(-90));

// Pre-process image before recognition
BufferedImage imageRes = api.PreprocessImage(imagePath, filters);

// Recognize the pre-processed image with default settings
RecognitionSettings set = new RecognitionSettings();
RecognitionResult result = api.RecognizePage(imageRes, set);
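
To make the two filters above more concrete, here is a minimal plain-JDK sketch of what a grayscale conversion and a quarter-turn rotation do to the pixels. It is only an illustration and not the Aspose.OCR implementation: the class `PreprocessingSketch` and its methods are hypothetical names, the Rec. 601 luma weights are an assumption, and the helper rotates clockwise without claiming to match the sign convention of `Rotate(-90)`.

```java
import java.awt.image.BufferedImage;

// Illustrative only: plain-JDK versions of two common OCR pre-processing steps.
// Hypothetical helper class, not part of Aspose.OCR.
final class PreprocessingSketch {
    private PreprocessingSketch() {}

    /** Converts an image to 8-bit grayscale using Rec. 601 luma weights (an assumption). */
    static BufferedImage toGrayscale(BufferedImage src) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int luma = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
                // Write the raw gray sample directly to avoid color-space conversion.
                out.getRaster().setSample(x, y, 0, luma);
            }
        }
        return out;
    }

    /** Rotates an image by 90 degrees clockwise; width and height swap. */
    static BufferedImage rotate90Clockwise(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(h, w, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Source pixel (x, y) lands at column h-1-y, row x.
                out.setRGB(h - 1 - y, x, src.getRGB(x, y));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(4, 2, BufferedImage.TYPE_INT_RGB);
        BufferedImage prepared = rotate90Clockwise(toGrayscale(page));
        System.out.println(prepared.getWidth() + "x" + prepared.getHeight());
    }
}
```

In practice the library's built-in filters replace this kind of hand-written code; the sketch only shows why a rotated or colored source can be normalized before recognition.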

Extract text from photos in Java

Integrate text detection and recognition in your Java applications. Obtain accurate results from photos with ease and extend your image processing capabilities.

Detect and recognize text on photo - Java

// Create instance of OCR API
AsposeOCR api = new AsposeOCR();

// Add a photo to the recognition batch
OcrInput images = new OcrInput(InputType.SingleImage);
images.add("photo.jpg");

// Set photo recognition mode
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setDetectAreasMode(DetectAreasMode.PHOTO);

// Extract text from a photo
ArrayList<RecognitionResult> results = api.Recognize(images, recognitionSettings);
results.forEach((result) -> {
  System.out.println(result.recognitionText);
});

Resource Optimization in Java

Optical character recognition demands resources. Our API offers flexible ways to balance the classic time-price-quality triad. It allows you to restrict the number of threads utilized by the recognition engine. While this adjustment may lead to a slower recognition speed, it enables you to allocate resources for concurrent tasks like parallel image processing, web server operations, database management, or background data analysis.

  • Choose between thorough recognition and fast recognition.
  • Specify the number of threads allocated for recognition, or allow the library to automatically scale to the number of processor cores.
  • Free up the CPU by offloading calculations to the GPU.

Balancing resource usage

RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setThreadsCount(2);
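
One possible way to choose the value passed to `setThreadsCount` is to start from the number of processor cores the JVM reports and hold a few back for other work. The sketch below uses only the standard library; `OcrThreadBudget` and `threadsFor` are hypothetical names introduced for illustration, and reserving two cores is an arbitrary assumption rather than a recommendation from the library.

```java
// Hypothetical helper, not part of Aspose.OCR: picks a thread count for OCR
// while leaving some cores free for other tasks.
final class OcrThreadBudget {
    private OcrThreadBudget() {}

    /**
     * Returns the number of threads to give the recognition engine.
     *
     * @param availableCores cores reported by the JVM (must be at least 1)
     * @param reservedCores  cores to keep free for other work (must not be negative)
     */
    static int threadsFor(int availableCores, int reservedCores) {
        if (availableCores < 1) {
            throw new IllegalArgumentException("availableCores must be >= 1");
        }
        if (reservedCores < 0) {
            throw new IllegalArgumentException("reservedCores must be >= 0");
        }
        // Always leave the engine at least one thread.
        return Math.max(1, availableCores - reservedCores);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = threadsFor(cores, 2); // reserve two cores (arbitrary choice)
        System.out.println("Cores: " + cores + ", OCR threads: " + threads);
        // The result would then be passed to the setter shown above:
        // recognitionSettings.setThreadsCount(threads);
    }
}
```

Keeping the calculation in a small pure function makes it easy to adjust the reserve per deployment, for example a larger reserve on a shared web server and none on a dedicated batch machine.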

Fast Recognition with minimal setup

If your images are high-quality scans without skew or distortion, you can use the fastest recognition mode, which consumes the minimum possible resources:

Fast Recognition OCR - Java

AsposeOCR api = new AsposeOCR();

// Add images to the recognition batch
OcrInput images = new OcrInput(InputType.SingleImage);
images.add("source1.png");
images.add("source2.png");

// Fast recognize images
ArrayList<RecognitionResult> results = api.RecognizeFast(images);
results.forEach((result) -> {
  System.out.println(result);
});