Why Aspose.OCR for Python via Java?

Aspose.OCR for Python via Java offers robust OCR functionality with an easy-to-use API. Convert images and PDFs to text in Python with just a few lines of code. The library is built for fast, accurate recognition and supports 27 languages written in Latin, Cyrillic, and Chinese scripts. Recognize scanned images, smartphone photos, screenshots, and scanned PDFs, saving results in popular document formats. Advanced pre-processing filters handle rotated, skewed, and noisy images. Optimize performance by offloading tasks to the GPU.

Swift and Accurate OCR

Achieve high-speed and accurate OCR results with our advanced Python via Java technology.

Multilingual Support

Recognize text in 27 languages written in Latin, Cyrillic, and Chinese scripts, ensuring versatility for your Python applications via Java.

Flexible Image Support

Process images from scanners, cameras, and smartphones seamlessly with Python via Java.

Precision in Chinese Character Recognition

Recognize over 6,000 Chinese characters with precision in your Python projects via Java.

Preserve Font Styles and Formatting

Maintain font styles and formatting for accurate representation of recognized text in your Python applications via Java.

Easy to Use

Start recognizing text from images in just a few lines of code. Experience the simplicity!


# Initialize the OCR engine
recognition_engine = AsposeOcr()

# Add an image to the batch
# (named ocr_input to avoid shadowing Python's built-in input())
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample.png")

# Extract text from the image
result = recognition_engine.recognize(ocr_input)

# Display the recognition result for the first image
print(result[0].recognition_text)

Java Backend Compatibility

Aspose.OCR for Python via Java seamlessly integrates with any platform supporting Java - whether on desktop Windows, Windows Server, macOS, Linux, or the cloud.

  • Microsoft Windows
  • Linux
  • macOS
  • GitHub
  • Microsoft Azure
  • Amazon Web Services
  • Docker

Supported file formats

Aspose.OCR for Python via Java works with the following file formats.

Source files

  • PDF
  • JPEG
  • PNG
  • TIFF
  • GIF
  • Bitmap

Batch sources

  • Multi-page PDF
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML

Effortless Installation for Python via Java

Aspose.OCR for Python via Java is delivered as a lightweight Python package or as a downloadable file for the Java backend, with minimal dependencies. Install it into your project, and you’re ready to recognize text in any of the supported languages and save recognition results in various formats.

Request a trial license to kickstart the development of a fully functional OCR application without limitations.

Java Backend Integration for Python Applications

Our library runs on the Java backend, so Python applications work on any platform that supports Java: desktop Windows, Windows Server, macOS, Linux, and the cloud.

27 Recognition Languages

Our Python via Java OCR API recognizes a wide range of languages and popular writing scripts, including mixed-language text. Leave language detection to the library or define the language yourself for better recognition performance and reliability.

Supported languages:

  • Extended Latin alphabet: Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish.
  • Cyrillic alphabet: Belarusian, Bulgarian, Kazakh, Russian, Serbian, Ukrainian.
  • Chinese: Recognize over 6,000 characters.

Streamlined Batch Recognition

Our Python via Java OCR API liberates you from recognizing images one by one. Employ various batch-processing methods to recognize multiple images in one call:

  • Recognition of multi-page PDFs, TIFFs, and DjVu files.
  • Recognition of all files in a folder.
  • Recognition of all files in an archive.
  • Recognition of all files from a list.
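
The batch sources listed above (a folder, an archive, a list of files) all come down to gathering file names before recognition. The following is a minimal standard-library sketch of that gathering step only; it does not call Aspose.OCR, and the helper names and the extension set (taken from the "Source files" list on this page) are assumptions made for illustration.

```python
import zipfile
from pathlib import Path

# Assumption: extensions derived from the "Source files" list on this page.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif", ".bmp"}


def is_supported(name: str) -> bool:
    """Return True if the file name ends with one of the supported extensions."""
    return Path(name).suffix.lower() in SUPPORTED_EXTENSIONS


def files_in_folder(folder: str) -> list[str]:
    """List supported files directly inside a folder, sorted by path."""
    return sorted(
        str(p) for p in Path(folder).iterdir() if p.is_file() and is_supported(p.name)
    )


def files_in_archive(archive: str) -> list[str]:
    """List supported member names inside a ZIP archive, sorted by name."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            n for n in zf.namelist() if not n.endswith("/") and is_supported(n)
        )


def files_from_list(paths: list[str]) -> list[str]:
    """Keep only the supported files from an explicit list, preserving order."""
    return [p for p in paths if is_supported(p)]
```

Each helper returns plain path strings, so the result can be inspected or filtered further before it is handed to whichever batch method your application uses.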

Learn by Python Examples via Java

Explore numerous examples written in Python via Java. Quickly familiarize yourself with functions and capabilities to create tailored solutions for your business needs.

Features and Capabilities

Explore the advanced features of Aspose.OCR for Python via Java.

Image and PDF to Text Conversion

Effortlessly convert images and PDFs into editable text using Python via Java OCR.

Support for All Image Formats

Process images in various formats - JPEG, PNG, TIFF, GIF, BMP - ensuring seamless image-to-text conversion with Python via Java.

Accurate Reading of Latin and Cyrillic Scripts

Read languages using Latin and Cyrillic scripts with precision, ensuring accurate OCR results via Python and Java integration.

Recognition of Chinese Characters

Achieve precise recognition of over 6,000 Chinese characters with advanced Python via Java OCR.

Identification of Popular Typefaces

Identify and recognize text in various popular typefaces, enhancing versatility in Python applications via Java.

Font Styles and Formats Preservation

Retain font styles and formatting for accurate text representation in Python projects via Java.

Whole Image or Selected Area Processing

Choose between processing the entire image or specific areas for OCR in your Python applications via Java.

Support for Challenging Images

Accurately recognize text in challenging images – rotated, skewed, and noisy – with ease in Python via Java.

Streamlined Batch Processing

Efficiently process multiple images in batch mode for increased productivity in Python applications via Java.

Python via Java Code Samples

Discover code samples to seamlessly integrate Aspose.OCR for Python via Java into your applications.

Effortless Installation

As a lightweight Python package or a downloadable file with minimal dependencies, Aspose.OCR for Python via Java ensures easy distribution. Integrate it into your project directly from Python, and you’re prepared to leverage complete OCR capabilities, saving recognition results in various formats.

After installation, you can start using Aspose.OCR for Python via Java right away, with certain limitations. A temporary license removes all trial version restrictions for 30 days. Use this period to start developing a fully functional OCR application and to make an informed decision about purchasing Aspose.OCR for Python via Java later.

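Apply a license - Python via Java

# Load the license file; self.licPath is the path to the license file,
# defined by the enclosing class this excerpt was taken from
lic = License()
lic.set_license(self.licPath)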

Image Recognition

A primary challenge for OCR applications is that most end users do not have a scanner. Our API, integrated with Python via Java, features robust built-in image pre-processing filters that handle rotated, skewed, and noisy images. Combined with support for all common image formats, this allows reliable recognition even from smartphone photos. Most pre-processing and image correction is automated and requires your intervention only in challenging cases.

Apply automatic image corrections - Python via Java

api = AsposeOcr()

# Set preprocessing options: automatically straighten skewed images
filters = PreprocessingFilter()
filters.add(PreprocessingFilter.auto_skew())

# Create OcrInput with the filters and add images
# (named ocr_input to avoid shadowing Python's built-in input())
ocr_input = OcrInput(InputType.SINGLE_IMAGE, filters)
ocr_input.add("sample.png")

# Set recognition options
settings = RecognitionSettings()
settings.set_detect_areas_mode(DetectAreasMode.TABLE)
settings.set_threads_count(1)
settings.set_language(Language.ENG)

# Recognize
result = api.recognize(ocr_input, settings)

# Print the result for the first image
print(result[0].recognition_text)

Universal Image Converter

The API, operable through Python via Java, reads virtually any image from a scanner, camera, or smartphone: PDF documents, JPEG, PNG, TIFF, GIF, BMP images, and even DjVu files. Fully supporting multi-page PDF documents, TIFF and DjVu images, it also accommodates images provided through web URLs.

Recognition results are returned in the most popular document and data exchange formats: plain text, PDF, Microsoft Word, Microsoft Excel, HTML, RTF, ePub, JSON, and XML.

Recognize PDF and save results to JSON and other formats - Python via Java

import os

api = AsposeOcr()

# Create OcrInput for PDF sources
# (named ocr_input to avoid shadowing Python's built-in input())
ocr_input = OcrInput(aspose.models.InputType.PDF)
# self.dataDir is the sample data directory of the enclosing class
file = os.path.join(self.dataDir, "pdfs/multi_page_1.pdf")
# Add the PDF, limited to the page range given by (0, 3)
ocr_input.add(file, 0, 3)

# Recognition options (named settings to avoid shadowing the built-in set)
settings = RecognitionSettings()
settings.set_detect_areas_mode(DetectAreasMode.NONE)
result = api.recognize(ocr_input, settings)

# Save the same recognition result in each supported output format
api.save_multipage_document("test.xml", Format.XML, result)
api.save_multipage_document("test.json", Format.JSON, result)
api.save_multipage_document("test.pdf", Format.PDF, result)
api.save_multipage_document("test.xlsx", Format.XLSX, result)
api.save_multipage_document("test.docx", Format.DOCX, result)
api.save_multipage_document("test.txt", Format.TEXT, result)
api.save_multipage_document("test.html", Format.HTML, result)
api.save_multipage_document("test.epub", Format.EPUB, result)
api.save_multipage_document("test.rtf", Format.RTF, result)
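
The sample above pairs each output file name with a Format member by hand. When the file name comes from user input, a small lookup can derive the member name from the extension. The sketch below uses only the standard library; the helper name format_name_for is hypothetical, and the member names are copied from the sample rather than from the library reference.

```python
from pathlib import Path

# Member names as they appear in the sample above (Format.XML, Format.JSON, ...).
FORMAT_NAME_BY_EXTENSION = {
    ".xml": "XML",
    ".json": "JSON",
    ".pdf": "PDF",
    ".xlsx": "XLSX",
    ".docx": "DOCX",
    ".txt": "TEXT",
    ".html": "HTML",
    ".epub": "EPUB",
    ".rtf": "RTF",
}


def format_name_for(path: str) -> str:
    """Return the Format member name matching the file extension of path.

    Hypothetical helper: raises ValueError for extensions not in the table.
    """
    ext = Path(path).suffix.lower()
    try:
        return FORMAT_NAME_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"No output format known for extension {ext!r}") from None
```

Assuming Format exposes these names as attributes, as the sample suggests, getattr(Format, format_name_for("test.docx")) would yield the value to pass as the second argument of save_multipage_document.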

Optimizing Resource Usage

Optical character recognition is a resource-intensive process. The API, accessible through Python via Java, provides flexible ways to strike a balance between time, price, and quality:

  • Choose between thorough recognition and fast recognition.
  • Specify the number of threads allocated for recognition, or allow the library to automatically scale to the number of processor cores.
  • Free up the CPU by offloading calculations to the GPU.
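
The thread-count option above can be reasoned about with a small helper that either honors an explicit request or scales to the number of processor cores. This is a standard-library sketch, not the library's own logic: the function name pick_thread_count and the choice to cap the request at the core count are assumptions made for illustration.

```python
import os


def pick_thread_count(requested: int = 0) -> int:
    """Return a worker thread count.

    requested > 0: use that value, capped at the number of logical cores.
    requested <= 0: scale automatically to the number of logical cores.
    """
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    if requested <= 0:
        return cores
    return min(requested, cores)
```

The returned value could then be passed to settings.set_threads_count(...), the call used in the image correction sample on this page.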

Fast recognition - Python via Java

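api = AsposeOcr()

# Create OcrInput and add images
# (named ocr_input to avoid shadowing Python's built-in input())
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("border.jpg")

# Recognize with default settings
result = api.recognize(ocr_input, RecognitionSettings())
# Recognize the same input with the street-photo method
result_street = api.recognize_street_photo(ocr_input)

# Print the result of the default recognition for the first image
print(result[0].recognition_text)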

Recognize car license plates - Python via Java

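import os

api = AsposeOcr()

# Create OcrInput and add images
# (named ocr_input to avoid shadowing Python's built-in input();
# self.dataDir is the sample data directory of the enclosing class)
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add(os.path.join(self.dataDir, "CarNumbers.jfif"))

# Recognize the license plate
result = api.recognize_car_plate(ocr_input, CarPlateRecognitionSettings())

# Print the result for the first image
print(result[0].recognition_text)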