Python OCR Library

Extract text from images in your Python app using the Python OCR library. Turn images into text with concise Python API code and get access to advanced OCR capabilities.

# Import the OCR classes (InputType lives in aspose.ocr.models)
from aspose.ocr import *
from aspose.ocr.models import *

# Initialize OCR engine
recognitionEngine = AsposeOcr()

# Add image to batch
input = OcrInput(InputType.SINGLE_IMAGE)
input.add("sample.png")

# Extract text from image
result = recognitionEngine.recognize(input)

# Display the recognition result
print(result[0].recognition_text)

> pip install aspose-ocr-python-net

Why Aspose.OCR for Python via .NET?

Aspose.OCR for Python via .NET is a versatile and user-friendly OCR API. Embed OCR functionality into your Python applications with fewer than 5 lines of code, with no need for complex math or neural networks. The OCR engine delivers fast and accurate recognition and supports 140+ languages, including English, Russian, Arabic, Persian, Hindi, Chinese, Japanese, Korean, Tamil and many more. Recognize scanned images, smartphone photos, screenshots, or scanned PDFs and get results in popular document and data exchange formats. Use pre-processing filters to handle rotated, skewed, and noisy images.

Efficient and precise OCR

Experience unparalleled speed and precision in OCR results with advanced Python technology.

Multilingual

Recognize text in 140+ languages: English, French, German, Spanish, Russian, Chinese, Hindi, Japanese, Korean, Tamil, Arabic, Persian, and more.

Universal

Effortlessly process images from diverse sources – scanners, cameras, and smartphones – using Python.

Asian languages

Achieve precise recognition of Chinese, Arabic, Devanagari and Dravidian scripts, as well as mixed-language texts.

Preserve layout

Maintain source formatting for accurate text representation, and recognize tables.

Choose your preference

Choose the right library for your needs. Explore available APIs and their capabilities to select the most efficient solution.

Python via .NET

Easy development, readability, and maintainability of the code

Boasts the most features and receives the most frequent updates

The overall speed may be a bit slower than other platforms

Python via Java

Use the same library on any platform

Seamlessly run your application on any device

Requires the Java Runtime Environment (JRE) version 8 or later

Python via C++

The fastest possible speed regardless of the platform

A great deal of control over resource management

Targeted towards experienced developers

Runs everywhere

Despite its name, Aspose.OCR for Python via .NET does not require .NET to be installed on the target platform. The installation package already includes all required components and runs on any platform: a local machine, a web server, or the cloud.

Supported file formats

Aspose.OCR for Python via .NET can work with any file you can get from a scanner or camera. Recognition results can be saved, imported to a database, or analyzed in real time.

Images

JPEG
PNG
TIFF
BMP
GIF

Batch OCR

Multi-page PDF
DjVu
ZIP
Folder

Recognition results

Text
PDF
Microsoft Word
Microsoft Excel
HTML
RTF
ePub
JSON
XML

Installation

Aspose.OCR for Python via .NET is delivered as a PyPI package or as a downloadable file, with minimal dependencies. Install it into your project, and you’re ready to recognize text in 140+ languages and save recognition results in various formats.

Request a trial license to kickstart the development of a fully functional OCR application without limitations.

OCR under Python

Our library integrates easily, enabling Python applications to run seamlessly on any platform – desktop Windows, Windows Server, macOS, Linux, and the cloud.

140+ Recognition Languages

Our Python OCR library is a universal solution for document processing, data extraction, and content digitization on a global scale. With support for a vast array of European, Middle Eastern and Asian writing scripts, it is well-adapted for any country and business.

Aspose.OCR for Python via .NET recognizes text in multilingual documents, such as Chinese/English, Arabic/French, or Cyrillic/English. The following languages are supported:

Extended Latin: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more;
Cyrillic alphabet: Russian, Ukrainian, Kazakh, Bulgarian, including mixed Cyrillic/English texts;
Arabic, Persian, Urdu, including texts mixed with English;
Chinese, Korean, Japanese, Devanagari, and Dravidian languages, including Hindi, Tamil, Marathi, and others. Mixed-language texts are also supported.

Powerful processing filters

The accuracy and reliability of optical character recognition is highly dependent on the quality of the original image. Aspose OCR for Python via .NET offers a large number of fully automated and manual image processing filters that enhance an image before it is sent to the OCR engine:

Automatically straighten upside-down and rotated images.
Detect inverted images and extract white-on-black text.
Automatically remove dirt, spots, scratches, glare, unwanted gradients, and other noise.
Automatically adjust the image contrast.
Automatically upscale, or manually resize the image.
Convert images to black and white or grayscale.
Find potentially problematic areas of an image and return the type of each defect and its coordinates.
Increase the thickness of characters in an image.
Blur noisy images while preserving the edges of letters.
Straighten page curvature and fix camera lens distortion for page photos.
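
To make two of these filters more concrete, here is a minimal pure-Python sketch of black-and-white conversion and inverted-image detection on a tiny grayscale grid. It is an illustration only, not the library's implementation or API: the function names `binarize`, `looks_inverted` and `invert`, the fixed threshold of 128, and the sample pixel values are all assumptions.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def binarize(image: Gray, threshold: int = 128) -> Gray:
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]


def looks_inverted(image: Gray, threshold: int = 128) -> bool:
    """Guess white-on-black text: more than half of the pixels are dark."""
    pixels = [px for row in image for px in row]
    dark = sum(1 for px in pixels if px < threshold)
    return dark > len(pixels) / 2


def invert(image: Gray) -> Gray:
    """Swap dark and light so that text becomes dark on a light background."""
    return [[255 - px for px in row] for row in image]


# Hypothetical 3x4 image: one light "text" column on a dark background
sample = [
    [10, 200, 20, 15],
    [12, 210, 25, 18],
    [11, 190, 22, 16],
]

if looks_inverted(sample):
    sample = invert(sample)
clean = binarize(sample)
```

After these steps `clean` holds dark text pixels on a white background, which is the form an OCR engine typically handles best.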

Optimized for specific document types

Aspose.OCR for Python via .NET offers specially trained neural networks to extract text from certain types of images with maximum accuracy.

Built-in spell checker

Although our Python OCR library provides high recognition accuracy, printing defects, dirt, or non-standard fonts may cause certain characters or words to be recognized incorrectly. To further improve recognition results, you can turn on the spell checker, which finds and automatically corrects spelling errors based on the selected recognition language.

If the recognized text contains specialized terminology, abbreviations, and other words which are not present in common spelling dictionaries, you can provide your own word lists.
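
The idea of dictionary-based correction with a custom word list can be sketched with the Python standard library. This is a conceptual illustration, not the library's spell checker: the function `correct_words`, the similarity cutoff of 0.75, and the sample word lists are assumptions.

```python
import difflib
from typing import Iterable, List


def correct_words(text: str, dictionary: Iterable[str], cutoff: float = 0.75) -> str:
    """Replace each word missing from the dictionary with its closest match, if any."""
    known = sorted({w.lower() for w in dictionary})
    corrected: List[str] = []
    for word in text.split():
        if word.lower() in known:
            corrected.append(word)  # already a known word, keep as-is
            continue
        matches = difflib.get_close_matches(word.lower(), known, n=1, cutoff=cutoff)
        # Keep the original word when nothing in the dictionary is close enough
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


# Hypothetical dictionaries: a common word list plus user-supplied terms
base_words = ["invoice", "total", "amount", "due"]
custom_words = ["aspose", "ocr"]

fixed = correct_words("lnvoice tota1 amount due", base_words + custom_words)
```

Adding specialized terms to `custom_words` keeps them from being "corrected" into unrelated common words.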

Batch recognition

Our Python OCR API liberates you from recognizing images one by one. Employ various batch-processing methods to recognize multiple images in one call:

Recognition of multi-page PDF, TIFF, and DjVu files.
Recognition of all files in a folder.
Recognition of all files in an archive.
Recognition of all files from a list.
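
If you prefer to build the list of files yourself, the selection step can be sketched with the standard library. The helper names `images_in_folder` and `images_in_archive` are hypothetical; the extension set is taken from the image formats listed on this page.

```python
import zipfile
from pathlib import Path
from typing import List

# Extensions for the image formats listed on this page
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".gif"}


def images_in_folder(folder: str) -> List[str]:
    """Return sorted paths of supported image files directly inside a folder."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def images_in_archive(archive: str) -> List[str]:
    """Return sorted names of supported image files stored in a ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist()
            if Path(name).suffix.lower() in IMAGE_EXTENSIONS
        )
```

Each returned path could then be passed to `input.add(...)` in the same way as in the samples on this page.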

Learning by sample

Aspose.OCR for Python via .NET comes with an array of examples written in Python, so you can quickly get familiar with its functions and capabilities and use them as a starting point for solutions tailored to your business needs.

Features and capabilities

Aspose.OCR for Python via .NET solves your tasks quickly and easily.

Photo OCR

Extract text from smartphone photos with scan-level accuracy.

Searchable PDF

Convert any scan into a fully searchable, indexable and editable document.

URL recognition

Recognize an image from URL without downloading it locally.

Bulk recognition

Read all images from multi-page documents, folders and archives.

Any font and style

Identify and recognize text in all popular typefaces and styles.

Fine-tune recognition

Adjust every OCR parameter for best recognition results.

Spell checker

Improve results by automatically correcting misspelled words.

Find text in images

Search for text or regular expression within a set of images.

Compare image texts

Compare texts on two images, regardless of the case and layout.
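
What searching and comparing means for already recognized text can be sketched with plain strings. This is an illustration under stated assumptions, not the library's own search or comparison methods: the helpers `normalize`, `texts_match` and `find_pattern` and the sample texts are hypothetical.

```python
import re
from typing import List


def normalize(text: str) -> str:
    """Lower-case the text and collapse all whitespace runs into single spaces."""
    return " ".join(text.lower().split())


def texts_match(first: str, second: str) -> bool:
    """Compare two recognized texts, ignoring case and layout (line breaks, spacing)."""
    return normalize(first) == normalize(second)


def find_pattern(text: str, pattern: str) -> List[str]:
    """Return every non-overlapping match of a regular expression in the text."""
    return re.findall(pattern, text)


# Hypothetical recognition results from two images of the same invoice
page_a = "Invoice No. 1024\nTotal:   99.50 EUR"
page_b = "INVOICE NO. 1024 total: 99.50 eur"

same = texts_match(page_a, page_b)
amounts = find_pattern(page_a, r"\d+\.\d{2}")
```

Here `same` is true because the two texts differ only in case and spacing, and `amounts` contains the single decimal amount found on the first page.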

Python code samples

Delve into code samples to seamlessly integrate OCR into your Python applications.

Installation

Aspose.OCR for Python via .NET is distributed as a Python wheel or as a self-contained downloadable package. You can add it to your Python project directly from your preferred Python integrated development environment (IDE). Once it is installed, the complete range of OCR capabilities is available, including saving recognition results in various formats.

After installation, you can start using Aspose.OCR for Python via .NET right away, with certain limitations. A temporary license lifts all trial version restrictions for 30 days. Use this period to start developing a fully functional OCR application, so that you can make an informed decision on purchasing Aspose.OCR for Python via .NET later.

Load license

# Load the license; replace the placeholder with the path to your license file
lic = License()
lic.set_license("path/to/license.lic")

Recognize text on Photos

Reading text from any content in Aspose OCR for Python is as easy as calling a universal recognition method.

Convert photo to text - Python

api = AsposeOcr()
# Add image to the recognition batch
input = OcrInput(InputType.SINGLE_IMAGE)
input.add("source1.png")

# Set recognition language (Ukrainian)
recognitionSettings = RecognitionSettings()
recognitionSettings.language = Language.UKR

# Recognize the image
results = api.recognize(input, recognitionSettings)

# Print recognition result
for result in results:
    print(result.recognition_text)

Python Universal Converter

Our API adeptly reads any image from scanners, cameras, or smartphones: PDF documents, JPEG, PNG, TIFF, GIF, BMP images, and even DjVu files. Full support for multi-page PDF documents, TIFF, and DjVu images ensures versatility. You can also provide an image from the web via a URL.

Recognition results are returned in popular document and data exchange formats: plain text, PDF, Microsoft Word, Microsoft Excel, HTML, RTF, ePub, JSON, and XML.

Recognize PDF and Save Results to various output formats - Python

import os

api = AsposeOcr()

# Create OcrInput and add pages from a PDF (start at page 0, take 3 pages)
data_dir = "."  # placeholder: folder that contains the sample files
input = OcrInput(InputType.PDF)
file = os.path.join(data_dir, "pdfs/multi_page_1.pdf")
input.add(file, 0, 3)

# Use recognition settings with areas detection set to NONE
settings = RecognitionSettings()
settings.detect_areas_mode = DetectAreasMode.NONE
result = api.recognize(input, settings)

# Save the same recognition result in each output format
api.save_multipage_document("test.xml", SaveFormat.XML, result)
api.save_multipage_document("test.json", SaveFormat.JSON, result)
api.save_multipage_document("test.pdf", SaveFormat.PDF, result)
api.save_multipage_document("test.xlsx", SaveFormat.XLSX, result)
api.save_multipage_document("test.docx", SaveFormat.DOCX, result)
api.save_multipage_document("test.txt", SaveFormat.TEXT, result)
api.save_multipage_document("test.html", SaveFormat.HTML, result)
api.save_multipage_document("test.epub", SaveFormat.EPUB, result)
api.save_multipage_document("test.rtf", SaveFormat.RTF, result)

Resource Optimization in Python

Optical character recognition demands resources. Our API offers flexible ways to balance the classic time-price-quality triad:

Choose between thorough recognition and fast recognition.
Specify the number of threads allocated for recognition, or allow the library to automatically scale to the number of processor cores.
Free up the CPU by offloading calculations to the .NET backend.
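
The trade-off between a fixed thread count and automatic scaling can also be sketched at the application level. This is not the library's internal threading: `recognize_all` is a hypothetical helper, and `fake_recognize` is a stand-in for a real OCR call so that the example runs without the package.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def recognize_all(
    paths: List[str],
    recognize_one: Callable[[str], str],
    threads: Optional[int] = None,
) -> List[str]:
    """Run recognize_one on every path using a fixed or auto-sized thread pool."""
    # No explicit thread count: fall back to the number of processor cores
    workers = threads if threads else (os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in the same order as the input paths
        return list(pool.map(recognize_one, paths))


def fake_recognize(path: str) -> str:
    """Stand-in for a real OCR call; returns a label derived from the file name."""
    return f"text from {os.path.basename(path)}"


results = recognize_all(["scans/a.png", "scans/b.png"], fake_recognize, threads=2)
```

Whether extra threads actually shorten processing time depends on the recognition call releasing the interpreter lock while it works, so measure before settling on a value.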

Fast Recognition - Python

api = AsposeOcr()

# Create OcrInput and add images
input = OcrInput(InputType.SINGLE_IMAGE)
input.add("sample_line.png")

result = api.recognize_fast(input)

Recognize single line

If your image is already trimmed to a single line of text, it can be recognized in the fastest possible mode, without automated corrections, content structure detection, and other resource-consuming steps. This can make OCR up to 7 times faster than the normal recognition process.

Recognize single line of text on image - Python

api = AsposeOcr()

# Create OcrInput and add images
input = OcrInput(InputType.SINGLE_IMAGE)
input.add("sample_line.png")

# recognize without regions detection
settings = RecognitionSettings()
settings.recognize_single_line = True

result = api.recognize(input, settings)

print(result[0].recognition_text)
