Why Opt for Aspose.OCR for Python via .NET?

Aspose.OCR for Python via .NET is a versatile and user-friendly OCR API. Embed OCR functionality into your Python applications with fewer than 5 lines of code, with no need for complex math or neural networks. Our OCR engine delivers high speed and accuracy and supports more than 130 languages, including Latin, Cyrillic, Arabic, Persian, Indic, and Chinese scripts. Whether the source is a scanned image, smartphone photo, screenshot, or scanned PDF, you get results in popular document and data exchange formats. Use pre-processing filters to handle rotated, skewed, and noisy images. Optimize recognition performance and system load by offloading resource-intensive tasks to the .NET backend.

Efficient and Precise OCR Mastery

Experience unparalleled speed and precision in OCR results with advanced Python and .NET technology.

Multilingual Excellence

Recognize text in 130+ languages, spanning Latin, Cyrillic, and Chinese scripts with ease and precision.

Adaptable Image Processing

Effortlessly process images from diverse sources – scanners, cameras, and smartphones – using Python and .NET.

Chinese Character Proficiency

Achieve precise recognition of over 6,000 Chinese characters, ensuring accuracy with Python and .NET.

Font Styles and Formats Preservation

Maintain font styles and formatting for accurate text representation, enhancing versatility with Python and .NET.

Live code sample

Experience simplicity: Convert an image to text in just a few lines of Python code!

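# Import the classes used below (import path assumed; adjust it to your installed package)
from aspose.ocr import AsposeOcr, OcrInput, InputType

# Initialize OCR engine
recognition_engine = AsposeOcr()

# Add image to batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample.png")

# Extract text from image
result = recognition_engine.recognize(ocr_input)

# Display the recognition result
print(result[0].recognition_text)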

Choose your preference

Choose the right library for your needs. Explore available APIs and their capabilities to select the most efficient solution.

Versatility

Python via .NET

Easy development, readability, and maintainability of the code
Boasts the most features and receives the most frequent updates
The overall speed may be a bit slower than other platforms

Uniformity

Python via Java

Use the same library on any platform
Seamlessly run your application on any device
Requires the Java Runtime Environment (JRE) version 8 or later

Performance

Python via C++

The fastest possible speed regardless of the platform
A great deal of control over resource management
Targeted towards experienced developers

.NET Empowerment for Python in Every Corner

Aspose.OCR for Python via .NET seamlessly operates on any platform supporting .NET Framework 4.0 and later – be it a local machine, web server, or the cloud.

Microsoft Windows
Linux
macOS
GitHub
Microsoft Azure
Amazon Web Services
Docker

Supported file formats

Aspose.OCR for Python via .NET can work with virtually any file you can get from a scanner or camera. Recognition results are returned in the most popular file and data exchange formats that can be saved, imported to a database, or analyzed in real time.

Images

  • JPEG
  • PNG
  • TIFF
  • BMP
  • GIF

Batch OCR

  • Multi-page PDF
  • DjVu
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML
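
As a rough illustration of importing recognition results into a database, the sketch below stores already-recognized text in an in-memory SQLite table and searches it by keyword. It uses only the Python standard library. The table name, the helper functions, and the sample texts are assumptions made for this example and are not part of Aspose.OCR.

```python
import sqlite3


def store_results(connection, results):
    """Save a mapping of source file name -> recognized text."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS ocr_results (source TEXT PRIMARY KEY, text TEXT)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO ocr_results (source, text) VALUES (?, ?)",
        list(results.items()),
    )
    connection.commit()


def search_results(connection, keyword):
    """Return the source names whose text contains the keyword (LIKE match)."""
    rows = connection.execute(
        "SELECT source FROM ocr_results WHERE text LIKE ? ORDER BY source",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


# Hypothetical recognition output, standing in for text returned by an OCR call
connection = sqlite3.connect(":memory:")
store_results(
    connection,
    {"invoice.png": "Invoice No. 1234", "memo.png": "Team meeting at 10:00"},
)
print(search_results(connection, "invoice"))
```

In a real application the dictionary would be filled from the recognition results rather than typed by hand.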

Installation for Python with .NET backend

Aspose.OCR for Python via .NET is delivered as a Python package with minimal dependencies or as a downloadable file. Easily install it into your project, and you’re ready to recognize texts in multiple supported languages and save recognition results in various formats.

Request a trial license to kickstart the development of a fully functional OCR application without limitations.

Powerful OCR for Python Applications

Our library integrates easily, enabling Python applications to run seamlessly on any platform – desktop Windows, Windows Server, macOS, Linux, and the cloud.

130+ Recognition Languages

Our Python and .NET OCR API recognizes a wide range of languages and popular writing scripts, including mixed languages:

  • Extended Latin alphabet: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more;
  • Cyrillic alphabet: Russian, Ukrainian, Kazakh, Serbian, Belarusian, Bulgarian;
  • Arabic, Persian, Urdu;
  • Chinese and Devanagari script, including Hindi, Marathi, Bhojpuri, and others.

Leave language detection to the library or define the language yourself for enhanced recognition performance and reliability.

Powerful processing filters

The accuracy and reliability of optical character recognition is highly dependent on the quality of the original image. Aspose.OCR for Python via .NET offers a large number of fully automated and manual image processing filters that enhance an image before it is sent to the OCR engine:

  • Automatically straighten images aligned at a slight angle to the horizontal.
  • Manually rotate severely skewed images.
  • Automatically remove dirt, spots, scratches, glare, unwanted gradients, and other noise.
  • Automatically adjust the image contrast.
  • Automatically upscale, or manually resize the image.
  • Convert images to black and white or grayscale.
  • Invert image colors so that light areas appear dark and dark areas appear light.
  • Increase the thickness of characters in an image.
  • Blur noisy images while preserving the edges of letters.
  • Straighten page curvature and fix camera lens distortion for page photos.
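
To make a few of the filters above more concrete, here is a minimal sketch of what contrast adjustment, black-and-white conversion, and color inversion do to pixel values. It models a grayscale image as nested lists of 0-255 values and uses plain Python only. The function names and the fixed threshold are assumptions for illustration; this is not how Aspose.OCR implements its filters.

```python
# Illustrative only: a grayscale image is modeled as rows of 0-255 values.
# These helpers are hypothetical and are not part of Aspose.OCR.


def stretch_contrast(image):
    """Rescale pixels linearly so the darkest becomes 0 and the lightest 255."""
    flat = [px for row in image for px in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged
        return [row[:] for row in image]
    return [[round((px - lo) * 255 / (hi - lo)) for px in row] for row in image]


def binarize(image, threshold=128):
    """Convert to black and white: pixels at or above the threshold become 255."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]


def invert(image):
    """Swap light and dark areas: each value v becomes 255 - v."""
    return [[255 - px for px in row] for row in image]


# A tiny low-contrast sample image
sample = [
    [100, 120, 140],
    [110, 150, 130],
]

print(stretch_contrast(sample))
print(binarize(sample))
print(invert(sample))
```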

Optimized for specific document types

Aspose.OCR for Python via .NET offers specially trained neural networks to extract text from certain types of images with maximum accuracy.

Built-in spell checker

Although Aspose.OCR for Python via .NET provides high recognition accuracy, printing defects, dirt, or non-standard fonts may cause certain characters or words to be recognized incorrectly. To further improve recognition results, you can turn on the spell checker, which finds and automatically corrects spelling errors based on the selected recognition language.

If the recognized text contains specialized terminology, abbreviations, and other words which are not present in common spelling dictionaries, you can provide your own word lists.
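
The idea of dictionary-based correction with a custom word list can be sketched with the standard library alone. The example below replaces unknown words with their closest dictionary match using difflib. The function name, the similarity cutoff, and the sample words are assumptions for illustration; the library's own spell checker may work quite differently.

```python
import difflib


def correct_words(text, dictionary, custom_words=(), cutoff=0.8):
    """Replace each unknown word with its closest match in the vocabulary.

    Hypothetical helper, not the Aspose.OCR spell checker. Words already in
    the vocabulary, and words with no close match, are left unchanged.
    """
    vocabulary = {w.lower() for w in dictionary} | {w.lower() for w in custom_words}
    candidates = sorted(vocabulary)
    corrected = []
    for word in text.split():
        if word.lower() in vocabulary:
            corrected.append(word)
            continue
        matches = difflib.get_close_matches(word.lower(), candidates, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


common = ["the", "invoice", "total", "run", "now"]

# A typical OCR slip: a dropped letter
print(correct_words("the invoce total", common))

# Specialized terms are only fixed when supplied in a custom word list
print(correct_words("run kubect1 now", common))
print(correct_words("run kubect1 now", common, custom_words=["kubectl"]))
```

Without the custom list the unfamiliar term is left as recognized, which is why supplying domain vocabulary matters.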

Batch Recognition Simplified

Our Python OCR API liberates you from recognizing images one by one. Employ various batch-processing methods to recognize multiple images in one call:

  • Recognition of multi-page PDF, TIFF, and DjVu files.
  • Recognition of all files in a folder.
  • Recognition of all files in an archive.
  • Recognition of all files from a list.
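
For comparison, this is roughly what collecting a batch by hand looks like when the images come from a folder or an archive. The sketch uses only the standard library, builds a throwaway folder and ZIP file for the demo, and does not call Aspose.OCR. The helper names and the extension list are assumptions for illustration.

```python
import tempfile
import zipfile
from pathlib import Path

# Assumed set of image extensions for this sketch
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".gif"}


def is_image(name):
    """Check the file extension against the assumed image list."""
    return Path(name).suffix.lower() in IMAGE_EXTENSIONS


def images_in_folder(folder):
    """Return the sorted names of image files directly inside a folder."""
    return sorted(
        p.name for p in Path(folder).iterdir() if p.is_file() and is_image(p.name)
    )


def images_in_archive(zip_path):
    """Return the sorted names of image entries inside a ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(n for n in archive.namelist() if is_image(n))


# Demo with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    scans = root / "scans"
    scans.mkdir()
    for name in ("page2.png", "page1.png", "notes.txt"):
        (scans / name).write_bytes(b"")

    zip_path = root / "scans.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("a.jpg", b"")
        archive.writestr("readme.md", b"")

    folder_batch = images_in_folder(scans)
    archive_batch = images_in_archive(zip_path)

print(folder_batch)
print(archive_batch)
```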

Learning Through Python Examples

Aspose.OCR for Python via .NET provides an array of examples written in Python, allowing you to quickly acquaint yourself with its functions and capabilities. Gain insights for creating tailored solutions to meet your Python business needs.

Features and Capabilities

Explore the advanced capabilities of Aspose.OCR for Python via .NET.

Photo OCR

Extract text from smartphone photos with scan-level accuracy.

Searchable PDF

Convert any scan into a fully searchable and indexable document.

URL recognition

Recognize an image from URL without downloading it locally.

Bulk recognition

Read all images from multi-page documents, folders and archives.

Any font and style

Identify and recognize text in all popular typefaces and styles.

Fine-tune recognition

Adjust every OCR parameter for best recognition results.

Spell checker

Improve results by automatically correcting misspelled words.

Find text in images

Search for text or regular expression within a set of images.

Compare image texts

Compare texts on two images, regardless of the case and layout.
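
The last two features can be pictured as operations on text that has already been recognized. The sketch below compares two texts while ignoring case and layout, and searches a set of texts with a regular expression. It is plain Python over strings; the function names and sample texts are hypothetical and do not reflect the library's API.

```python
import re


def normalize(text):
    """Lowercase the text and collapse all whitespace, so case and layout are ignored."""
    return " ".join(text.lower().split())


def same_text(first, second):
    """Compare two recognized texts regardless of case and line breaks."""
    return normalize(first) == normalize(second)


def find_in_texts(texts, pattern):
    """Return the names whose recognized text matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name, text in texts.items() if regex.search(text)]


# Hypothetical recognition output keyed by image name
recognized = {
    "a.png": "Invoice No. 1234",
    "b.png": "Hello world",
}

print(same_text("Total:  42\nUSD", "total: 42 usd"))
print(find_in_texts(recognized, r"invoice\s+no\.\s*\d+"))
```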

Python Code Samples

Delve into code samples to seamlessly integrate Aspose.OCR for Python via .NET into your Python applications.

Installation Mastery in Python

Aspose.OCR for Python via .NET is distributed as a Python Wheel or as a self-contained downloadable package. You can integrate it into your Python project directly from your preferred Python Integrated Development Environment (IDE). Simply install it, and you are ready to use the complete range of OCR capabilities and save recognition results in various formats.

After installation, you can start using Aspose.OCR for Python via .NET right away, albeit with certain limitations. A temporary license lifts all trial version restrictions for 30 days. Use this period to start developing a fully functional OCR application, so you can make an informed decision on purchasing Aspose.OCR for Python via .NET at a later stage.

Load license

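from aspose.ocr import License  # import path assumed

lic = License()
# Placeholder path: point it to your own license file
lic.set_license("Aspose.OCR.lic")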

Recognize text on Photos

Reading text from any content in Aspose.OCR for Python via .NET is as easy as calling a universal recognize() method.

Convert photo to text - Python

# Import path assumed; adjust it to your installed package
from aspose.ocr import AsposeOcr, OcrInput, InputType, RecognitionSettings, Language

api = AsposeOcr()

# Add image to the recognition batch
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("source1.png")

# Set recognition language (Ukrainian)
recognition_settings = RecognitionSettings()
recognition_settings.language = Language.UKR

# Recognize the image
results = api.recognize(ocr_input, recognition_settings)

# Print recognition result
for result in results:
    print(result.recognition_text)

Python Universal Converter

Our API adeptly reads any image from scanners, cameras, or smartphones: PDF documents, JPEG, PNG, TIFF, GIF, BMP images, and even DjVu files. Full support for multi-page PDF documents, TIFF, and DjVu images ensures versatility. You can also provide an image from the web via a URL.

Recognition results are returned in popular document and data exchange formats: plain text, PDF, Microsoft Word, Microsoft Excel, JSON, and XML.

Recognize PDF and Save Results to various output formats - Python

import os

# Import path assumed; adjust it to your installed package
from aspose.ocr import AsposeOcr, OcrInput, InputType, RecognitionSettings, DetectAreasMode, SaveFormat

api = AsposeOcr()

# Placeholder: folder that contains your input files
data_dir = "data"

# Create OcrInput and add the PDF; the numeric arguments select the pages to recognize
ocr_input = OcrInput(InputType.PDF)
pdf_path = os.path.join(data_dir, "pdfs/multi_page_1.pdf")
ocr_input.add(pdf_path, 0, 3)

# Turn off automatic detection of text areas
settings = RecognitionSettings()
settings.detect_areas_mode = DetectAreasMode.NONE
result = api.recognize(ocr_input, settings)

# Save the same recognition result in each supported output format
api.save_multipage_document("test.xml", SaveFormat.XML, result)
api.save_multipage_document("test.json", SaveFormat.JSON, result)
api.save_multipage_document("test.pdf", SaveFormat.PDF, result)
api.save_multipage_document("test.xlsx", SaveFormat.XLSX, result)
api.save_multipage_document("test.docx", SaveFormat.DOCX, result)
api.save_multipage_document("test.txt", SaveFormat.TEXT, result)
api.save_multipage_document("test.html", SaveFormat.HTML, result)
api.save_multipage_document("test.epub", SaveFormat.EPUB, result)
api.save_multipage_document("test.rtf", SaveFormat.RTF, result)

Resource Optimization in Python

Optical character recognition demands resources. Our API offers flexible ways to balance the classic time-price-quality triad:

  • Choose between thorough recognition and fast recognition.
  • Specify the number of threads allocated for recognition, or allow the library to automatically scale to the number of processor cores.
  • Free up the CPU by offloading calculations to the .NET backend.
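
The thread-count trade-off can be sketched in general terms with the standard library: use the requested number of worker threads, or fall back to the number of processor cores. The recognizer below is a stand-in that returns a placeholder string, and the function names are assumptions for illustration. The library manages its own threading internally, so this is not its implementation.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fake_recognize(image_name):
    """Stand-in for a real OCR call; returns a placeholder string."""
    return f"text from {image_name}"


def recognize_all(image_names, threads=None):
    """Process images in parallel.

    If no thread count is given, scale to the number of processor cores
    (falling back to 1 when the core count cannot be determined).
    """
    workers = threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the inputs
        return list(pool.map(fake_recognize, image_names))


print(recognize_all(["a.png", "b.png"], threads=2))
```

Fewer threads leave more of the machine free for other work, while more threads shorten the total time for a large batch.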

Fast Recognition - Python

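# Import path assumed; adjust it to your installed package
from aspose.ocr import AsposeOcr, OcrInput, InputType

api = AsposeOcr()

# Create OcrInput and add images
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample_line.png")

# Run recognition in fast mode
result = api.recognize_fast(ocr_input)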

Recognize single line

If your image is already trimmed to a single line of text, it can be recognized in the fastest possible mode, without automated corrections, content structure detection, and other resource-consuming steps. This can make OCR up to 7 times faster than the normal recognition process.

Recognize single line of text on image - Python

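# Import path assumed; adjust it to your installed package
from aspose.ocr import AsposeOcr, OcrInput, InputType, RecognitionSettings

api = AsposeOcr()

# Create OcrInput and add images
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
ocr_input.add("sample_line.png")

# Recognize without region detection
settings = RecognitionSettings()
settings.recognize_single_line = True

result = api.recognize(ocr_input, settings)

# Print the recognized line
print(result[0].recognition_text)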