Why Aspose.OCR for .NET?

Aspose.OCR for .NET is a powerful yet easy-to-use and cost-effective API for optical character recognition. With it, you can add OCR functionality to your .NET applications in just a few lines of code without worrying about complex math, neural networks, and other technical details. Our experience in machine learning technologies and years of development have resulted in an OCR engine with superior speed and accuracy that supports 27 languages based on Latin and Cyrillic scripts, as well as Chinese. The OCR API can recognize scanned images, smartphone photos, screenshots, areas of images, and scanned PDFs, and return results in the most popular document and data exchange formats. Various pre-processing filters allow you to recognize rotated, skewed, and noisy images. Recognition performance and system load can be further improved by offloading resource-intensive computational tasks to the GPU.

Fast and Accurate OCR

Achieve high-speed and accurate OCR results with our advanced technology.

Language Support

Recognize text in 27 languages written in Latin, Cyrillic, and Chinese scripts.

Versatile Image Support

Process images from various sources, such as scanners, cameras, and smartphones.

Chinese Character Recognition

Recognize more than 6,000 Chinese characters with precision.

Font Styles and Formatting

Preserve font styles and formatting for accurate representation of recognized text.

Easy to Use

You need only a few lines of code to recognize an image and display the result. Yes, it really is that simple!

// Initialize OCR engine
var recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add image to the recognition batch
var source = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
source.Add("<file name>");

// Perform OCR
List<Aspose.OCR.RecognitionResult> results
     = recognitionEngine.Recognize(source);
// Output recognized text
Console.WriteLine(results[0].RecognitionText);

Platform independence

Aspose.OCR for .NET can work on any platform that supports .NET Framework 4.0 and later, whether on a local machine, on a web server, or in the cloud.

Microsoft Windows
Linux
MacOS
GitHub
Microsoft Azure
Amazon Web Services
Docker

Supported file formats

Aspose.OCR for .NET can read virtually any image you can get from a scanner, camera, or smartphone.

Source files

  • JPEG
  • PNG
  • TIFF
  • BMP
  • GIF
  • Multi-page PDF
  • DjVu
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • HTML
  • RTF
  • ePub
  • JSON
  • XML

Easy to Install

Aspose.OCR for .NET is distributed as a lightweight NuGet package or as a downloadable file with minimal dependencies. Simply install it in your project and you are ready to recognize text in any supported language and save recognition results in any of the supported formats.

Request a temporary license to start building a fully functional OCR application without trial limitations.

Cross-Platform

The library fully supports .NET Standard 2.0. This means your applications can run on any platform: desktop Windows, Windows Server, macOS, Linux, and the cloud.

27 Recognition Languages

The OCR API recognizes 27 languages written in Latin, Cyrillic, and Chinese scripts, including texts with mixed languages. You can leave language detection to the library or specify the language yourself to improve recognition performance and reliability.

Supported languages:

  • Extended Latin alphabet: Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish.
  • Cyrillic alphabet: Belarusian, Bulgarian, Kazakh, Russian, Serbian, Ukrainian.
  • Chinese: more than 6,000 characters.
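Setting the language explicitly can be sketched as follows. This is a minimal example rather than a definitive implementation: it reuses only the RecognitionSettings.Language property and the Language.Eng value that appear in the code samples on this page, assumes the Aspose.OCR NuGet package is installed, and uses page.png as a placeholder file name.

```csharp
using System;
using System.Collections.Generic;
using Aspose.OCR;

class LanguageSelectionSketch
{
    static void Main()
    {
        // Add an image to the recognition batch ("page.png" is a placeholder)
        OcrInput input = new OcrInput(InputType.SingleImage);
        input.Add("page.png");

        // Define the recognition language instead of relying on auto-detection
        RecognitionSettings settings = new RecognitionSettings();
        settings.Language = Language.Eng;

        // Recognize the batch with the explicit language setting
        AsposeOcr api = new AsposeOcr();
        List<RecognitionResult> results = api.Recognize(input, settings);

        // Print the recognized text of every image in the batch
        foreach (RecognitionResult result in results)
        {
            Console.WriteLine(result.RecognitionText);
        }
    }
}
```

Omitting the settings argument, as in the first sample on this page, leaves language detection to the library.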

Features and capabilities

Explore the advanced features and capabilities of Aspose.OCR for .NET.

Converts images and PDFs to text

Utilize OCR to convert images and PDFs into editable text.

Supports all image formats

Process images in any format, including JPEG, PNG, TIFF, GIF, and BMP.

Reads languages based on Latin and Cyrillic

Accurately read languages using Latin and Cyrillic scripts.

Recognizes more than 6,000 Chinese characters

Achieve precise recognition of a vast number of Chinese characters.

Detects and recognizes all popular typefaces

Identify and recognize text in various popular typefaces.

Carefully preserves font styles and formatting

Retain font styles and formatting for accurate representation of text.

Processes the whole image or selected areas only

Choose between processing the entire image or specific areas for OCR.

Supports rotated, skewed and noisy images

Accurately recognize text in images with various challenges, including rotation and noise.

Batch recognition of all images in a folder or archive

Efficiently process multiple images in batch mode for increased productivity.

Code samples

Explore the code samples to learn how to integrate Aspose.OCR for .NET into your applications.

Installation

Aspose.OCR for .NET is distributed as a lightweight NuGet package or as a downloadable file with minimal dependencies. You can integrate it into your project directly from Microsoft Visual Studio. Simply install it, and you are ready to use the complete range of OCR capabilities and save recognition results in any of the supported formats.

After installation, you can start using Aspose.OCR for .NET right away, albeit with certain limitations. A temporary license lifts all trial version restrictions for 30 days. Use this period to start developing a fully functional OCR application, so you can make an informed decision about purchasing Aspose.OCR for .NET later.

Recognize Photos

The biggest barrier to OCR applications is that scanners are not commonplace for end users. The API has powerful built-in image pre-processing filters that can handle rotated, skewed, and noisy images. In combination with support for all image formats, it allows for reliable recognition of even smartphone photos. Most of the pre-processing and image correction is done automatically, so you will only have to intervene in difficult cases.

Apply automatic image corrections - C#

// Configure preprocessing filters
PreprocessingFilter filters = new PreprocessingFilter {
  PreprocessingFilter.ContrastCorrectionFilter(),
  PreprocessingFilter.AutoDewarping()
};

// Add a photo for recognition
OcrInput photos = new OcrInput(InputType.SingleImage, filters);
photos.Add("photo.png");

// Fine-tune recognition settings
RecognitionSettings settings = new RecognitionSettings();
settings.Language = Language.Eng;
settings.DetectAreasMode = DetectAreasMode.CURVED_TEXT;

// Extract text from a page
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> results = api.Recognize(photos, settings);

// Automatically correct spelling (English)
string text = results[0].GetSpellCheckCorrectedText(SpellCheckLanguage.Eng);
// Display recognized text
Console.WriteLine(text);

Universal converter

The API can read virtually any image you can get from a scanner, camera, or smartphone: PDF documents; JPEG, PNG, TIFF, GIF, and BMP images; and even DjVu files. Multi-page PDF documents and multi-page TIFF and DjVu images are fully supported. You can also provide an image from the web via a URL.

Recognition results are returned in the most popular document and data exchange formats: plain text, PDF, Microsoft Word, Microsoft Excel, JSON, and XML.

Recognize PDF and save results as a searchable PDF - C#

// Load the scanned PDF
OcrInput pdf = new OcrInput(InputType.PDF);
pdf.Add("Delivery-Agreement.pdf");

// Recognize the text from document
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> result = api.Recognize(pdf);

// Save searchable PDF
AsposeOcr.SaveMultipageDocument("Readable-Contract.pdf", SaveFormat.Pdf, result);
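// Report progress (Path.Combine keeps the path valid on Windows, Linux, and macOS)
string outputPath = Path.Combine(Directory.GetCurrentDirectory(), "Readable-Contract.pdf");
Console.WriteLine($"Recognition finished. See '{outputPath}'.");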

Resource Optimization

Optical character recognition is a resource-intensive process. The API offers flexible ways to strike a balance in the classic time-price-quality triad.

Fast recognition - C#

string sourceFolder = "images";
string searchFor = "OCR";

// Search for text in images
AsposeOcr api = new AsposeOcr();
foreach(var image in Directory.GetFiles(sourceFolder,"*.png"))
{
  bool found = api.ImageHasText(image, searchFor);
  if(found) Console.WriteLine($@"Found ""{searchFor}"" in image ""{image}""");
}