OCR for .NET

C# OCR library for .NET applications

.NET OCR library supporting 140+ recognition languages that extracts text from images and creates searchable PDFs with just a few lines of C# code.

using System;
using System.Collections.Generic;

// Initialize OCR engine
var recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add image to the recognition batch
var source = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
source.Add("image-with-text.png");

// Perform OCR
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(source);
// Output recognized text
Console.WriteLine(results[0].RecognitionText);

> dotnet add package Aspose.OCR

Why choose Aspose OCR library?

Build OCR capabilities into your .NET apps quickly. Our easy-to-use OCR API lets you extract text from images and scans, create searchable PDFs, and more with minimal C# code. It is suited to .NET desktop, web, cloud, and serverless applications.

Global OCR applications

C# OCR recognizes English, Cyrillic, Arabic, Persian, Chinese, Japanese, Korean, Hindi, Tamil, and mixed-language texts.

Read everything

Get text from any file obtained through a scanner or camera, and process images directly from web links.

Reliable results

Achieve high recognition accuracy for all images, including those that are out-of-focus, rotated, distorted, and noisy.

Batch recognition

Bulk-recognize all images from folders and archives; read multi-page PDF documents and TIFF images.

Layout detection

Identify and categorize content blocks in images to ensure the correct order of extracted text, regardless of layout.

Live code sample

With the Aspose OCR API, .NET OCR is straightforward, even for new developers. A few lines of code are enough to extract text from an image and display it on the screen.

// Initialize OCR engine
var recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add image to the recognition batch
var source = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
source.Add("<file name>");

// Perform OCR
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(source);
// Output recognized text
Console.WriteLine(results[0].RecognitionText);

Platform independence

The cross-platform OCR library runs wherever .NET, .NET Core, or .NET Framework is available: on a local machine, on a web server, or in the cloud.

Supported file formats

Aspose.OCR for .NET can work with any file you can get from a scanner or camera. Recognition results can be saved, imported to a database, or analyzed in real time.

Images

JPEG
PNG
TIFF
BMP
GIF

Batch OCR

Multi-page PDF
DjVu
ZIP
Folder

Recognition results

Text
PDF
Microsoft Word
Microsoft Excel
HTML
RTF
ePub
JSON
XML

Suitable for any content

The accuracy and reliability of text recognition in C# depend largely on image quality. .NET OCR offers a full set of automated and manual image optimization tools to improve recognition results.

Image processing, fully customizable text detection, post-processing, and automated spelling correction enable accurate text extraction from scans and photos.

OCR resource optimization

The Aspose C# OCR library lets you balance recognition speed, quality, and resource utilization for each use case:

Choose between thorough recognition and fast recognition.
Specify the number of threads allocated for recognition, or allow our .NET OCR library to automatically scale to the number of processor cores.
Free up the CPU by offloading the calculations to the GPU.

140+ recognition languages

Our C# OCR library is a universal solution for document processing, data extraction, and content digitization on a global scale. It supports a wide range of European, Middle Eastern, and Asian scripts, which makes it suitable for businesses in many countries.

You can recognize documents written in mixed languages, such as Chinese/English, Arabic/French or Cyrillic/English. The following languages are supported:

Extended Latin: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more;
Cyrillic alphabet: Russian, Ukrainian, Kazakh, Bulgarian, including mixed Cyrillic/English texts;
Arabic, Persian, Urdu, including texts mixed with English;
Chinese, Korean, Japanese, Devanagari, and Dravidian languages, including Hindi, Tamil, Marathi, and others.

Features and capabilities

C# OCR automatically extracts text from photos or scans, eliminating the need for manual retyping of documents.

Photo OCR

Extract text from smartphone photos with scan-level accuracy.

Searchable PDF

Convert any scan into a fully searchable and indexable document.

URL recognition

Recognize an image from a URL without downloading it locally.

Bulk recognition

Read all images from multi-page documents, folders and archives.

Any font and style

Identify and recognize text in all popular typefaces and styles.

Fine-tune recognition

Adjust every OCR parameter for best recognition results.

Spell checker

Improve results by automatically correcting misspelled words.

Find text in images

Search for text or a regular expression within a set of images.

Compare image texts

Compare the texts of two images, regardless of case and layout.

Easy to use OCR

With our C# OCR API, you only need a few lines of C# code to convert an image to text, create a searchable PDF, save recognition results to a document, and more. Explore the code samples to see how to integrate our OCR API into your .NET solutions.

Installation

.NET OCR is distributed as a NuGet package or as a downloadable file with minimal dependencies. You can add the package to your project directly from Microsoft Visual Studio. Once it is installed, you are ready to extract text from images and save recognition results in any of the supported formats. If your system has a CUDA-capable GPU, you can use the GPU-accelerated OCR engine to significantly increase recognition performance.

You can start using Aspose OCR for .NET right after installation, with some restrictions. A temporary license removes all limitations of the trial version for 30 days. Use it to start building a fully functional OCR application and decide later whether to purchase OCR for .NET.

Extract text from a photo

When people think of OCR (Optical Character Recognition), they usually picture a scanner as the capture device. The association is historical and still holds in many contexts: a scanner provides a consistent, controlled environment for capturing printed text from physical documents at high quality. However, a scanner is specialized equipment that is not always at hand and requires a stationary workstation. A smartphone camera is a convenient alternative. Even an entry-level smartphone now captures images of sufficient quality for OCR, and built-in storage makes it easy to digitize large quantities of documents, newspapers, books, street signs, and other text on the go. All you need is the right technology to convert those photos into machine-readable text.

Our C# OCR library is designed to recognize all types of images out of the box and can be fine-tuned to handle even low-quality photos. Combined with a modern smartphone, it lets you build OCR applications for most everyday scanning and text recognition tasks. Advanced image processing and document structure analysis take only a few lines of code, so you can focus on your business logic rather than on mathematical algorithms, neural networks, and other technical details.

Photo OCR - C#

// Configure preprocessing filters
PreprocessingFilter filters = new PreprocessingFilter {
  PreprocessingFilter.ContrastCorrectionFilter(),
  PreprocessingFilter.AutoDewarping()
};

// Add a photo for recognition
OcrInput photos = new OcrInput(InputType.SingleImage, filters);
photos.Add("photo.png");

// Fine-tune recognition settings
RecognitionSettings settings = new RecognitionSettings();
settings.Language = Language.Eng;
settings.DetectAreasMode = DetectAreasMode.CURVED_TEXT;

// Extract text from a page
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> results = api.Recognize(photos, settings);

// Automatically correct spelling (English)
string text = results[0].GetSpellCheckCorrectedText(SpellCheckLanguage.Eng);
// Display recognized text
Console.WriteLine(text);

Create a searchable PDF from the scan

PDF is one of the most popular formats for scanning paper documents, especially due to its ability to combine multiple pages into a single file. This format is widely used for the exchange of contracts, invoices, legal documents, passports and ID cards, and many other documents between individuals, businesses, banks and government agencies. However, any scanned PDF is essentially a collection of images. It does not contain machine-readable text, so users cannot search, copy, or otherwise manipulate the document content.

Aspose .NET OCR offers you a fast, easy and highly reliable way to convert any scanned PDF into a fully searchable and indexable document. It accurately recognizes page content, converting it into a machine-readable text layer over the original image that can be selected, copied, read by text-to-speech software, and even automatically processed by translators, summarizers, and other AI-powered analytics tools.

Add text overlay to PDF - C#

// Load the scanned PDF
OcrInput pdf = new OcrInput(InputType.PDF);
pdf.Add("Delivery-Agreement.pdf");

// Recognize the text from document
AsposeOcr api = new AsposeOcr();
List<RecognitionResult> result = api.Recognize(pdf);

// Save searchable PDF
AsposeOcr.SaveMultipageDocument("Readable-Contract.pdf", SaveFormat.Pdf, result);
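// Report where the searchable PDF was saved (Path.Combine keeps the path valid on any OS)
string outputPath = Path.Combine(Directory.GetCurrentDirectory(), "Readable-Contract.pdf");
Console.WriteLine($"Recognition finished. See '{outputPath}'.");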

Search for text in images

Digital archives, especially in large organizations, often consist of a vast collection of scans and photos, many of which may contain multi-page documents. Efficient management and organization of such archives is essential for easy information retrieval and navigation. However, images do not contain machine-readable text, making it impossible to search and analyze document content.

The C# OCR library lets you search for text in images regardless of font, text size, style, and other parameters. It also supports case-insensitive searches and regular expressions, which can be useful in many applications and industries. You can use this functionality to categorize documents by the content, keywords, or patterns found in the text; search for specific terms or clauses within agreements and contracts; reorganize files based on the keywords or content they contain; and locate personal data within documents, making it easier to ensure GDPR compliance and manage sensitive information. Searching within images also lets you create automated workflows and streamline business processes that start when signed contracts and invoices are received.

Search for text in images - C#

string sourceFolder = "images";
string searchFor = "OCR";

// Search for text in images
AsposeOcr api = new AsposeOcr();
foreach(var image in Directory.GetFiles(sourceFolder,"*.png"))
{
  bool found = api.ImageHasText(image, searchFor);
  if(found) Console.WriteLine($@"Found ""{searchFor}"" in image ""{image}""");
}
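
Pattern matching on recognized text - C# (illustrative sketch)

The sample above searches for a literal string. As a rough illustration of how case-insensitive and regular-expression matching behave, the sketch below applies a pattern to text that has already been recognized. It uses only the standard .NET Regex class, not the library's own search methods. The RecognizedTextSearch class, the FindMatches helper, and the sample file names and texts are hypothetical and stand in for real OCR output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RecognizedTextSearch
{
    // Returns the names of the images whose recognized text matches the pattern.
    // Matching ignores case, so "inv-" also matches "INV-".
    public static List<string> FindMatches(
        IDictionary<string, string> recognizedText, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
        var matches = new List<string>();
        foreach (var page in recognizedText)
        {
            if (regex.IsMatch(page.Value))
            {
                matches.Add(page.Key);
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Hypothetical data: image name -> text assumed to come from an OCR pass
        var recognizedText = new Dictionary<string, string>
        {
            ["invoice.png"] = "Invoice No. INV-2041, total due 150.00",
            ["letter.png"] = "Dear customer, thank you for your order."
        };

        // Look for an invoice number: "inv-" followed by four digits
        foreach (var name in FindMatches(recognizedText, @"inv-\d{4}"))
        {
            Console.WriteLine($"Pattern found in {name}");
        }
    }
}
```

Keeping the pattern logic separate from recognition makes it possible to test search rules on plain strings before running them against a large image archive.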