Recognize multipage TIFF in C#
Recognize multipage TIFF files using the Aspose.OCR for .NET library.
How to Recognize multipage TIFF using C#
To recognize a TIFF file, set the InputType to TIFF. You can also specify the start page and the number of pages to recognize. Pages are indexed starting from 0. By default, recognition results are returned for all pages in the document, which can take a considerable amount of time for large files.
The RecognizeMultipageTiff example processes multipage TIFF files. To run the examples, download the Aspose.OCR tools using the following link:
Download Command Line Tools
or run the example project in your IDE: RecognizeMultipageTiff project
Run the program in Command Prompt
RecognizeMultipageTiff
or
Run the program in Command Prompt with your own TIFF file, start page, and page count
RecognizeMultipageTiff folder/image.tiff 0 2
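The sample code refers to fileName, pageStart, and pageCount without showing where they come from. A minimal sketch of how the three command-line arguments could be parsed is given here. The class name TiffArgs, the default file name, and the default page values are hypothetical and are not taken from the Aspose.OCR example project.

```csharp
using System;

public static class TiffArgs
{
    // Hypothetical defaults used when an argument is omitted (assumptions, not Aspose values).
    private const string DefaultFile = "multipage.tiff";
    private const int DefaultPageStart = 0;
    private const int DefaultPageCount = 1;

    // Parses "<file> <startPage> <pageCount>"; missing values fall back to the defaults above.
    public static (string FileName, int PageStart, int PageCount) Parse(string[] args)
    {
        string fileName = args.Length > 0 ? args[0] : DefaultFile;
        int pageStart = args.Length > 1 ? int.Parse(args[1]) : DefaultPageStart;
        int pageCount = args.Length > 2 ? int.Parse(args[2]) : DefaultPageCount;

        // Pages are indexed from 0, so a negative start page is invalid.
        if (pageStart < 0)
            throw new ArgumentOutOfRangeException(nameof(args), "Start page must be 0 or greater.");
        if (pageCount < 1)
            throw new ArgumentOutOfRangeException(nameof(args), "Page count must be at least 1.");

        return (fileName, pageStart, pageCount);
    }

    public static void Main(string[] args)
    {
        var (fileName, pageStart, pageCount) = Parse(args);
        // With a 0-based start page, the last recognized page is pageStart + pageCount - 1.
        Console.WriteLine($"{fileName}: pages {pageStart}..{pageStart + pageCount - 1} (0-based)");
    }
}
```

Called with the arguments scan.tiff 0 2, this sketch selects the first two pages of the file (0-based pages 0 and 1). The parsed values can then be passed to input.Add as in the sample code.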
This sample code shows how to recognize a multipage TIFF
// Set the license file
//License lic = new License();
//lic.SetLicense("Aspose.Total.lic");
// Create an AsposeOcr instance.
// You can use the overloaded constructor to restrict the allowed characters.
AsposeOcr api = new AsposeOcr();
// Create an OcrInput object to hold the images.
// Add filters if you need them. Images are preprocessed automatically, but if the
// recognition result is still poor, you can configure your own set of filters:
// PreprocessingFilter filters = new PreprocessingFilter
// {
// PreprocessingFilter.Dilate()
// };
OcrInput input = new OcrInput(InputType.TIFF/*, filters*/);
// fileName, pageStart and pageCount come from the command-line arguments
input.Add(fileName, pageStart, pageCount);
// Set the recognition options (the start page and page count are passed to input.Add above)
var res = api.Recognize(input, new RecognitionSettings
{
// Available options:
// AllowedCharacters = CharactersAllowedType.LATIN_ALPHABET, // ignore non-Latin symbols
// AutoContrast = false, // apply the contrast correction filter before recognition; useful for noisy images
// AutoSkew = true, // switch off if your image is not rotated
// DetectAreas = true, // switch off if your image has a simple document structure (single-column text without pictures)
// DetectAreasMode = DetectAreasMode.DOCUMENT, // depends on the structure of your image
// IgnoredCharacters = "*-!@#$%^&", // symbols to omit from the recognition result
// Language = Language.Eng, // 26 languages are supported
// LinesFiltration = false, // slow; enable only if your picture contains lines that are detected badly in TABLE or DOCUMENT DetectAreasMode
// ThreadsCount = 1, // by default the API uses all available threads; set this to 1 to run in a single thread
// ThresholdValue = 150 // custom binarization threshold (from 1 to 255)
});
Console.WriteLine("RESULT");
Console.ResetColor();
Console.WriteLine("------------------------------------------------------------------------------");
for (int i = 0; i < res.Count; i++)
{
Console.WriteLine("------------------------------------------------------------------------------");
Console.WriteLine($"PAGE {i + 1} skew {res[i].Skew}");
Console.WriteLine("------------------------------------------------------------------------------");
Console.WriteLine(res[i].RecognitionText);
Console.WriteLine("------------------------------------------------------------------------------");
// you can print additional information here and spell-check the result
// you can also save each page result in your preferred file format
// res[i].Save(...);
// or convert the result to a JSON or XML string
// res[i].GetJson();
// res[i].GetXml();
}
// you can also save the result as a single multipage document
// AsposeOcr.SaveMultipageDocument("result.pdf", SaveFormat.Pdf, res);
Other Supported Tools
Using C#, one can easily run our examples.
Recognize image (GIF, PNG, JPEG, BMP, TIFF, JFIF)
Recognize PDF (Scanned PDF)
Recognize TIFF (Multipage TIFF)
Preprocess image (GIF, PNG, JPEG, BMP, TIFF, JFIF)
Recognize ZIP archive (ZIP)
Get JSON (GIF, PNG, JPEG, BMP, TIFF, JFIF)
Get XLSX (GIF, PNG, JPEG, BMP, TIFF, JFIF)
Detect angle (GIF, PNG, JPEG, BMP, TIFF, JFIF)
Recognize image from URL (URL with GIF, PNG, JPEG, BMP, TIFF, JFIF)
Text areas detection (GIF, PNG, JPEG, BMP, TIFF, JFIF)