The Image OCR control lets you, the application developer, add optical character recognition (OCR) functions to applications that support 32-bit ActiveX controls.
The OCR process analyzes an image and identifies text, pictures, and layout. This information is used to reconstruct the document.
You can design your program to process a document automatically or manually. For manual processing, your program prompts the user for layout details of the source document, such as whether it has single or multiple columns and whether it contains text, pictures, or both. If TIFF images are used, you can also let the user define text or picture regions of the document with a selection rectangle. This is useful for processing only part of a page, such as a single paragraph.
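The selection rectangle described above usually needs validation before it is passed to a recognition step. The following is a minimal Python sketch of that validation only. The names Region and clamp_region are hypothetical and are not part of the Image OCR control; the sketch assumes pixel coordinates with the origin at the top-left corner of the image.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """Hypothetical selection rectangle in pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def clamp_region(region: Region, image_width: int, image_height: int) -> Optional[Region]:
    """Trim a user-drawn rectangle to the image bounds.

    Returns None when the rectangle does not overlap the image at all.
    """
    left = max(0, region.left)
    top = max(0, region.top)
    right = min(image_width, region.left + region.width)
    bottom = min(image_height, region.top + region.height)
    if right <= left or bottom <= top:
        return None
    return Region(left, top, right - left, bottom - top)


# A rectangle dragged past the left and bottom edges of a 100 x 200 image.
print(clamp_region(Region(-10, 20, 50, 1000), 100, 200))
# Region(left=0, top=20, width=40, height=180)
```

Returning None for a rectangle that misses the image lets the calling program fall back to processing the whole page.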
User-defined dictionaries can complement the standard OCR dictionaries and improve word recognition.
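To illustrate how a user-defined dictionary can complement a standard one, here is a minimal Python sketch. It does not show how the Image OCR control uses dictionaries internally; build_lexicon and suggest are hypothetical helpers, the word lists are invented for illustration, and the fuzzy matching comes from Python's standard difflib module.

```python
import difflib


def build_lexicon(standard_words, user_words=()):
    """Combine standard and user-defined word lists into one lowercase set."""
    return {w.lower() for w in standard_words} | {w.lower() for w in user_words}


def suggest(word, lexicon, cutoff=0.8):
    """Return the word if it is known, else the closest lexicon entry, else None."""
    lowered = word.lower()
    if lowered in lexicon:
        return lowered
    matches = difflib.get_close_matches(lowered, sorted(lexicon), n=1, cutoff=cutoff)
    return matches[0] if matches else None


standard = ["invoice", "total"]          # stand-in for a standard dictionary
user = ["tiff"]                          # stand-in for a user-defined dictionary

print(suggest("TIFF", build_lexicon(standard)))        # None
print(suggest("TIFF", build_lexicon(standard, user)))  # tiff
```

In this sketch the domain term is only recognized once the user-defined list is merged in, which is the effect the user dictionaries are meant to have.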
The OCR process can be trained to recognize special characters or words. You can display a dialog box that lists the words or symbols that the program is unsure of, and let the user provide corrections. The corrections and other information are saved in a training file that can be reused.
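The idea of saving corrections and reusing them can be sketched as follows. This is an illustration only: the JSON layout and the function names are assumptions made for the example, not the training file format or API of the Image OCR control.

```python
import json
from pathlib import Path


def load_training(path):
    """Load saved corrections, or start empty if no file exists yet."""
    p = Path(path)
    if not p.exists():
        return {}
    with p.open(encoding="utf-8") as f:
        return json.load(f)


def record_correction(training, recognized, corrected):
    """Remember what the user typed in place of an uncertain word."""
    training[recognized] = corrected


def save_training(training, path):
    """Write the corrections back so a later session can reuse them."""
    with Path(path).open("w", encoding="utf-8") as f:
        json.dump(training, f, ensure_ascii=False, indent=2, sort_keys=True)


def apply_training(words, training):
    """Replace each word that has a saved correction; leave the rest unchanged."""
    return [training.get(w, w) for w in words]
```

A program would typically call load_training at startup, record_correction each time the user fixes a word in the dialog box, and save_training when the session ends.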
Depending on how you design and code your application, the Image OCR control lets your users convert scanned image documents to formatted, editable text.
Users can convert image documents to one of the following formats: