Optical character recognition

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an image (for example, from a television broadcast).^[1]

Widely used as a form of data entry from printed paper data records – whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printed data, or any suitable documentation – it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Early versions needed to be trained with images of each character, and worked on one font at a time. Advanced systems capable of producing a high degree of accuracy for most fonts are now common, and they support a variety of image file formats as input.^[2] Some systems are capable of reproducing formatted output that closely approximates the original page, including images, columns, and other non-textual components.
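As a rough illustration of why a system trained on images of each character is tied to one font, the sketch below classifies a glyph by comparing it with stored per-character templates and choosing the one with the fewest differing pixels. The 3×5 bitmaps, the `TEMPLATES` table, and the `hamming` and `classify` functions are hypothetical, invented for this example; they are not taken from any particular historical system.

```python
# Hypothetical 3x5 bitmaps for one made-up "font"; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###",
          ".#.",
          ".#.",
          ".#.",
          "###"],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
}


def hamming(a, b):
    """Count the pixel positions at which two same-sized bitmaps differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def classify(glyph, templates=TEMPLATES):
    """Return the label of the template closest to `glyph` in Hamming distance."""
    return min(templates, key=lambda label: hamming(glyph, templates[label]))


noisy_t = ["###",
           ".#.",
           ".#.",
           "...",   # one ink pixel lost, as a scanner might drop it
           ".#."]
print(classify(noisy_t))  # prints "T"
```

A glyph set in a different typeface would generally differ from every stored template in many pixels, which is one way to see why such a matcher would need new templates for each font.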

OCR engines have been developed into software applications specializing in various subjects such as receipts, invoices, checks, and legal billing documents.

The software can be used for:

Entering data for business documents, e.g. checks, passports, invoices, bank statements and receipts

Automatic number-plate recognition

Passport recognition and information extraction in airports

Automatically extracting key information from insurance documents

Traffic-sign recognition^[9]

Extracting business card information into a contact list^[10]

Creating textual versions of printed documents, e.g. book scanning for Project Gutenberg

Making electronic images of printed documents searchable, e.g. Google Books

Converting handwriting in real-time to control a computer (pen computing)

Defeating or testing the robustness of CAPTCHA anti-bot systems, though these are specifically designed to prevent OCR.^[11]^[12]^[13]

Assistive technology for blind and visually impaired users

Writing instructions for vehicles by identifying CAD images in a database that are appropriate to the vehicle design as it changes in real time

Making scanned documents searchable by converting them to PDFs

Optical character recognition (OCR) – targets typewritten text, one glyph or character at a time.

Optical word recognition – targets typewritten text, one word at a time (for languages that use a space as a word divider). Usually just called "OCR".

Intelligent character recognition (ICR) – also targets handwritten printscript or cursive text one glyph or character at a time, usually involving machine learning.

Intelligent word recognition (IWR) – also targets handwritten printscript or cursive text, one word at a time. This is especially useful for languages where glyphs are not separated in cursive script.
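To make the idea of working "one word at a time" concrete for scripts that use a space as a word divider, the following minimal sketch splits a text-line bitmap into word spans by looking for wide runs of ink-free columns. The function name `split_words`, the `min_gap` threshold, and the toy bitmap are assumptions made for illustration; production systems generally use more robust layout analysis.

```python
def split_words(line, min_gap=2):
    """Split a text-line bitmap into (start, end) column spans, one per word.

    `line` is a non-empty list of equal-length strings; '#' marks ink.
    A run of at least `min_gap` ink-free columns is treated as a word
    divider; narrower gaps are assumed to be spacing inside one word.
    """
    width = len(line[0])
    # Vertical projection: does column x contain any ink at all?
    has_ink = [any(row[x] == "#" for row in line) for x in range(width)]

    spans, start, last_ink = [], None, None
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x                            # first inked column of the line
        elif x - last_ink - 1 >= min_gap:
            spans.append((start, last_ink + 1))  # close the previous word
            start = x                            # open a new one
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


line = [
    "#.#...##.#",
    "#.#...#..#",
]
print(split_words(line))  # prints [(0, 3), (6, 10)]
```

Each returned span uses an exclusive end column, so `row[start:end]` would crop one word out of each row. The choice of `min_gap` decides whether a gap counts as a space between words or as spacing between characters, and in practice it would have to be tuned to the font size.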

OCR is generally an offline process, which analyses a static document. There are cloud-based services which provide an online OCR API. Handwriting movement analysis can be used as input to handwriting recognition.^[14] Instead of merely using the shapes of glyphs and words, this technique captures motion, such as the order in which segments are drawn, their direction, and the pattern of putting the pen down and lifting it. This additional information can make recognition more accurate. This technology is also known as "online character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition".
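The kind of motion information described here can be sketched as follows. The example takes pen input as a list of strokes and summarizes the drawing order, the direction of movement within each stroke, and the number of pen lifts. The data layout, the eight-direction quantization, and the names `DIRECTIONS`, `direction`, and `stroke_features` are assumptions chosen for illustration, not the method of any specific recognizer.

```python
import math

# Eight compass directions, counter-clockwise from east.
# Assumption of this sketch: y increases upward.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def direction(p, q):
    """Quantize the movement from point p to point q into one of 8 directions."""
    angle = math.atan2(q[1] - p[1], q[0] - p[0])  # radians in [-pi, pi]
    return DIRECTIONS[round(angle / (math.pi / 4)) % 8]


def stroke_features(strokes):
    """Summarize pen input as per-stroke direction codes plus pen lifts.

    `strokes` lists the strokes in the order drawn; each stroke is the list
    of (x, y) points sampled between one pen-down and the next pen-up.
    """
    codes = []
    for stroke in strokes:
        moves = [direction(p, q) for p, q in zip(stroke, stroke[1:]) if p != q]
        # Collapse repeats so the code records only changes of direction.
        codes.append([d for i, d in enumerate(moves) if i == 0 or d != moves[i - 1]])
    return {"pen_lifts": max(len(strokes) - 1, 0), "directions": codes}


# A capital "T": horizontal bar left to right, then the stem top to bottom.
t_strokes = [[(0, 2), (1, 2), (2, 2)], [(1, 2), (1, 1), (1, 0)]]
print(stroke_features(t_strokes))
# prints {'pen_lifts': 1, 'directions': [['E'], ['S']]}
```

Two writers who produce the same final shape with a different stroke order or direction would yield different feature values, and that is information a static image of the page does not contain.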

Techniques

Pre-processing

OCR software often pre-processes images to improve the chances of successful recognition. Techniques include:^[15]

Optical Character Recognition in Unicode