Optical Character Recognition Through the Use of a Kohonen Neural Network
Brian Vuyk, student at Redeemer University College
April 12, 2006

Optical Character Recognition (OCR) is a field of computer science that received much attention from the scientific community throughout the latter half of the 20th century. The ability of a computer to read a series of characters, determine their meaning, and take appropriate action based on that meaning was a sought-after goal, due in part to the high commercial value of these techniques. For example, OCR was deemed to be of great value to the postal system, in which millions of envelopes must typically be read and sorted daily to determine their delivery location. There was also great demand in the corporate world for the ability to digitize older documents so that they could be more easily read and modified. Libraries and other educational institutions wanted to digitize old books in order to better store their content. Another use of OCR is to recognize handwriting entered with a stylus on many personal planners and PDAs.1

The usefulness of any OCR method is directly related to its performance. If a document has been recognized with few errors, the time humans need to correct and perfect its contents is vastly reduced. OCR performance is typically measured in terms of unrecognized characters, substitution errors, insertion errors, and deletion errors. An unrecognized character is any character that the OCR engine cannot determine, and for which a standard placeholder is inserted. A substitution error occurs when a character is misrecognized and an incorrect character is substituted for it. An insertion error occurs when an extra character is added to a word, such as a ‘w’ decomposing into ‘vv’. A deletion error occurs when a character is missed in the recognition process.2
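The four error categories described above can be tallied automatically when a known-correct transcription is available. The following is a minimal sketch, not a method taken from this article: it assumes the comparison is done with a standard edit-distance (Levenshtein) alignment, and that the OCR engine marks unrecognized characters with a `~` placeholder. The names `PLACEHOLDER` and `count_ocr_errors` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumption: the OCR engine emits "~" for characters it cannot determine.
PLACEHOLDER = "~"


def count_ocr_errors(truth: str, ocr: str) -> Counter:
    """Align the correct text with the OCR output and tally error types."""
    n, m = len(truth), len(ocr)

    # dist[i][j] = minimum number of edits turning truth[:i] into ocr[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # character missing from OCR output
                dist[i][j - 1] + 1,         # extra character in OCR output
            )

    # Walk back through the table to recover one optimal alignment.
    errors = Counter(unrecognized=0, substitution=0, insertion=0, deletion=0)
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                if cost:
                    # A placeholder in the output counts as unrecognized,
                    # any other wrong character as a substitution.
                    if ocr[j - 1] == PLACEHOLDER:
                        errors["unrecognized"] += 1
                    else:
                        errors["substitution"] += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            errors["deletion"] += 1
            i -= 1
        else:
            errors["insertion"] += 1
            j -= 1
    return errors


if __name__ == "__main__":
    print(count_ocr_errors("window", "vvindow"))
    print(count_ocr_errors("hello", "helo"))
    print(count_ocr_errors("cat", "c~t"))
```

Under this alignment, a ‘w’ read as ‘vv’ is counted as one substitution plus one insertion. More than one optimal alignment can exist for a given pair of strings, so the split between categories may vary with the tie-breaking order used in the backtrace; the total number of errors does not.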