Introduction to Optical Character Recognition Project:

This project is about Optical Character Recognition (OCR), the process of classifying optical patterns as alphanumeric or other characters. The OCR process consists of segmentation, feature extraction, and classification.

Text capture converts analog, text-based resources into digital text resources. The converted resources can then be used in several ways, for example as searchable text in indexes that help identify documents or images.
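To illustrate how captured text can serve as a searchable index, here is a minimal sketch of an inverted index in Python. The file names, sample texts, and the functions `build_index` and `search` are hypothetical and chosen only for illustration. A real search engine would add ranking, stemming, and tolerance for OCR errors.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


# Hypothetical OCR output for three scanned pages.
documents = {
    "page1.txt": "Optical character recognition of printed text",
    "page2.txt": "Scanned image of a handwritten page",
    "page3.txt": "Recognition of handwritten characters",
}
index = build_index(documents)
print(sorted(search(index, "handwritten")))
print(sorted(search(index, "recognition handwritten")))
```

Each query word is looked up separately and the resulting sets are intersected, so a document is returned only if it contains all of the query words.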

The first stage of text capture is taking a scanned image of a page. This scanned copy forms the basis for all later stages. The next stage applies Optical Character Recognition to convert the text content into a machine-readable format.

OCR analysis takes a digital image of printed or handwritten text as input and converts it into machine-readable digital text. The OCR engine first breaks the digital image into small components to locate text, word, and character blocks. The character blocks are then broken down further into components, which are compared with a dictionary of characters.
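The two steps described above, splitting the image into character blocks and comparing each block with a dictionary of characters, can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the method used in the project: the image is a tiny binary grid written as strings, characters are separated by blank columns, and the three-letter `TEMPLATES` dictionary is hypothetical.

```python
# Hypothetical 5-row glyph templates; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def segment_columns(image):
    """Split a text line into glyphs at columns that contain no ink."""
    width = len(image[0])
    has_ink = [any(row[x] == "#" for row in image) for x in range(width)]
    glyphs, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last run
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            glyphs.append([row[start:x] for row in image])
            start = None
    return glyphs


def mismatch(glyph, template):
    """Count differing pixels; glyphs of a different size never match."""
    if len(glyph) != len(template) or len(glyph[0]) != len(template[0]):
        return float("inf")
    return sum(g != t
               for glyph_row, template_row in zip(glyph, template)
               for g, t in zip(glyph_row, template_row))


def recognize(image):
    """Classify each segmented glyph as the template with fewest mismatches."""
    return "".join(
        min(TEMPLATES, key=lambda ch: mismatch(glyph, TEMPLATES[ch]))
        for glyph in segment_columns(image)
    )


LINE = [
    "#...###.###",
    "#....#...#.",
    "#....#...#.",
    "#....#...#.",
    "###.###..#.",
]
print(recognize(LINE))
```

Counting mismatched pixels is the simplest possible comparison. Practical systems usually extract features from each block first, so that classification tolerates differences in size, font, and noise.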

MATLAB is an environment in which problems and solutions are expressed in familiar mathematical notation. Its uses include data analysis, algorithm development, numerical computation, and much more. MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This lets us solve our problem quickly and with relatively little code.

The OCR text is written to a plain text file, which is then imported into a search engine. The text serves as the index for searching the information. Accuracy rates can be measured in several ways, for example per character or per word, and the method of measurement affects the reported accuracy rate.
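To make the last point concrete, the following sketch computes accuracy for the same OCR output at the character level and at the word level. It assumes accuracy is defined as one minus the edit distance divided by the reference length, which is one common convention among several. The sample strings and function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete an item from ref
                curr[j - 1] + 1,         # insert an item from hyp
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not ref:
        raise ValueError("reference must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


# Hypothetical ground truth and OCR output with two misread characters.
reference = "optical character recognition"
ocr_output = "optical charaeter recognitlon"

char_acc = accuracy(reference, ocr_output)
word_acc = accuracy(reference.split(), ocr_output.split())
print(f"character accuracy: {char_acc:.3f}")
print(f"word accuracy:      {word_acc:.3f}")
```

The two misread characters cost little at the character level, but each one spoils a whole word, so the word-level figure for the same output is much lower.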

