Optical Character Recognition Project

Optical Character Recognition Project Introduction:

There is a growing demand for software systems that can recognize characters when information is scanned from paper documents, because a large number of newspapers and books on many different subjects exist only in printed form.

There is now a strong demand for storing the information in these paper documents on a computer storage disk and later reusing it through search. One simple way to store the information from paper documents in a computer system is to scan the documents and store them as IMAGES.

Reusing this information is difficult, however, because the contents of the scanned documents cannot easily be read and searched line by line and word by word. The reason for this difficulty is that the font characteristics of the characters in paper documents differ from those of the fonts in a computer system. As a result, the computer is unable to recognize the characters while reading them.

This concept of storing the contents of paper documents in computer storage and then reading and searching that content is called DOCUMENT PROCESSING. Sometimes document processing must also handle information written in languages other than English. Document processing requires a software system called a CHARACTER RECOGNITION SYSTEM. The process is also called DOCUMENT IMAGE ANALYSIS (DIA).

To use Optical Character Recognition effectively for character recognition in order to perform Document Image Analysis (DIA), we represent the information in a grid format. The system is therefore effective and useful in the design and construction of a virtual digital library.
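
The project description does not say exactly how the grid format is built, so the following is only a minimal sketch of one possible interpretation. It assumes a scanned character has already been converted into a binary bitmap (1 for ink, 0 for background), divides that bitmap into a grid of cells, and records the fraction of ink in each cell. The function name grid_features and the sample bitmap LETTER_L are hypothetical and are used here only for illustration.

```python
from typing import List

Bitmap = List[List[int]]  # assumption: 1 = ink pixel, 0 = background pixel


def grid_features(bitmap: Bitmap, rows: int, cols: int) -> List[List[float]]:
    """Split a binary character image into rows x cols cells and
    return the fraction of ink pixels in each cell."""
    height = len(bitmap)
    width = len(bitmap[0])
    features = []
    for r in range(rows):
        # Pixel rows covered by this grid row
        top = r * height // rows
        bottom = (r + 1) * height // rows
        row_features = []
        for c in range(cols):
            # Pixel columns covered by this grid column
            left = c * width // cols
            right = (c + 1) * width // cols
            cell = [bitmap[y][x]
                    for y in range(top, bottom)
                    for x in range(left, right)]
            # An empty cell (grid finer than the image) counts as no ink
            row_features.append(sum(cell) / len(cell) if cell else 0.0)
        features.append(row_features)
    return features


# Hypothetical 4x4 bitmap of the letter "L"
LETTER_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

print(grid_features(LETTER_L, 2, 2))  # [[0.5, 0.0], [0.75, 0.5]]
```

A recognizer could then compare such a grid against stored grids for known characters. How the comparison is done is not described in the source and is left open here.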

