The scope of our Optical Character Recognition (OCR) project, written in Java on a grid infrastructure, is to provide an efficient, enhanced software tool for Document Image Analysis and document processing. It reads and recognizes characters for research, academic, governmental, and business organizations that hold large pools of scanned document images. Irrespective of the size of the documents and the type of characters they contain, the product recognizes, searches, and processes them quickly according to the needs of the environment.
Drawback of Existing System
The drawback of early OCR systems is that they can convert and recognize documents only in English or in one specific language. That is, the older OCR systems are unilingual.
Benefit of Proposed System
The proposed system overcomes the drawback of the existing system by supporting multiple functionalities such as editing and searching. It adds a further benefit by recognizing heterogeneous characters.
ARCHITECTURE OF THE PROPOSED SYSTEM
The architecture of the optical character recognition system on a grid infrastructure consists of three main components:
- OCR Hardware or Software
- Output Interface
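To illustrate how recognition work might be spread over grid nodes, the following minimal sketch uses a local thread pool as a stand-in for the grid. The class name `GridOcrSketch`, the methods `recognizePage` and `recognizeAll`, and the use of plain strings in place of scanned page images are all assumptions made for illustration; they are not taken from the project itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class GridOcrSketch {

    // Hypothetical placeholder for the recognition step run on one node.
    // A real system would analyze a scanned image; here a string is upper-cased.
    static String recognizePage(String pageImage) {
        return pageImage.toUpperCase(Locale.ROOT);
    }

    // Distribute pages over a fixed pool of workers (assumed stand-in for grid
    // nodes) and collect the results in the original page order.
    public static List<String> recognizeAll(List<String> pages, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String page : pages) {
                tasks.add(() -> recognizePage(page));
            }
            List<String> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Recognition was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Recognition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("page one", "page two", "page three");
        System.out.println(recognizeAll(pages, 2));
    }
}
```

Because each page is recognized independently, the number of workers can be raised without changing the calling code, which is the property a grid deployment would rely on.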
Modules and their functionalities
Our software system, Optical Character Recognition on a grid infrastructure, can be divided into five modules based on functionality. The modules are as follows:
- Document Processing Module
- System Training Module
- Document Recognition Module
- Document Editing Module
- Document Searching Module
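The five modules can be pictured as five operations on a single class. The sketch below is only a toy under stated assumptions: glyphs are represented as space-separated bit strings rather than segmented images, training is a simple lookup table, and every name (`OcrModulesSketch`, `process`, `train`, `recognize`, `edit`, `search`) is hypothetical rather than the project's actual design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OcrModulesSketch {

    // State built by the System Training Module: glyph pattern -> character.
    private final Map<String, Character> trained = new HashMap<>();

    // Document Processing Module: split a scanned line into glyph patterns.
    // Assumption: each glyph is a whitespace-separated bit string.
    public List<String> process(String scannedLine) {
        List<String> glyphs = new ArrayList<>();
        for (String g : scannedLine.trim().split("\\s+")) {
            if (!g.isEmpty()) {
                glyphs.add(g);
            }
        }
        return glyphs;
    }

    // System Training Module: associate a glyph pattern with a character.
    public void train(String glyph, char label) {
        trained.put(glyph, label);
    }

    // Document Recognition Module: map each glyph to its trained character,
    // using '?' for patterns that were never trained.
    public String recognize(List<String> glyphs) {
        StringBuilder text = new StringBuilder();
        for (String g : glyphs) {
            text.append(trained.getOrDefault(g, '?'));
        }
        return text.toString();
    }

    // Document Editing Module: replace every occurrence of a substring.
    public String edit(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    // Document Searching Module: index of the first match, or -1 if absent.
    public int search(String text, String query) {
        return text.indexOf(query);
    }

    public static void main(String[] args) {
        OcrModulesSketch ocr = new OcrModulesSketch();
        ocr.train("0110", 'a');
        ocr.train("1001", 'b');
        String text = ocr.recognize(ocr.process("0110 1001 1111"));
        String corrected = ocr.edit(text, "?", "c");
        System.out.println(text + " -> " + corrected
                + ", \"bc\" found at " + ocr.search(corrected, "bc"));
    }
}
```

Keeping the trained table separate from the recognition step reflects the idea that supporting a new script would mean training on new glyphs rather than changing the recognizer.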
The Optical Character Recognition software can be enhanced in the future in several ways, such as:
- Training and recognition speeds can be increased further, and the software can be made more user-friendly.
- Many applications exist where it would be desirable to read handwritten entries. Reading handwriting is a very difficult task considering the diversities that exist in ordinary penmanship. However, progress is being made.