Text Segmentation for MRC Document Compression ECE Matlab Project

Mixed raster content, or MRC for short, is a framework for document compression. It can improve the trade-off between compression ratio and quality in comparison to the lossy algorithms that are traditionally used for compression. MRC compression separates the document into a foreground layer and a background layer, and a binary mask records which layer each pixel belongs to. The compression ratio and the resulting quality of an MRC-encoded document depend on the algorithm used to compute the binary mask.
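The layer separation can be illustrated with a small sketch. This is not the project's encoder; the function names, the toy image, and the `fill` placeholder value are assumptions made only for illustration. It shows how a binary mask splits a grayscale image into foreground and background layers and how the same mask recombines them.

```python
# Illustrative MRC-style layering (hypothetical helper names, toy data).
# mask value 1 = foreground pixel (e.g. text), 0 = background pixel.

def mrc_decompose(image, mask, fill=0):
    """Split an image into foreground and background layers using a mask.

    Pixels that do not belong to a layer are set to `fill`. This is a
    placeholder; a real encoder would pick fill values that compress well.
    """
    foreground = [[p if m else fill for p, m in zip(irow, mrow)]
                  for irow, mrow in zip(image, mask)]
    background = [[fill if m else p for p, m in zip(irow, mrow)]
                  for irow, mrow in zip(image, mask)]
    return foreground, background


def mrc_reconstruct(foreground, background, mask):
    """Rebuild the image: take the foreground where mask is 1, else background."""
    return [[f if m else b for f, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(foreground, background, mask)]


# Toy 2x3 grayscale image: dark values stand in for text, bright for paper.
image = [[250, 20, 245],
         [30, 25, 240]]
mask = [[0, 1, 0],
        [1, 1, 0]]

fg, bg = mrc_decompose(image, mask)
restored = mrc_reconstruct(fg, bg, mask)
```

In this lossless toy case `restored` equals `image`. In an actual MRC codec each layer would be compressed separately, so the mask quality determines how much of the text ends up in the layer that preserves sharp edges.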

To improve text detection accuracy, a multiscale framework is used so that text of various sizes can be detected. Networked equipment such as computers, printers, copiers, and scanners is widely used, and over time it has become more necessary to store, transfer, and compress large files and documents efficiently.

In previous work, text and images were often not compressed before storage because compression led to loss of data. A color document scanned at 300 dpi needs approximately 24 Mbytes of storage without compression. JPEG and JPEG2000 are the tools most frequently used to compress natural images, but they are not very effective for scanned compound raster documents, which typically consist of text, graphics, and images. This is because they apply a fixed DCT or wavelet transform to all types of content, which results in ringing distortion near edges and line art.
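The 24 Mbytes figure can be checked with a short calculation. The page size and color depth below are assumptions, since the text does not state them: a US Letter page (8.5 x 11 inches) scanned in 24-bit color.

```python
# Rough size of an uncompressed scan. Page size and color depth are
# assumptions (US Letter, 3 bytes per pixel); the text gives only the dpi.

def raw_scan_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return the uncompressed size in bytes of a scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


size = raw_scan_bytes(8.5, 11, 300)   # 2550 x 3300 pixels x 3 bytes
size_mib = size / 2**20               # about 24 Mbytes
print(size, round(size_mib, 2))
```

Under these assumptions the page holds 2550 x 3300 pixels, or 25,245,000 bytes, which is consistent with the figure quoted above.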

Some of the disadvantages are:

The text detection accuracy is quite low

Compressed data can be lost, along with storage details, while the information is transmitted

Large document files cannot be transferred

Otsu's method is the simplest and most traditional approach in our proposed system. It divides the image histogram into two parts by choosing a single threshold.
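A minimal sketch of Otsu's method follows, written in plain Python rather than Matlab so that it runs without extra toolboxes. The function name, the toy pixel values, and the choice to treat dark pixels as text are assumptions for illustration. The threshold is the gray level that maximizes the between-class variance of the two histogram parts.

```python
# Otsu's method on a flat list of 8-bit gray levels (illustrative sketch).

def otsu_threshold(pixels, levels=256):
    """Return t such that pixels <= t and pixels > t form the two classes
    with the largest between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # lower class still empty
        if w1 == 0:
            break           # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1/total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy data: four dark "text" pixels and four bright "paper" pixels.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(pixels)                 # 13 for this toy data
mask = [1 if p <= t else 0 for p in pixels]  # assumption: dark = text
```

Because a single global threshold is applied to the whole page, this approach can struggle when the background or the text color varies across the document.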
