Abstract:
Preprocessing is an important stage in a document image analysis (DIA) system. In practical document images, characters often overlap with preprinted form frames, which creates serious problems for recognition engines. Most form frame line removal algorithms operate on bi-level images, which have lost much useful information during binarization. This paper proposes a line removal algorithm that works directly on gray-level images. First, cross-points of characters and lines are detected using the Sobel gradient. Then, the overlapping types of characters and lines are converted into the touch type or the crossover type by cross-point analysis. Finally, lines are removed with a topological method. Experimental results on 1225 real-life character string images demonstrate the effectiveness of this algorithm: the recognition rate improves from 75.9% to 91.4%.
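
To illustrate the first step only, the following is a minimal sketch of computing a Sobel gradient on a gray-level image with NumPy. It is not the algorithm of this paper: the abstract does not say how gradient values are turned into cross-points, so no thresholds or cross-point rules are shown. The names sobel_gradient and demo_image, the edge-replicating border handling, and the 7x7 test image are assumptions made for this example.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the horizontal (x) and vertical (y) derivatives.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def sobel_gradient(gray):
    """Return (gx, gy, magnitude) for a 2-D gray-level image.

    Borders are handled by edge replication (an assumption of this sketch),
    so every output has the same shape as the input.
    """
    img = np.asarray(gray, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Cross-correlate with each kernel by summing shifted, weighted copies.
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += SOBEL_X[dy, dx] * window
            gy += SOBEL_Y[dy, dx] * window
    return gx, gy, np.hypot(gx, gy)


def demo_image():
    """Hypothetical test image: bright background, one dark horizontal line."""
    img = np.full((7, 7), 255.0)
    img[3, :] = 0.0
    return img


if __name__ == "__main__":
    gx, gy, magnitude = sobel_gradient(demo_image())
    print(magnitude)
```

On this synthetic image the gradient magnitude is nonzero only on the two rows adjacent to the dark line, and the horizontal component is zero everywhere because the image does not vary along x. In a real form image, a character stroke touching or crossing a frame line would disturb this regular pattern, which is the kind of local cue a gradient-based cross-point detector could use.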