Abstract:
In real-world form and bill images, characters often overlap the form frame lines, which significantly degrades the performance of automatic document image processing systems. Most frame line removal algorithms operate on binary images and therefore cannot make good use of the line characteristics available in gray-level images. Based on the structural attributes of financial documents, this paper proposes an improved line detection and removal algorithm for preprocessing financial form images. To reduce complexity and improve the quality of line removal, line detection and line removal are carried out as separate steps. First, frame lines are accurately detected from their characteristics in the gray-level image. Then a chain code method is used to describe the frame line region. Next, the crosspoints of characters and lines are detected with a deterministic finite automaton in order to analyse the types of overlap. Finally, the frame lines are removed using the marks produced during crosspoint detection. In this way, the stroke distortion caused by thresholding is avoided and a higher line removal accuracy can be achieved. Experimental results, evaluated through handwritten digit recognition and compared with several existing methods, demonstrate that the proposed algorithm is efficient and robust.
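The crosspoint detection step can be illustrated with a minimal sketch. It is not the automaton defined in the paper: the symbol alphabet ('n', 'a', 'b', 'x'), the state names, the ink threshold and the function names column_symbols and find_crosspoints are all assumptions made for illustration. The sketch assumes a horizontal frame line already detected at rows top..bottom of a gray-level image, encodes each column by whether ink touches the line from above, from below, or from both sides, and runs a small deterministic finite automaton over the columns to group them into overlap regions with a type.

```python
# Illustrative sketch only: symbols, states and threshold are assumptions,
# not the automaton defined in the paper.

INK_THRESHOLD = 128  # assumed: gray values below this count as ink


def column_symbols(gray, top, bottom, ink_threshold=INK_THRESHOLD):
    """Encode each column by where ink touches the frame line band.

    gray   : list of rows of gray values (dark ink on light paper)
    top    : first row of the detected horizontal frame line (must be >= 1)
    bottom : last row of the frame line (must be below the last image row)
    Returns one symbol per column: 'n' none, 'a' above, 'b' below, 'x' both.
    """
    above_row = gray[top - 1]      # row just above the line band
    below_row = gray[bottom + 1]   # row just below the line band
    symbols = []
    for above, below in zip(above_row, below_row):
        ink_above = above < ink_threshold
        ink_below = below < ink_threshold
        if ink_above and ink_below:
            symbols.append("x")
        elif ink_above:
            symbols.append("a")
        elif ink_below:
            symbols.append("b")
        else:
            symbols.append("n")
    return symbols


# Transition table of the (hypothetical) automaton.
# IDLE: no overlap; ABOVE/BELOW: a stroke touches one side; CROSS: it passes through.
TRANSITIONS = {
    ("IDLE", "n"): "IDLE",   ("IDLE", "a"): "ABOVE",
    ("IDLE", "b"): "BELOW",  ("IDLE", "x"): "CROSS",
    ("ABOVE", "n"): "IDLE",  ("ABOVE", "a"): "ABOVE",
    ("ABOVE", "b"): "CROSS", ("ABOVE", "x"): "CROSS",
    ("BELOW", "n"): "IDLE",  ("BELOW", "a"): "CROSS",
    ("BELOW", "b"): "BELOW", ("BELOW", "x"): "CROSS",
    ("CROSS", "n"): "IDLE",  ("CROSS", "a"): "CROSS",
    ("CROSS", "b"): "CROSS", ("CROSS", "x"): "CROSS",
}


def find_crosspoints(symbols):
    """Run the automaton and return (start_col, end_col, overlap_type) tuples."""
    regions = []
    state = "IDLE"
    start = 0
    # A trailing 'n' acts as a sentinel so a region at the right edge is closed.
    for col, symbol in enumerate(list(symbols) + ["n"]):
        next_state = TRANSITIONS[(state, symbol)]
        if state == "IDLE" and next_state != "IDLE":
            start = col                              # a new overlap region begins
        elif state != "IDLE" and next_state == "IDLE":
            regions.append((start, col - 1, state))  # region ends, record its type
        state = next_state
    return regions
```

In a removal step built on such marks, columns outside every region could be cleared safely, while columns inside a CROSS region would be kept or repaired so that the character stroke is not cut. How the paper itself treats each overlap type is not stated in the abstract.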