    A Form Frame Line Removal Algorithm Based on Gray-Level Image

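    • Abstract (translated from the Chinese): Overlap between character strokes and form frame lines is common in form-type documents and seriously degrades the performance of automatic document processing systems. Most existing line removal algorithms operate on binary images, in which much useful local information has already been lost. This paper proposes a gray-level line detection and removal algorithm that uses the gray-level information of the image directly. First, edge features of the image are used to detect the lines and the positions where characters intersect them. Then, pairs of intersection points along each line are analyzed to determine how the characters overlap the line, and the overlap patterns are reduced to two simple forms: crossing and non-crossing. Finally, each line is divided into protected regions and erase regions; pixels in protected regions are kept during line removal, while pixels in erase regions are erased with a gray-scale morphological algorithm. Experiments on checks currently in use in China show that the algorithm is effective.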
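The final step described above (keep protected regions, erase the rest with gray-scale morphology) can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes dark ink on a light background, a horizontal frame line thinner than the structuring element height `k`, and a set of protected columns that has already been found. The names `vertical_closing`, `remove_line`, and `protected_cols` are hypothetical.

```python
def vertical_closing(img, k):
    """Gray-scale closing (dilation, then erosion) with a vertical window of height k."""
    h, w = len(img), len(img[0])
    r = k // 2

    def column_filter(src, op):
        # Apply op (max or min) over a vertical window centred on each pixel;
        # the window is truncated at the top and bottom image borders.
        out = [row[:] for row in src]
        for y in range(h):
            lo, hi = max(0, y - r), min(h, y + r + 1)
            for x in range(w):
                out[y][x] = op(src[i][x] for i in range(lo, hi))
        return out

    dilated = column_filter(img, max)   # thin dark horizontal structures vanish
    return column_filter(dilated, min)  # taller dark structures are restored


def remove_line(img, protected_cols, k):
    """Erase a thin dark horizontal line, leaving protected columns untouched.

    protected_cols is assumed to come from an earlier cross-point analysis.
    """
    closed = vertical_closing(img, k)
    out = [row[:] for row in img]
    for y in range(len(img)):
        for x in range(len(img[0])):
            if x not in protected_cols:
                out[y][x] = closed[y][x]
    return out


# Toy image: light background, a frame line on row 3, a stroke crossing it in column 2.
img = [[255] * 5 for _ in range(7)]
img[3] = [40] * 5
for y in range(7):
    img[y][2] = 30

cleaned = remove_line(img, protected_cols={2}, k=3)
```

In this toy case the frame line pixels outside column 2 return to the background value, while the stroke in the protected column keeps its original gray levels.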

       

      Abstract: Preprocessing is an important step in a document image analysis (DIA) system. In practical document images, characters often overlap the preprinted form frames, which creates serious problems for recognition engines. Most form frame line removal algorithms are based on bi-level images, which have lost much useful information during binarization. This paper proposes a line removal algorithm that works directly on gray-level images. First, the cross-points of characters and lines are detected with the Sobel gradient. Then the overlapping types of characters and lines are reduced to either a touch type or a crossover type by cross-point analysis. Finally, the lines are removed with a gray-scale morphological method. Experimental results on 1225 real-life character string images demonstrate the effectiveness of the algorithm: the recognition rate improves from 75.9% to 91.4%.
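As a rough illustration of the first step, the sketch below computes Sobel gradients in pure Python and lists the columns on a given row where the horizontal gradient is strong, which is where the edges of a stroke approaching a frame line would show up. The abstract does not give the paper's actual detection rule. The threshold, the choice of row, and the names `sobel` and `candidate_columns` are assumptions made for this example.

```python
def sobel(img):
    """Return (gx, gy) Sobel responses; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            top = img[y - 1][x - 1:x + 2]
            mid = img[y][x - 1:x + 2]
            bot = img[y + 1][x - 1:x + 2]
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx[y][x] = (top[2] + 2 * mid[2] + bot[2]) - (top[0] + 2 * mid[0] + bot[0])
            # Vertical gradient: bottom row minus top row, weights 1-2-1.
            gy[y][x] = (bot[0] + 2 * bot[1] + bot[2]) - (top[0] + 2 * top[1] + top[2])
    return gx, gy


def candidate_columns(img, row, threshold):
    """Columns on `row` whose horizontal gradient magnitude reaches `threshold`."""
    gx, _ = sobel(img)
    return [x for x in range(len(img[0])) if abs(gx[row][x]) >= threshold]


# Toy image: light background, frame line on row 4, stroke in column 3.
img = [[255] * 7 for _ in range(7)]
img[4] = [0] * 7
for y in range(7):
    img[y][3] = 0

gx, gy = sobel(img)
edges = candidate_columns(img, row=2, threshold=400)
```

Here `edges` holds the two columns flanking the stroke, and `gy` responds strongly on the rows adjacent to the frame line, so the two gradient directions separate stroke edges from line edges in this simple case.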

       
