高级检索

    一种新的利用多帧结合检测视频标题文字的算法

    A Novel Video Caption Detection Approach Using Multi-Frame Integration

    • 摘要: 视频中的标题文字通常在视频信息索引和检索中起到重要作用.提出了一种新的视频标题文字的检测算法.首先采用一种新的多帧结合技术来降低图像背景的复杂度,它基于时间序列对多帧图像进行最小(或最大)像素值搜索,搜索的具体方式由Sobel边缘图来决定.然后以块为单位来进行文字与非文字的分类,即用一扫描窗口对图像进行扫描,以Sobel边缘为特征,判断其是否为文字.一个2级的金字塔被用来检测不同大小的文字.最后,提出一种新的迭代的文字区域分解方法,它能够更精确地定位文字区域的边界.实验结果表明,这种文字检测算法能够取得很高的精度和召回率.

       

      Abstract: Captions in videos often play an important role in video information indexing and retrieval. In this paper, a novel video caption detection approach is presented. This approach first applies a new multiple frames integration (MFI) method to reduce the complexity of the background of the image. A time-based minimum (or maximum) pixel value search is employed and a Sobel edge map is used to determine the mode of search. Then block-based text detection is performed, i. e. a small window is used to scan the image and classified as text or non-text, using Sobel edges as features. A two-level pyramid is applied to detect various text sizes. Finally, the approach presents a new iterative text line decomposition method, and accurate text bounding boxes are extracted from the candidate text areas. Experimental results show that the proposed approach achieves a high precision and recall.

       

    /

    返回文章
    返回