A Novel Video Caption Detection Approach Using Multi-Frame Integration

Wang Rongrong, Jin Wanjun, and Wu Lide

Journal of Computer Research and Development > 2005 > 42(7): 1191-1197.

Wang Rongrong, Jin Wanjun, and Wu Lide. A Novel Video Caption Detection Approach Using Multi-Frame Integration[J]. Journal of Computer Research and Development, 2005, 42(7): 1191-1197.

(Media Computing and Web Intelligence Laboratory, Department of Computer Science and Engineering, Fudan University, Shanghai 200433)

Published Date: July 14, 2005

Abstract

Captions in videos often play an important role in video indexing and retrieval. This paper presents a novel video caption detection approach. The approach first applies a new multi-frame integration (MFI) method to reduce the complexity of the image background: a time-based minimum (or maximum) pixel value search is performed, and a Sobel edge map determines which search mode is used. Block-based text detection is then performed, i.e., a small window scans the image and each window position is classified as text or non-text using Sobel edges as features. A two-level pyramid is applied to detect text of various sizes. Finally, a new iterative text line decomposition method extracts accurate text bounding boxes from the candidate text areas. Experimental results show that the proposed approach achieves high precision and recall.
- caption detection
- video
- multi-frame integration
- Sobel edge
- iterative text region decomposition
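
The multi-frame integration step described in the abstract can be illustrated with a minimal sketch. It assumes grayscale frames stored as equally sized nested lists and a caption that stays fixed while the background changes. The function names `integrate_frames` and `sobel_magnitude` are hypothetical, and the `mode` argument is passed in by the caller because the abstract does not state the exact rule by which the Sobel edge map selects the minimum or maximum search.

```python
def integrate_frames(frames, mode="min"):
    """Per-pixel minimum (or maximum) over time for equally sized grayscale frames.

    Assumption: 'min' suits bright captions (the varying background is pushed
    toward dark values) and 'max' suits dark captions.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    pick = min if mode == "min" else max
    height, width = len(frames[0]), len(frames[0][0])
    return [[pick(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]


def sobel_magnitude(img):
    """Sobel gradient magnitude (|gx| + |gy|) for interior pixels; borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, weights 1-2-1.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


# Three toy frames: the center pixel (the "caption") is always 255,
# while the surrounding background changes from frame to frame.
frames = [
    [[10, 200, 30], [40, 255, 60], [70, 80, 90]],
    [[100, 20, 130], [140, 255, 16], [17, 180, 19]],
    [[50, 60, 70], [80, 255, 90], [10, 20, 30]],
]
integrated = integrate_frames(frames, mode="min")
edges = sobel_magnitude(integrated)
```

In this toy input the bright center pixel survives the minimum search unchanged while every background pixel drops to its darkest observed value, which is the background simplification the abstract refers to. The resulting edge map could then feed a block-based classifier, though the paper's actual features and classifier are not specified here.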
