Gong Shu, Qu Youli, and Tian Shengfeng. Supervised Learning of an Automatic Noisy Semantic Unit Filter for Multi-Document Summarization[J]. Journal of Computer Research and Development, 2013, 50(4): 873-882.

Supervised Learning of an Automatic Noisy Semantic Unit Filter for Multi-Document Summarization

  • Published Date: April 14, 2013
  • Multi-document summarization takes as input a document set that contains a great deal of noise. Most state-of-the-art summarization systems use a fixed-threshold noise filter, with a manually selected threshold, to filter out low-frequency units. Experiments show, however, that the best threshold varies with the document set, the summarization algorithm, and the text representation. A fixed-threshold noise filter is therefore not robust across summarization settings, and its filtering efficiency is unstable. To address this, a supervised learning method for generating an automatic noise filter is proposed. Using labels extracted automatically from human-written summaries and a set of selected features that apply to different types of semantic units, a semantic unit classifier is trained to serve as the automatic noise filter. The filter can be applied to the different types of semantic units produced by different text representation methods, and it automatically filters out noisy semantic units at the preprocessing stage of a multi-document summarization system. Experiments show that the automatic noise filter generated by the supervised learning method is robust across document sets, summarization algorithms, and text representations, and that noise filtering improves both the speed and the summary quality of each summarization algorithm.
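
The idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's actual method: semantic units are taken to be single words, the two features (relative document frequency and relative term frequency) are hypothetical stand-ins for the paper's selected features, and a small hand-written logistic regression stands in for whatever classifier the authors trained. Labels are derived automatically by checking whether a unit appears in a reference summary, as the abstract describes.

```python
import math

def unit_features(unit, docs):
    """Hypothetical features for one semantic unit (here: a single word)."""
    n_docs = len(docs)
    total_tokens = sum(len(d) for d in docs)
    df = sum(1 for d in docs if unit in d)      # documents containing the unit
    tf = sum(d.count(unit) for d in docs)       # occurrences across the set
    return [1.0, df / n_docs, tf / total_tokens]  # bias, relative DF, relative TF

def score(w, x):
    """Linear score; a positive value means 'keep the unit'."""
    return sum(wj * xj for wj, xj in zip(w, x))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Plain stochastic gradient ascent on the logistic log-likelihood."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-score(w, xi)))
            w = [wj + lr * (yi - p) * xj for wj, xj in zip(w, xi)]
    return w

# Toy document set (already tokenized) and a toy human-written summary.
docs = [
    ["flood", "hits", "city", "river", "flood"],
    ["flood", "damage", "city", "rescue"],
    ["city", "flood", "rescue", "weather"],
]
reference_summary = {"flood", "city", "rescue"}

# Labels come automatically from the reference summary: 1 = informative unit.
units = sorted({tok for d in docs for tok in d})
X = [unit_features(u, docs) for u in units]
y = [1 if u in reference_summary else 0 for u in units]

w = train_logistic(X, y)

# The trained classifier acts as the noise filter in preprocessing.
kept = {u for u, x in zip(units, X) if score(w, x) > 0.0}
print(sorted(kept))
```

Unlike a fixed frequency cutoff, the decision boundary here is learned from the labeled units, so in principle the same training procedure could be rerun for a different document set or text representation without hand-tuning a threshold.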
