Shi Jing and Dai Guozhong. Text Segmentation Based on PLSA Model[J]. Journal of Computer Research and Development, 2007, 44(2): 242-248.

Text Segmentation Based on PLSA Model

More Information
  • Published Date: February 14, 2007
  • Text segmentation is important in many fields, including information retrieval, summarization, language modeling, and anaphora resolution. Text segmentation based on probabilistic latent semantic analysis (PLSA) associates latent topics with observed word-sentence pairs. In the experiments, whole Chinese sentences are taken as the elementary blocks. A variety of similarity metrics and several approaches to boundary discovery are tried, and the influence on similarity values of unknown words repeated in adjacent sentences is considered. The best result is an error rate of 6.06%, which is far lower than that of other text segmentation algorithms.
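The pipeline outlined in the abstract (fit PLSA over word-sentence pairs, compare adjacent sentences, place boundaries where similarity drops) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `plsa`, `adjacent_similarity`, and `find_boundaries`, the choice of cosine similarity, the fixed threshold, and the toy count matrix are all assumptions made for illustration. The abstract does not say which similarity metrics or boundary strategies performed best, and the handling of repeated unknown words is not modeled here.

```python
import math
import random


def normalize(row):
    """Scale a list of non-negative numbers to sum to 1 (uniform if all zero)."""
    total = sum(row)
    if total > 0:
        return [x / total for x in row]
    return [1.0 / len(row)] * len(row)


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a sentence-by-word count matrix (list of lists).

    Returns (p_z_s, p_w_z): P(topic | sentence) and P(word | topic).
    """
    rng = random.Random(seed)
    n_sents, n_words = len(counts), len(counts[0])
    p_z_s = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_sents)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_zs = [[0.0] * n_topics for _ in range(n_sents)]
        new_wz = [[0.0] * n_words for _ in range(n_topics)]
        for s in range(n_sents):
            for w in range(n_words):
                if counts[s][w] == 0:
                    continue
                # E-step: P(z | s, w) is proportional to P(z | s) * P(w | z).
                post = normalize([p_z_s[s][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # M-step accumulation: expected counts per topic.
                for z in range(n_topics):
                    new_zs[s][z] += counts[s][w] * post[z]
                    new_wz[z][w] += counts[s][w] * post[z]
        p_z_s = [normalize(row) for row in new_zs]
        p_w_z = [normalize(row) for row in new_wz]
    return p_z_s, p_w_z


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def adjacent_similarity(p_z_s):
    """Similarity between the topic mixtures of each pair of adjacent sentences."""
    return [cosine(p_z_s[i], p_z_s[i + 1]) for i in range(len(p_z_s) - 1)]


def find_boundaries(sims, threshold=0.5):
    """Indices i where a boundary falls between sentence i and sentence i + 1.

    A fixed threshold is an illustrative assumption, not the paper's rule.
    """
    return [i for i, s in enumerate(sims) if s < threshold]


# Toy data (hypothetical): six sentences over a six-word vocabulary,
# where the first three and last three sentences share no words.
counts = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
]
p_z_s, p_w_z = plsa(counts, n_topics=2)
print(find_boundaries(adjacent_similarity(p_z_s)))
```

EM for PLSA converges only to a local optimum, so results depend on the random seed and the number of topics. In practice one would likely try several initializations and compare alternative similarity measures, as the abstract suggests the authors did.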
