Xi Xuefeng, Chu Xiaomin, Sun Qingying, Zhou Guodong. Corpus Construction for Chinese Discourse Topic via Micro-Topic Scheme[J]. Journal of Computer Research and Development, 2017, 54(8): 1833-1852. DOI: 10.7544/issn1000-1239.2017.20170348

Corpus Construction for Chinese Discourse Topic via Micro-Topic Scheme

  • Published Date: July 31, 2017
  • Abstract: Discourse topic structure analysis is a fundamental task in natural language understanding. The lack of large, high-quality discourse corpora suited to Chinese discourse analysis has seriously restricted research on computational models of discourse topic. To address this problem, we first study theoretical representation systems for Chinese discourse topic structure. Drawing on theme-rheme theory, Rhetorical Structure Theory for English, the Penn Discourse Treebank, and research on Chinese complex sentences and sentence groups, and taking the characteristics of Chinese into account, we propose a Chinese discourse micro-topic scheme based on theme-rheme theory and construct a topic-chain-based representation model of Chinese discourse topic structure. On this basis, we adopt a top-down, backward-search annotation strategy together with a human-machine collaborative annotation method to construct the Chinese Discourse Topic Corpus (CDTC). We then carry out a detailed statistical analysis of the CDTC, which contains 500 documents. Compared with the OntoNotes corpus and generalized topic structure theory, the micro-topic representation model has theoretical advantages and fits the characteristics of Chinese. Finally, a consistency test shows that the CDTC can fully reflect the difficulty of Chinese discourse topic analysis and can support related research.
