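• China's Outstanding Science and Technology Journal
  • CCF-recommended Class A Chinese journal
  • T1-class high-quality science and technology journal in the computing field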
Li Ronglu, Wang Jianhui, Chen Xiaoyun, Tao Xiaopeng, and Hu Yunfa. Using Maximum Entropy Model for Chinese Text Categorization[J]. Journal of Computer Research and Development, 2005, 42(1): 94-101.

Using Maximum Entropy Model for Chinese Text Categorization

  • Published Date: January 14, 2005
  • With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. The maximum entropy model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and flexible framework for combining diverse pieces of contextual information to estimate the probability of a given linguistic phenomenon. On many NLP tasks, this approach performs at a near state-of-the-art level, or outperforms competing probabilistic methods when trained and tested under similar conditions. However, relatively little work has been done on applying the maximum entropy model to text categorization, and no previous work has focused on using it to classify Chinese documents. In this paper, the maximum entropy model is applied to Chinese text categorization. Its categorization performance is compared and analyzed across different approaches to text feature generation, different numbers of features, and smoothing techniques. In the experiments it is also compared with Bayes, KNN, and SVM classifiers; its performance is higher than that of Bayes and comparable with that of KNN and SVM. It is a promising technique for text categorization.
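
As a rough illustration of the kind of classifier the abstract describes, a maximum entropy text categorizer can be written as a conditional log-linear model, p(c | d) proportional to exp(sum over f of w[c, f] * x_f(d)). The sketch below is not the paper's implementation: the character-bigram features, the Gaussian-prior (L2) smoothing, the plain batch gradient ascent, and the toy documents and function names are all assumptions chosen for brevity, since the abstract does not specify these details.

```python
import math
from collections import Counter, defaultdict


def char_bigrams(text):
    """Count character bigrams: a segmentation-free feature scheme (assumed here)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def predict_proba(weights, labels, feats):
    """Return p(c | d) = exp(sum_f w[c, f] * x_f) / Z(d) for every label c."""
    scores = {c: sum(weights[(c, f)] * v for f, v in feats.items()) for c in labels}
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def train(docs, epochs=200, lr=0.1, sigma2=10.0):
    """Fit weights by batch gradient ascent on the penalized log-likelihood.

    sigma2 is the variance of a Gaussian prior on the weights; it stands in
    for the smoothing mentioned in the abstract (an assumption).
    """
    labels = sorted({c for _, c in docs})
    data = [(char_bigrams(text), c) for text, c in docs]
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for c in labels:
                # Observed feature count minus expected count under the model.
                coeff = (1.0 if c == gold else 0.0) - probs[c]
                for f, v in feats.items():
                    grad[(c, f)] += coeff * v
        for key, g in grad.items():
            weights[key] += lr * (g - weights[key] / sigma2)
    return weights, labels


def classify(weights, labels, text):
    """Return the most probable label for a piece of text."""
    probs = predict_proba(weights, labels, char_bigrams(text))
    return max(probs, key=probs.get)


# Hypothetical toy corpus, for illustration only.
TOY_DOCS = [
    ("股票市场今日上涨", "finance"),
    ("银行下调贷款利率", "finance"),
    ("球队赢得比赛冠军", "sports"),
    ("运动员打破世界纪录", "sports"),
]

weights, labels = train(TOY_DOCS)
print(classify(weights, labels, "股票市场利率"))
print(classify(weights, labels, "球队比赛纪录"))
```

Character bigrams are used here only because they avoid the need for a word segmenter; a word-based feature set would plug into the same `train` and `classify` functions by swapping out `char_bigrams`.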