Zhang Wenjun, Jiang Liangxiao, Zhang Huan, Chen Long. A Two-Layer Bayes Model: Random Forest Naive Bayes[J]. Journal of Computer Research and Development, 2021, 58(9): 2040-2051. DOI: 10.7544/issn1000-1239.2021.20200521

A Two-Layer Bayes Model: Random Forest Naive Bayes

Funds: The work was supported by the Joint Fund Key Projects of the National Natural Science Foundation of China (U1711267) and the Fundamental Research Funds for the Central Universities (CUGGC03).
  • Published Date: August 31, 2021
  • Text classification is an essential task in natural language processing, and the high dimensionality and sparsity of text data make it challenging. Naive Bayes (NB) is widely used for text classification because it is simple, efficient, and easy to interpret, but its attribute conditional independence assumption rarely holds in real-world text data, which degrades its classification performance. To weaken this assumption, researchers have proposed a variety of improvements, mainly structure extension, instance selection, instance weighting, feature selection, and feature weighting. However, all of these approaches build NB classification models on independent term features, which limits their classification performance to some extent. In this paper, we improve the naive Bayes text classification model through feature learning and propose a two-layer Bayes model called random forest naive Bayes (RFNB). In the first layer, a random forest (RF) learns high-level features of term combinations from the original term features. The learned features are then one-hot encoded and passed to the second layer, which builds a Bernoulli naive Bayes model on them. Experimental results on a large number of widely used text datasets show that RFNB significantly outperforms existing state-of-the-art naive Bayes text classification models and other classical text classification models.
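
The two-layer idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, treats the leaf that each tree assigns to a document as the learned term-combination feature, and uses a made-up toy corpus and arbitrary hyperparameters (`n_estimators=10`, `max_depth=3`). The helper name `predict` is hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import OneHotEncoder

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "limited offer buy now",
    "meeting agenda for monday",
    "project meeting notes",
    "agenda and notes for project",
]
labels = [1, 1, 1, 0, 0, 0]

# Original term features: binary term presence per document.
vectorizer = CountVectorizer(binary=True)
X_terms = vectorizer.fit_transform(docs)

# Layer 1 (assumed realization): each tree routes a document to one leaf.
# A leaf corresponds to a conjunction of term tests, i.e. a term combination.
forest = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
forest.fit(X_terms, labels)
leaves = forest.apply(X_terms)  # shape (n_docs, n_trees): one leaf index per tree

# One-hot encode leaf indices so every (tree, leaf) pair becomes a binary feature.
encoder = OneHotEncoder(handle_unknown="ignore")
X_leaves = encoder.fit_transform(leaves)

# Layer 2: Bernoulli naive Bayes trained on the binary leaf features.
nb = BernoulliNB()
nb.fit(X_leaves, labels)


def predict(texts):
    """Classify raw texts by passing them through both layers."""
    x_terms = vectorizer.transform(texts)
    x_leaves = encoder.transform(forest.apply(x_terms))
    return nb.predict(x_leaves)


print(predict(["buy cheap pills", "notes for the meeting"]))
```

Training both layers on the same documents keeps the sketch short; whether the paper fits the two layers on the same or on separate data is not stated in the abstract.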
