Zhang Wenjun, Jiang Liangxiao, Zhang Huan, Chen Long. A Two-Layer Bayes Model: Random Forest Naive Bayes[J]. Journal of Computer Research and Development, 2021, 58(9): 2040-2051. DOI: 10.7544/issn1000-1239.2021.20200521

A Two-Layer Bayes Model: Random Forest Naive Bayes

Funds: The work was supported by the Joint Fund Key Projects of the National Natural Science Foundation of China (U1711267) and the Fundamental Research Funds for the Central Universities (CUGGC03).
  • Published Date: August 31, 2021
  • Text classification is an essential task in natural language processing, and the high dimensionality and sparsity of text data make it challenging. Naive Bayes (NB) is widely used for text classification because it is simple, efficient, and easy to interpret, but its attribute conditional independence assumption rarely holds in real-world text data, which hurts its classification performance. To weaken this assumption, researchers have proposed a variety of improved approaches, mainly structure extension, instance selection, instance weighting, feature selection, and feature weighting. However, all of these approaches build NB classification models on independent term features, which limits their classification performance to some extent. In this paper, we improve the naive Bayes text classification model through feature learning and propose a two-layer Bayes model called random forest naive Bayes (RFNB). In the first layer, a random forest (RF) learns high-level features of term combinations from the original term features. The learned features are then fed into the second layer, where they are one-hot encoded and used to construct a Bernoulli naive Bayes model. Experimental results on a large number of widely used text datasets show that RFNB significantly outperforms existing state-of-the-art naive Bayes text classification models and other classical text classification models.
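
The two-layer idea in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say which quantity each tree contributes as a learned feature, so the sketch assumes it is the index of the leaf a document reaches (the path to a leaf being one combination of terms). The simplified forest (bootstrap samples, random feature subsets, Gini splits), the toy corpus, and all function names such as `fit_rfnb` are hypothetical.

```python
import math
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(rows, labels, depth, rng, leaves):
    """Grow a depth-limited tree on 0/1 term features; leaves get consecutive ids."""
    n_feat = len(rows[0])
    best = None
    if depth > 0 and len(set(labels)) > 1:
        # Consider only a random subset of terms at each node.
        for f in rng.sample(range(n_feat), max(1, int(math.sqrt(n_feat)))):
            left = [y for x, y in zip(rows, labels) if x[f] == 0]
            right = [y for x, y in zip(rows, labels) if x[f] == 1]
            if left and right:
                score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
                if best is None or score < best[0]:
                    best = (score, f)
    if best is None:  # no useful split: create a leaf
        leaves.append(len(leaves))
        return ("leaf", leaves[-1])
    f = best[1]
    kids = []
    for v in (0, 1):
        part = [(x, y) for x, y in zip(rows, labels) if x[f] == v]
        kids.append(grow_tree([x for x, _ in part], [y for _, y in part],
                              depth - 1, rng, leaves))
    return ("split", f, kids[0], kids[1])

def leaf_of(tree, x):
    """Route document x down the tree and return the id of the leaf it reaches."""
    while tree[0] == "split":
        tree = tree[2 + x[tree[1]]]  # child 0 if the term is absent, child 1 if present
    return tree[1]

def fit_forest(X, y, n_trees, depth, rng):
    """Layer 1: train each tree on a bootstrap sample; keep (tree, leaf count)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        leaves = []
        tree = grow_tree([X[i] for i in idx], [y[i] for i in idx], depth, rng, leaves)
        forest.append((tree, len(leaves)))
    return forest

def encode(forest, x):
    """One-hot encode the leaf reached in every tree into one binary vector."""
    vec = []
    for tree, n_leaves in forest:
        hot = [0] * n_leaves
        hot[leaf_of(tree, x)] = 1
        vec.extend(hot)
    return vec

def fit_bernoulli_nb(Z, y):
    """Layer 2: Bernoulli NB with Laplace smoothing over the encoded features."""
    model = {}
    for c in sorted(set(y)):
        rows = [z for z, label in zip(Z, y) if label == c]
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[c] = (math.log(len(rows) / len(y)), probs)
    return model

def predict_bernoulli_nb(model, z):
    """Return the class with the highest log prior plus log likelihood."""
    def score(c):
        prior, probs = model[c]
        return prior + sum(math.log(p if v else 1 - p) for v, p in zip(z, probs))
    return max(model, key=score)

def fit_rfnb(X, y, n_trees=15, depth=3, seed=0):
    rng = random.Random(seed)
    forest = fit_forest(X, y, n_trees, depth, rng)
    return forest, fit_bernoulli_nb([encode(forest, x) for x in X], y)

def predict_rfnb(model, x):
    forest, nb = model
    return predict_bernoulli_nb(nb, encode(forest, x))

# Toy corpus (made up): columns are presence of "goal", "match", "vote", "election".
docs = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0],
        [0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 1, 1]]
labels = ["sport"] * 4 + ["politics"] * 4
model = fit_rfnb(docs, labels)
print(predict_rfnb(model, [1, 1, 0, 0]))
```

Because every tree sends a document to exactly one leaf, the encoded vector contains exactly one 1 per tree. Each of those binary features stands for a conjunction of term tests rather than a single term, which is how the second layer gets to work with term combinations while still using an ordinary Bernoulli naive Bayes model.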