A Bayesian Classification Algorithm Based on Selective Patterns

Ju Zhuoya, Wang Zhihai

Citation: Ju Zhuoya, Wang Zhihai. A Bayesian Classification Algorithm Based on Selective Patterns[J]. Journal of Computer Research and Development, 2020, 57(8): 1605-1616. DOI: 10.7544/issn1000-1239.2020.20200196. CSTR: 32373.14.issn1000-1239.2020.20200196

  • Chinese Library Classification (CLC) number: TP181

Funds: This work was supported by the National Natural Science Foundation of China (61672086) and the Beijing Natural Science Foundation (4182052).
  • Abstract: Data mining studies how to discover knowledge from data in very large databases, and classification is one of its important research topics. In classification research, the naive Bayes classifier is a simple but effective learning technique that has been widely used. It assumes that, given the class label, the attributes are conditionally independent of one another. In practice, however, attributes often depend on one another. Patterns made up of "attribute-value" pairs play a key role in classification, and many researchers build classifiers from such patterns. The dependencies between the attributes contained in a pattern and the remaining attributes can significantly affect classification results. Based on a close study of these dependencies, a Bayesian classification algorithm based on selective patterns is proposed. It exploits the strong classification ability of Bayesian network classifiers and further weakens the conditional independence assumption by analyzing the dependencies among the attributes in the patterns. Experiments show that classification accuracy benefits from taking the characteristics of each dataset into account, mining patterns with high discriminative power, and modeling the dependencies between attributes appropriately. Compared with the benchmark algorithms NB and AODE, the proposed algorithm improves average accuracy on 10 datasets by 1.65% and 4.29%, respectively.
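The conditional independence assumption mentioned in the abstract can be illustrated with a minimal sketch of the NB baseline only. This is not the selective-pattern algorithm proposed in the paper, whose details are not given on this page. The class name `NaiveBayes`, the Laplace (add-one) smoothing, and the toy weather-style data are all assumptions made for illustration. Each attribute contributes an independent factor P(value | class), which is exactly the restriction the paper sets out to weaken.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Categorical naive Bayes with Laplace (add-one) smoothing (illustrative)."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        # pair_counts[(i, v, c)]: rows with attribute i equal to v and class c
        self.pair_counts = Counter()
        # values[i]: distinct values observed for attribute i
        self.values = defaultdict(set)
        for row, c in zip(rows, labels):
            for i, v in enumerate(row):
                self.pair_counts[(i, v, c)] += 1
                self.values[i].add(v)
        return self

    def log_scores(self, row):
        """Return log P(c) + sum_i log P(x_i | c) for every class c."""
        k = len(self.class_counts)
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log((n_c + 1) / (self.n + k))  # smoothed prior
            for i, v in enumerate(row):
                # Independence assumption: one factor per attribute,
                # conditioned on the class alone.
                count = self.pair_counts[(i, v, c)]
                score += math.log((count + 1) / (n_c + len(self.values[i])))
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Hypothetical toy data: (outlook, temperature) -> play
rows = [
    ("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
    ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot"),
]
labels = ["no", "no", "yes", "yes", "yes", "no"]

model = NaiveBayes().fit(rows, labels)
print(model.predict(("rainy", "cool")))  # yes
```

Because the score is a sum of per-attribute terms, any interaction between `outlook` and `temperature` is ignored. Methods such as AODE, and pattern-based approaches like the one the abstract describes, relax this by conditioning some attributes on others as well as on the class.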

Publication history
  • Publication date: 2020-07-31
