Cross-Media Clustering by Share and Private Information Maximization

Yan Xiaoqiang, Ye Yangdong

Citation: Yan Xiaoqiang, Ye Yangdong. Cross-Media Clustering by Share and Private Information Maximization[J]. Journal of Computer Research and Development, 2019, 56(7): 1370-1382. DOI: 10.7544/issn1000-1239.2019.20180470. CSTR: 32373.14.issn1000-1239.2019.20180470

Funding: National Key Research and Development Program of China (2018YFB1201403); National Natural Science Foundation of China (61772475, 61502434)
  • CLC number: TP181

  • Abstract: In recent years, the rapid emergence of cross-media data, which is typically multi-source and heterogeneous, has posed great challenges for data analysis. However, most existing cross-media analysis methods rely only on the information shared between modalities to uncover the pattern structure in the data, and ignore the important private information of each modality. To address this problem, this paper proposes the share and private information maximization (SPIM) algorithm for cross-media clustering, which takes both the shared and the private information of cross-media data into account in order to obtain more reasonable clustering patterns. First, two models for constructing the shared information are presented: 1) The hybrid words (H-words) model transforms the low-level features of each modality into a unified term-frequency representation (co-occurrence vectors of words or visual words), and then uses a novel agglomerative information maximization method to build the multimodal hybrid word space bottom-up, preserving as much as possible of the statistical similarity between the low-level features of the modalities. 2) The clustering ensemble (CE) model builds a clustering partition for each modality and uses mutual information to measure the information shared between the partitions of different modalities, thereby extracting the correlation between the high-level clustering partitions. Second, an information-theoretic objective function is proposed that integrates the shared information of multiple modalities and the private information of individual modalities into a single objective, so that both are considered while the clustering structure is extracted. Finally, the objective function of SPIM is optimized by a sequential "draw-and-merge" procedure, which guarantees convergence to a local optimum. Experimental results on 6 cross-media datasets show that SPIM compares favorably with existing state-of-the-art cross-media clustering methods.
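
The CE model described in the abstract measures the information between the clustering partitions of different modalities with mutual information. The following is a minimal sketch of that measurement only, assuming each partition is given as a list of cluster labels over the same objects. The function name `partition_mutual_information` and the toy label lists are hypothetical illustrations, not taken from the paper, and the paper's exact formulation may differ.

```python
import math
from collections import Counter


def partition_mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two clusterings of the same objects.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n == 0:
        return 0.0

    # Marginal and joint cluster-size counts (a contingency table).
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy partitions of four objects (assumed data, for illustration only).
image_partition = [0, 0, 1, 1]
text_partition = [1, 1, 0, 0]        # same grouping, different label names
unrelated_partition = [0, 1, 0, 1]   # statistically independent grouping

mi_same = partition_mutual_information(image_partition, text_partition)
mi_indep = partition_mutual_information(image_partition, unrelated_partition)
```

In this toy case, two partitions that group the objects identically (up to label names) reach the maximum value, the entropy of the partition, while an independent partition gives zero. This is why mutual information can serve as a label-invariant agreement score between modalities.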
Publication history
  • Published: 2019-06-30
