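• Outstanding Science and Technology Journal of China
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality science and technology journal in the computing field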
Li Haobo, Li Mohan, Chen Peng, Sun Yanbin, Tian Zhihong. A Corruption-resistant Data Identification Technology Based on Dataset Honeypoint[J]. Journal of Computer Research and Development, 2024, 61(10): 2417-2432. DOI: 10.7544/issn1000-1239.202440496

A Corruption-resistant Data Identification Technology Based on Dataset Honeypoint

Funds: This work was supported by the National Key Research and Development Program of China (2021YFB3101704), the National Natural Science Foundation of China (62372126, 62272119, 62072130, U20B2046), the Guangdong Basic and Applied Basic Research Foundation (2023A1515030142), the Guangzhou Basic and Applied Basic Research Foundation (2024A04J9969), the Guangzhou University Project (YJ2023047), the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019), the Guangdong Higher Education Innovation Group Project (2020KCXTD007), and the Strategic Research and Consulting Project of the Chinese Academy of Engineering (2023-JB-13).
  • Author Bio:

    Li Haobo: born in 2000. Master's candidate. Student member of CCF. His main research interests include data security and deception defense.

    Li Mohan: born in 1987. PhD, professor. Senior member of CCF. Her main research interests include AI security, data governance, and intrusion detection.

    Chen Peng: born in 1988. PhD, postdoc. Member of CCF. His main research interests include explainable AI, shape manifold analysis, and data governance.

    Sun Yanbin: born in 1987. PhD, professor. Senior member of CCF. His main research interests include network security, industrial control security, future networks, and data governance.

    Tian Zhihong: born in 1978. PhD, professor, PhD supervisor. Distinguished member of CCF. His main research interests include network attack and defense confrontation, APT detection and traceability, industrial control security, and data governance.

  • Received Date: June 04, 2024
  • Revised Date: July 15, 2024
  • Available Online: September 13, 2024
  • Data identification is a prerequisite for precise data governance and helps secure data elements during cross-domain transfer. Existing methods generate identifiers for individual data items, but as data volumes grow, these item-level identifiers cannot be applied directly at the dataset level, where identifiers become both “easily damaged” and “difficult to embed”. To address these problems, we adopt the network honeypoint design concept from the “guardian” model proposed by Academician Fang Binxing. Drawing on the idea of deception defense, we propose a corruption-resistant data identification technology based on dataset honeypoints for cross-domain data transfer scenarios, and design a complete method for generating and embedding dataset honeypoints. First, we design dataset honeypoints for cross-domain data transfer scenarios; making the honeypoints more concealed and more redundant addresses the “easily damaged” problem. Second, making the form of a dataset honeypoint indistinguishable from real data resolves the “difficult to embed” problem. Finally, experiments on two data modalities, images and encrypted text, show that dataset honeypoints offer strong corruption resistance, high robustness, and low performance overhead.
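
    The abstract does not give the generation or embedding algorithm, so the following is only a minimal sketch of the general idea (redundant, data-like marker records mixed into a dataset), not the authors' method. It assumes fixed-length encrypted-text records, derives marker records with HMAC-SHA256 from a secret key so that they resemble ciphertext, and treats a dataset as identified when enough markers survive. The names `make_honeypoints`, `embed`, `identify`, and all parameters are hypothetical.

```python
import hashlib
import hmac
import random
from typing import List, Sequence, Tuple


def make_honeypoints(key: bytes, dataset_id: str, count: int, size: int) -> List[bytes]:
    """Derive `count` pseudo-random records of `size` bytes from a secret key.

    Assumption: real records are ciphertext-like byte strings, so keyed
    pseudo-random bytes have the same form as real data.
    """
    points = []
    for i in range(count):
        out = b""
        block = 0
        # Concatenate HMAC blocks until the record is long enough.
        while len(out) < size:
            msg = f"{dataset_id}:{i}:{block}".encode()
            out += hmac.new(key, msg, hashlib.sha256).digest()
            block += 1
        points.append(out[:size])
    return points


def embed(records: Sequence[bytes], honeypoints: Sequence[bytes], seed: int) -> List[bytes]:
    """Mix honeypoints into the dataset at shuffled positions (concealment)."""
    mixed = list(records) + list(honeypoints)
    random.Random(seed).shuffle(mixed)
    return mixed


def identify(records: Sequence[bytes], key: bytes, dataset_id: str,
             count: int, size: int, threshold: int) -> Tuple[bool, int]:
    """Re-derive the honeypoints and count how many survive in `records`.

    Redundancy means identification can still succeed after part of the
    dataset has been deleted, as long as `threshold` honeypoints remain.
    """
    expected = set(make_honeypoints(key, dataset_id, count, size))
    found = len(expected & set(records))
    return found >= threshold, found
```

    In this sketch, the owner calls `embed` before the dataset leaves its domain and later calls `identify` with the same key and dataset identifier. Raising `count` relative to `threshold` trades a larger dataset for tolerance to more record loss.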
