Fu Hao, Long Chun, Gong Liangyi, Wei Jinxia, Huang Pan, Lin Yanzhong, Sun Degang. Malicious Domain Detection Technology Based on Semantic Graph Learning[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440375

Malicious Domain Detection Technology Based on Semantic Graph Learning

Funds: This work was supported by the Cyber Security and Informatization Project of Chinese Academy of Sciences (CAS-WX2022GC-04) and the Youth Innovation Promotion Association, Chinese Academy of Sciences (2022170).
  • Author Bio:

    Fu Hao: born in 1999. PhD candidate. His main research interests include malicious domain name detection, network traffic analysis, and machine learning

    Long Chun: born in 1979. PhD, senior engineer, PhD supervisor. Member of CCF. His main research interests include artificial intelligence-based detection of unknown network attacks, malicious domain name detection, and network traffic analysis

    Gong Liangyi: born in 1987. PhD, senior engineer, master supervisor. Member of CCF. His main research interests include network attack detection, malicious domain name detection, Web attack analysis, and machine learning

    Wei Jinxia: born in 1987. PhD, senior engineer, master supervisor. Her main research interests include artificial intelligence-based detection of unknown network attacks, malicious domain name detection, and network traffic analysis

    Huang Pan: born in 2000. Bachelor, engineer. His main research interests include Web attack detection, penetration testing, and malicious domain name analysis

    Lin Yanzhong: born in 1973. Master, vice president of Coremail Technology. His main research interests are email scaling and anti-phishing emails

    Sun Degang: born in 1970. PhD, senior engineer, PhD supervisor. His main research interests include communication security, network architecture, network and system security

  • Received Date: May 30, 2024
  • Accepted Date: March 02, 2025
  • Available Online: March 02, 2025
  • Abstract: Malicious domain name detection is a critical component of network intrusion detection systems, because it allows network attacks to be identified quickly from domain name requests. Machine learning methods overcome the limitations of blacklist mechanisms and improve detection accuracy. However, the high variability of domain name structures and the complexity of real-world environments lead to low detection efficiency and poor robustness in practice. To address these issues, a malicious domain name detection technology based on domain name semantic graph learning is proposed, which uses semantic graph association analysis for efficient detection. Specifically, 12 months of domain request data are first collected from the China Science and Technology Network, comprising 3.33 billion access records and more than 6.5 million malicious domain name entries across 284 attack types. Semantic analysis shows that the domain categories are clearly differentiated overall, but that their features overlap considerably in certain regions, which degrades classifier performance. To tackle this, a domain association graph model based on character-level semantic similarity is proposed. By integrating the features of neighboring domains, the model enhances semantic representations in the overlapping regions and thereby improves detection performance. The method filters noise characters through structural similarity analysis, constructs a dynamic domain semantic graph with an online aggregation algorithm, and trains a multi-head attention-based message-passing graph model with node-degree-weighted samples. Finally, a multi-layer neural network classifier performs the malicious domain detection. Experimental results show that the proposed method achieves an average precision of 96% and a recall of 97% on a dataset covering different types of malicious domain names. Furthermore, the model adapts well to online use while maintaining a high detection rate and robustness.
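
The idea of linking domains by character-level similarity and enriching each domain with its neighbors' features can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram Jaccard similarity, the 0.3 threshold, the plain mean aggregation (standing in for the multi-head attention message passing), and all function names are assumptions chosen for illustration only.

```python
from __future__ import annotations

from itertools import combinations


def char_ngrams(domain: str, n: int = 2) -> set[str]:
    """Return the set of character n-grams of a domain name (assumed similarity basis)."""
    return {domain[i:i + n] for i in range(len(domain) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(domains: list[str], threshold: float = 0.3) -> dict[str, set[str]]:
    """Link every pair of domains whose n-gram similarity reaches the threshold."""
    grams = {d: char_ngrams(d) for d in domains}
    graph: dict[str, set[str]] = {d: set() for d in domains}
    for u, v in combinations(domains, 2):
        if jaccard(grams[u], grams[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def aggregate(
    features: dict[str, list[float]], graph: dict[str, set[str]]
) -> dict[str, list[float]]:
    """Average each node's vector with the mean of its neighbors' vectors.

    Isolated nodes keep their own vector unchanged.
    """
    out: dict[str, list[float]] = {}
    for node, vec in features.items():
        nbrs = [features[m] for m in graph[node]]
        if not nbrs:
            out[node] = list(vec)
            continue
        mean = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        out[node] = [(x + m) / 2 for x, m in zip(vec, mean)]
    return out


# Hypothetical toy data: two look-alike domains and one unrelated domain.
domains = ["abcd.com", "abce.com", "xyz.org"]
graph = build_graph(domains)
features = {"abcd.com": [1.0, 0.0], "abce.com": [0.0, 1.0], "xyz.org": [4.0, 4.0]}
smoothed = aggregate(features, graph)
```

In this toy run the two look-alike domains become neighbors and their vectors are pulled toward each other, while the unrelated domain stays isolated and unchanged. A real system would replace the pairwise comparison, which is quadratic in the number of domains, with an incremental scheme such as the online aggregation the abstract mentions.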
