Detecting Malicious Domains Using Co-Occurrence Relation Between DNS Query
Abstract: Malicious domains play a vital role in illicit online activities, and detecting them effectively can significantly reduce the damage caused by attacks. In this paper, we propose CoDetector, a novel technique that detects malicious domains based on the co-occurrence relationships of domains in DNS (Domain Name System) queries. We observe that DNS queries are not isolated but co-occur with one another. The design rests on the intuition that domains that tend to co-occur in DNS traffic are strongly associated and are likely to share the same property (i.e., both malicious or both benign). We therefore first perform coarse-grained clustering of DNS traffic based on the chronological order of DNS queries, so that domains co-occurring with each other fall into the same cluster. We then design a mapping function that automatically projects every domain into a low-dimensional feature vector while preserving co-occurrence relationships: domains that co-occur are mapped to similar vectors, while domains that do not co-occur are mapped to distant vectors. Finally, using the learned feature representations, we train a classifier on a labeled dataset and apply it to detect unknown malicious domains. We evaluate CoDetector on real-world DNS traffic collected from an enterprise network over two months. The experimental results show that CoDetector detects malicious domains effectively, with 91.64% precision and 96.04% recall.
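The first step of the pipeline, grouping queries by chronological order so that co-occurring domains land in the same cluster, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the gap-based cut rule, the 5-second threshold, the function names `cut_queries` and `cooccurrence_counts`, and the sample domains are all assumptions made for illustration. The embedding and classifier steps are omitted.

```python
from collections import Counter
from itertools import combinations


def cut_queries(queries, max_gap=5.0):
    """Split (timestamp, domain) pairs into co-occurrence groups.

    Assumed rule (not taken from the paper): a new group starts whenever
    the gap between two consecutive queries exceeds max_gap seconds.
    """
    groups = []
    current = []
    last_ts = None
    for ts, domain in sorted(queries):
        if last_ts is not None and ts - last_ts > max_gap:
            groups.append(current)
            current = []
        current.append(domain)
        last_ts = ts
    if current:
        groups.append(current)
    return groups


def cooccurrence_counts(groups):
    """Count how many groups each unordered pair of domains shares."""
    counts = Counter()
    for group in groups:
        for a, b in combinations(sorted(set(group)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical query stream from a single host: (seconds, domain).
queries = [
    (0.0, "news.example"),
    (0.4, "cdn.example"),
    (1.1, "ads.example"),
    (60.0, "bad.example"),
    (60.3, "c2.example"),
    (120.0, "news.example"),
    (120.2, "cdn.example"),
]

groups = cut_queries(queries)
pairs = cooccurrence_counts(groups)
print(groups)
print(pairs.most_common(3))
```

Pair counts of this kind are one plausible input to an embedding step that places frequently co-occurring domains close together; how CoDetector actually builds and uses its co-occurrence data is described in the paper body, not here.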
Keywords:
- DNS queries
- co-occurrence
- malicious domains
- DNS cut
- tensor representation
- domain classification