Abstract:
Malicious domains play a vital role in illicit online activities. Effectively detecting the malicious domains can significantly decrease the damage of evil attacks. In this paper, we propose CoDetector, a novel technique to detect malicious domains based on the co-occurrence relationships of domains in DNS (domain name system) queries. We observe that DNS queries are not isolated, whereas co-occur with each other. We base it design on the intuition that domains that tend to co-occur in DNS traffic are strongly associated and are likely to be in the same property (i.e., malicious or benign). Therefore, we first perform coarse-grained clustering of DNS traffic based on the chronological order of DNS queries. The domains co-occurring with each other will be clustered. Then, we design a mapping function that automatically projects every domain into a low-dimensional feature vector while maintaining their co-occurrence relationships. Domains that co-occur with each others are mapped to similar vectors while domains that not co-occur are mapped to distant vectors. Finally, based on the learned feature representations, we train a classifier over a labeled dataset and further apply it to detect unknown malicious domains. We evaluate CoDetector using real-world DNS traffic collected from an enterprise network over two months. The experimental results show that CoDetector can effectively detect malicious domains (91.64% precision and 96.04% recall).