  • Publication date: 2021-08-01
Webpage Fingerprinting Identification on Tor: A Survey

Sun Xueliang1,2, Huang Anxin1,2, Luo Xiapu3, Xie Yi1,2   

  1. 1(School of Informatics, Xiamen University, Xiamen, Fujian 361005);2(Fujian Key Laboratory of Sensing and Computing for Smart City (Xiamen University), Xiamen, Fujian 361005);3(Department of Computing, The Hong Kong Polytechnic University, Hong Kong 999077)
  • Online: 2021-08-01
    This work was supported by the National Natural Science Foundation of China (61771017, 61671397, 61772438, 61972313) and the Hong Kong Innovation and Technology Fund Project (GHP/052/19SZ).

Abstract: With the rapid development of Web services, protecting the privacy of Web browsing has become a significant societal concern. Privacy-protection techniques such as anonymous communication networks help users hide their real access targets and browse the Internet anonymously. However, Webpage fingerprinting (WF) identification can still infer the Web page a user actually visits by monitoring network traffic and analyzing its features, thereby compromising anonymity. On the other hand, law enforcement agencies can use WF identification to supervise anonymous networks and prevent their abuse for carrying out illegal activities or covering up crimes. WF identification is therefore a noteworthy technique from the perspectives of both privacy protection and network supervision. In this survey, we first introduce the concept and development of WF identification, and then focus on two kinds of WF identification on Tor, a widely used anonymous network: single-tab oriented identification and multi-tab oriented identification. In particular, we analyze the working principles and performance characteristics of these methods and point out their limitations, such as oversimplified assumptions and the lack of systematic experimental evaluation. Finally, we summarize future research directions for WF identification.

Key words: Webpage fingerprinting identification, Tor anonymous communication, privacy preserving, traffic analysis, machine learning, network supervision