Abstract:
With the growing popularity of smart mobile devices, people increasingly use social network platforms (such as Twitter and Sina Weibo) to acquire information, post comments, and communicate. Although GPS devices can obtain location information accurately, many users do not share their location directly for privacy and security reasons. Inferring the geographic location of online users has therefore become an important topic in both academia and industry, and it underpins many downstream applications, such as location-based targeted advertising, event and location recommendation, early warning of natural disasters or diseases, and criminal tracking. We survey in detail the methods, data types, evaluation metrics, and fundamental algorithms for predicting the geographic location of social network users. First, we distinguish the different online user geolocation tasks and their corresponding evaluation protocols. Next, we assess the data structures and fusion methods used for individual geolocation tasks. We also analyze existing information extraction and feature selection approaches, together with their advantages and disadvantages. We then provide a taxonomy of existing user geolocation models and algorithms, followed by a thorough analysis of the methods in three categories: geographic dictionaries, traditional machine learning, and deep learning and graph neural networks. Finally, we summarize the difficulties and challenges in user location prediction and outline possible research trends and opportunities to shed light on future work in this field.