ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2019, Vol. 56 ›› Issue (9): 1843-1850. doi: 10.7544/issn1000-1239.2019.20180847

• Artificial Intelligence •

Positive and Unlabeled Generative Adversarial Network on POI Positioning

Tian Jiwei, Wang Jinsong, Shi Kai

  1. (School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384) (Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology (Tianjin University of Technology), Tianjin 300384) (National Engineering Laboratory for Computer Virus Prevention and Control Technology (Tianjin University of Technology), Tianjin 300457) (jiwei.tian@foxmail.com)
  • Published online: 2019-09-10
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61272450), the Key Program of the Natural Science Foundation of Tianjin (18JCZDJC30700), and the Science and Technology Project of Tianjin (17ZXHLSY00060).

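Abstract (Chinese version): With the rapid spread of smart mobile devices, people increasingly depend on location-based social network services. However, because data collection is expensive and existing collection techniques have shortcomings, point of interest (POI) positioning based on small-sample data mining has become a challenge. Although some research on POI positioning exists, existing methods cannot solve the problem of insufficient positive samples. This paper proposes a model based on PU learning and a generative adversarial network (positive and unlabeled generative adversarial network, puGAN). The model combines PU learning with a generative adversarial network to mine the hidden features of the data, generates pseudo-positive samples to compensate for the data shortage, and corrects the distribution of the unlabeled samples, thereby training an effective POI discriminator. Different models are compared in the experimental scenario by analyzing ROC curves and by examining how the training and testing errors change and relate to each other over the iterations. The results show that the puGAN model effectively addresses the shortage of data samples and thereby improves the accuracy of POI positioning.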

Key words (Chinese version): data mining, point of interest, positioning, PU, generative adversarial network
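The ROC-based comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' evaluation code, so the function names `roc_points` and `auc` and the toy scores below are assumptions made purely for illustration. The sketch sweeps a decision threshold over discriminator scores and integrates the resulting curve with the trapezoidal rule.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by sweeping a threshold over the scores.

    labels: 1 for a POI (positive) sample, 0 for a non-POI (negative) sample.
    Hypothetical helper; not taken from the paper.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Sort by descending score so that each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy scores and labels, invented for illustration only (not results from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
print(round(auc(roc_points(toy_scores, toy_labels)), 3))  # 0.889
```

A model whose curve lies closer to the top-left corner, and whose area is therefore closer to 1, separates POI from non-POI samples better. This is the sense in which ROC curves allow different classifiers to be compared.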

Abstract: With the rapid popularization of smart mobile devices, people rely more and more on location-based social networking services (LBSNS). Because data acquisition is costly, point of interest (POI) positioning based on small-sample data has become a major challenge. Recent research focuses on received signal strength (RSS) and simultaneous localization methods. Although there has been some research on POI positioning, the existing approaches do not address the problem of insufficient positive training samples. Based on genuine positive data and a large amount of unlabeled data, a novel approach called the positive and unlabeled generative adversarial network (puGAN) is proposed. Firstly, positive and unlabeled (PU) learning is combined with a generative adversarial network to effectively mine the hidden features of the data. Secondly, based on the hidden features, the positive data and the unlabeled data are calibrated and then used as the input of the discriminator. Finally, through the minimax game between the generator and the discriminator, a POI-discriminator model is obtained. The new method is evaluated by analyzing the ROC curve and the relationship between training error and testing error. The experimental results show that the proposed method can effectively solve the problem of insufficient positive samples and outperforms traditional POI positioning models, including the one-class classifier, SVM, and neural network.

Key words: data mining, point of interest (POI), positioning, positive and unlabeled, generative adversarial network (GAN)
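For readers unfamiliar with the minimax game between generator and discriminator mentioned in the abstract, the standard objective from the original GAN formulation is shown below. The abstract does not state the exact loss used by puGAN, so this is background only. Here $G$ is the generator, $D$ the discriminator, $p_{\mathrm{data}}$ the distribution of real samples, and $p_z$ the noise prior.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

The discriminator is trained to assign high values to real samples and low values to generated ones, while the generator is trained to produce samples that the discriminator cannot tell apart from real data.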

CLC number: