ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (11): 2442-2455. doi: 10.7544/issn1000-1239.2020.20190649

• Information Processing •



  • Published: 2020-11-01

Search of Students with Similar Lifestyle Based on Campus Behavior Information Network

Wang Xin’ao1, Duan Lei1, Cui Dingshan1, Lu Li1, Dun Yijie2, Qin Ruiqi1   

  • 1. School of Computer Science, Sichuan University, Chengdu 610065; 2. School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730030
  • Online: 2020-11-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61972268, 61572332), the National Key Research and Development Program of China (2018YFB0704301), the Key Research and Development Program of Sichuan Province (2019YFG0491), and the Sichuan Higher Education Talent Training Quality and Teaching Reform Project (JG2018-92).

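Abstract (translated from Chinese): Using new-generation information technologies such as big data analytics and deep learning to understand students' interests, hobbies, and living habits, and thereby improve the quality of talent cultivation, has become an important research problem. Finding students with similar lifestyles supports early warning of both psychological and academic problems. Existing algorithms for searching students with similar lifestyles cannot explain why students are similar, and cannot be extended to integrate additional data sources. To address this, we propose SCALE (similar campus lifestyle miner), an algorithm for searching students with similar lifestyles based on a campus behavior information network. SCALE computes similarity through meta-paths with constraints. It preserves the similarity semantics of the original data and can, on that basis, be extended to integrate further data sources. The components of the algorithm are further decoupled, and a parallelization strategy is designed for SCALE to improve execution efficiency. Experiments on datasets from a real campus environment verify the effectiveness and execution efficiency of SCALE.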


Abstract: It is important to keep track of both the psychological and academic status of students on campus. Student data covers a wide range of aspects, such as interests, hobbies, and lifestyles, and many campuses can collect it through smart devices such as student e-cards. With the rapid development of new-generation information technology, researchers have in recent years explored ways to improve the quality of talent cultivation using student data, for example by applying big data analysis to discover subtle but meaningful information that can guide student management. Within this line of research, searching for students with similar lifestyles can improve student management: it can surface useful information and support early warnings when something unusual is found. Existing algorithms for searching students with similar lifestyles have two deficiencies. First, they cannot explain the similarities between students, because the related semantic information is lost during the search. Second, they fail to integrate multiple data sources, even though student behavioral data grows dynamically and relying on a single dataset may lead to biased results. To overcome these limitations, we first propose the concept of a campus behavior information network to represent student behaviors on campus. Based on the constructed network, we then propose SCALE, an algorithm for similar campus lifestyle mining. SCALE calculates student similarity through specific meta-paths with constraints. It keeps the similarity semantics of the original data, and it can integrate multiple data sources in a scalable way while retaining the original calculation results. Because the datasets are large, a parallel strategy is further designed and applied to SCALE for efficiency. Extensive experiments on real campus behavior datasets verify the effectiveness and execution efficiency of SCALE.

Key words: campus behavior information network, heterogeneous information network, student behavior analysis, meta path, similar student search
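
The abstract states that SCALE computes student similarity through meta-paths with constraints, but it does not give the formula. The sketch below is therefore not the SCALE algorithm. It illustrates the general idea using PathSim, a well-known meta-path similarity measure, swapped in here as a stand-in. The records, the Student-Location-Student meta-path, the "same hour slot" constraint, and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical e-card records: (student, location, hour_of_day).
RECORDS = [
    ("alice", "canteen", 7),
    ("alice", "library", 9),
    ("bob", "canteen", 7),
    ("bob", "library", 9),
    ("carol", "canteen", 12),
    ("carol", "gym", 18),
]


def visit_counts(records):
    """Map each student to a Counter over (location, hour) nodes.

    Keying on (location, hour) is the illustrative constraint: a
    Student-Location-Student path instance only counts when both
    visits fall in the same hour slot.
    """
    counts = {}
    for student, location, hour in records:
        counts.setdefault(student, Counter())[(location, hour)] += 1
    return counts


def path_count(a, b):
    """Number of constrained path instances linking two visit profiles."""
    return sum(n * b[node] for node, n in a.items())


def pathsim(counts, x, y):
    """PathSim: 2 * |paths x->y| / (|paths x->x| + |paths y->y|)."""
    denom = path_count(counts[x], counts[x]) + path_count(counts[y], counts[y])
    return 2 * path_count(counts[x], counts[y]) / denom if denom else 0.0


def shared_nodes(counts, x, y):
    """The (location, hour) nodes both students visited: the 'why'."""
    return sorted(node for node in counts[x] if node in counts[y])


counts = visit_counts(RECORDS)
print(pathsim(counts, "alice", "bob"))
print(pathsim(counts, "alice", "carol"))
print(shared_nodes(counts, "alice", "bob"))
```

Because the similarity is built from explicit path instances, the shared nodes can be listed alongside the score, which is one way a meta-path approach can keep the semantics needed to explain why two students are similar. A further data source could be added in the same spirit as another meta-path with its own count table.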