Search of Students with Similar Lifestyle Based on Campus Behavior Information Network

王新澳; 段磊; 崔丁山; 卢莉; 顿毅杰; 秦蕊琦

doi:10.7544/issn1000-1239.2020.20190649

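Abstract

Summary: Using new-generation information technologies such as big data analysis and deep learning to understand students' interests, hobbies, and lifestyles, and thereby improve the quality of talent cultivation, has become an important research problem. Finding students with similar lifestyles supports early warning of both psychological and academic problems. Existing algorithms for searching for students with similar lifestyles cannot explain why students are similar, and cannot be extended to integrate additional data sources. To address this, we propose SCALE (similar campus lifestyle miner), an algorithm for searching for students with similar lifestyles based on a campus behavior information network. SCALE computes similarity through meta-paths with constraints. It preserves the similarity semantics of the original data and can integrate additional data sources on that basis. By further decoupling the components of the algorithm, we design a parallelization strategy for SCALE to improve execution efficiency. Experiments on datasets from a real campus environment verify the effectiveness and execution efficiency of SCALE.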

Abstract: It is important to keep track of both the psychological and the academic status of students on campus. Student data covers a wide range of information, such as interests, hobbies, and lifestyles, and many campuses collect it through smart devices such as student e-cards. With the rapid development of new-generation information technology, researchers have in recent years explored ways to improve the quality of talent cultivation by exploiting this data, for example by applying big data analysis to discover subtle but meaningful information that can guide student management. Within this line of research, searching for students with similar lifestyles benefits student management: it can surface useful insights and support early warnings when unusual behavior is detected. Existing algorithms for searching for students with similar lifestyles have two deficiencies. First, they cannot explain why students are similar, because the related semantic information is lost during the search. Second, they cannot integrate multiple data sources, even though student behavioral data grows dynamically and relying on a single dataset may lead to biased results. To overcome these limitations, we first propose the concept of a campus behavior information network to represent student behavior on campus. Based on the constructed network, we then propose an algorithm named SCALE (similar campus lifestyle miner) for similar campus lifestyle mining. SCALE computes student similarity through specific meta-paths with constraints. It preserves the similarity semantics of the original data and can integrate additional data sources in a scalable way while retaining the results already computed. Because the datasets are large, we further design a parallel strategy for SCALE to improve efficiency. Extensive experiments on real campus behavior datasets verify the effectiveness and execution efficiency of SCALE.
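
The abstract does not state SCALE's actual meta-paths, constraints, or similarity formula, so the following Python sketch only illustrates the general idea of similarity over a constrained meta-path. It assumes a single Student-Location-Student meta-path, a caller-supplied constraint on each record, and a PathSim-style normalization; the record layout, the sample data, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical e-card records: (student_id, location, hour_of_day).
RECORDS = [
    ("s1", "canteen", 7), ("s1", "library", 20), ("s1", "canteen", 12),
    ("s2", "canteen", 7), ("s2", "library", 21), ("s2", "gym", 18),
    ("s3", "gym", 18), ("s3", "canteen", 12),
]


def build_profiles(records, constraint):
    """Count visits per student and location, keeping only records
    for which constraint(location, hour) is true."""
    profiles = defaultdict(Counter)
    for student, location, hour in records:
        if constraint(location, hour):
            profiles[student][location] += 1
    return profiles


def path_count(a, b):
    """Number of Student-Location-Student path instances between two profiles."""
    return sum(n * b[loc] for loc, n in a.items())


def similarity(profiles, x, y):
    """PathSim-style score in [0, 1] (an assumed normalization)."""
    px, py = profiles.get(x, Counter()), profiles.get(y, Counter())
    denom = path_count(px, px) + path_count(py, py)
    return 2 * path_count(px, py) / denom if denom else 0.0


def search(profiles, query, k=3):
    """Return up to k (student, score, shared_locations) tuples, best first.
    The shared locations act as a simple explanation of the match."""
    q = profiles.get(query, Counter())
    results = []
    for other, prof in profiles.items():
        if other == query:
            continue
        shared = sorted(set(q) & set(prof))
        results.append((other, similarity(profiles, query, other), shared))
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:k]


if __name__ == "__main__":
    all_day = build_profiles(RECORDS, lambda loc, hour: True)
    mornings = build_profiles(RECORDS, lambda loc, hour: hour < 10)
    print(search(all_day, "s1"))
    print(search(mornings, "s1"))
```

Because each constrained meta-path yields its own profile table, a new data source could in principle be added as another table and scored separately, and the per-pair scores are independent of one another, which is what would make them easy to compute in parallel. Both points are illustrations of the abstract's claims, not a description of how SCALE implements them.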
