Abstract:
Scientific collaboration is an important route to academic achievement, and much high-level research is the product of cooperation. Studying collaboration potential can guide scholars in choosing collaborators and help maximize the efficiency of scientific research. However, the rapid growth of scholarly big data has made it difficult to choose collaborators effectively. To address this problem, sample features are constructed from scholar-paper big data. After feature analysis and optimization, and with comprehensive consideration of both the individual attributes of scholars and the related attributes of their papers, institutions, research interests, etc., features are built along several dimensions, including paper title, paper rank, number of papers, time, and coauthor order. The journal or conference level of a paper is taken as the sample label of each collaborator sequence pair and indicates the potential of the current collaborators. With a feature set constructed to match the problem of scientific collaborator potential prediction, the problem is treated as a classification task, and a scientific collaborator potential prediction model based on ensemble learning classification is proposed, making use of the strong learning ability of ensemble methods. In experiments, the accuracy, recall, and F1 score of the proposed model are much higher than those of traditional machine learning methods, and they converge to high values (above 80%) with few samples and little time, indicating the superiority of the proposed model.
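The pipeline summarized above (features per collaborator pair, a label derived from venue level, an ensemble classifier, and an F1 evaluation) can be illustrated with a minimal sketch. It is not the model proposed in this paper: it substitutes plain bagging over one-split decision stumps for whatever ensemble method the authors use. The three feature columns, the binary label, and every data value are invented toy assumptions, and all function names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Fit a one-split stump: (feature, threshold, left_label, right_label)."""
    fallback = majority(y)
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = majority(left) if left else fallback
            r_lab = majority(right) if right else fallback
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample of the training pairs."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_bagging(stumps, row):
    """Majority vote over all stumps."""
    return majority([predict_stump(s, row) for s in stumps])

def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class; 0.0 when precision and recall are both 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy, invented data. Assumed columns per collaborator pair:
# [number of joint papers, mean paper rank (1 = best), years of coauthorship].
X = [[12, 1.2, 6], [9, 1.5, 5], [15, 1.1, 8], [10, 1.4, 7],
     [2, 3.5, 1], [1, 3.8, 2], [3, 3.2, 1], [2, 3.9, 3]]
# Assumed label: 1 if the pair's papers appear in high-level venues, else 0.
y = [1, 1, 1, 1, 0, 0, 0, 0]

model = fit_bagging(X, y)
new_pair = [11, 1.3, 6]  # hypothetical unseen collaborator pair
print("predicted label:", predict_bagging(model, new_pair))
```

A real feature set would also need an encoding for the textual dimensions mentioned above, such as paper titles, which this sketch omits.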