ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (5): 1005-1013. doi: 10.7544/issn1000-1239.2015.20131552


Chinese Text Deception Detection Based on Ensemble Learning

Zhang Hu, Tan Hongye, Qian Yuhua, Li Ru, Chen Qian   

  1. (School of Computer & Information Technology, Shanxi University, Taiyuan 030006)
  • Online:2015-05-01

Abstract: Deception detection is an important problem in information security. Existing research shows that one third of interpersonal communication involves potential deception, and the growing volume of Web information contains large numbers of deceptive messages. When deception threatens people's lives, the survival of enterprises, or the stability of a country, overlooking it may lead to incalculable losses. In massive amounts of information, non-deceptive texts far outnumber deceptive ones, so existing methods remain ineffective and inefficient at detecting deceptive messages, and an automated method that helps people flag possibly deceptive messages is desirable. In this paper, we build a deception detection model based on ensemble learning to address the class imbalance of existing data sets. First, a novel bisecting k-means method is proposed to partition the training sample set. Separate classifiers are then trained on each pair of positive and negative sample subsets, and each classifier computes a category value for every test sample. Finally, a novel min-max modular approach integrates the per-classifier category results. Experimental results verify the effectiveness of this method.

Key words: deception, deception detection, ensemble learning, cutting samples, min-max modular support vector machine (M3-SVM)
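
The pipeline outlined in the abstract (partition the training samples, train one classifier per positive/negative subset pair, combine with min-max rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the splitter below is a plain textbook bisecting 2-means rather than the paper's novel variant, a nearest-centroid margin stands in for the SVM base classifier, the combination is the standard min-max modular rule (MIN over negative subsets, then MAX over positive subsets), and all function names and the toy 2-D data are assumptions made for illustration only.

```python
from math import dist  # Python 3.8+


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def two_means(points, iters=10):
    """Split points into two clusters with plain 2-means (deterministic init)."""
    c0 = points[0]
    c1 = max(points, key=lambda p: dist(p, c0))  # farthest point from c0
    a, b = list(points), []
    for _ in range(iters):
        a = [p for p in points if dist(p, c0) <= dist(p, c1)]
        b = [p for p in points if dist(p, c0) > dist(p, c1)]
        if not a or not b:
            break
        c0, c1 = mean(a), mean(b)
    return a, b


def bisecting_split(points, k):
    """Textbook bisecting k-means: keep splitting the largest cluster in two."""
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len)
        biggest = clusters.pop()
        a, b = two_means(biggest) if len(biggest) > 1 else (biggest, [])
        if not a or not b:  # cluster cannot be split any further
            clusters.append(biggest)
            break
        clusters += [a, b]
    return clusters


def train_m3(pos, neg, k_pos, k_neg):
    """One centroid per subset; each (pos, neg) centroid pair is one module."""
    pos_c = [mean(c) for c in bisecting_split(pos, k_pos)]
    neg_c = [mean(c) for c in bisecting_split(neg, k_neg)]
    return pos_c, neg_c


def m3_score(model, x):
    """Min-max rule: MIN over negative subsets, then MAX over positive subsets.

    The pairwise score is a stand-in for an SVM decision value; it is
    positive when x lies closer to the positive centroid of the pair.
    """
    pos_c, neg_c = model
    return max(min(dist(x, n) - dist(x, p) for n in neg_c) for p in pos_c)


def predict(model, x):
    """Return 1 (deceptive) if the combined score is positive, else 0."""
    return 1 if m3_score(model, x) > 0 else 0


# Toy, made-up data: few "deceptive" points, many "non-deceptive" points.
deceptive = [(0, 0), (0, 1), (1, 0)]
truthful = [(5, 5), (5, 6), (6, 5), (6, 6), (-5, 5), (-5, 6), (-6, 5), (-6, 6)]
model = train_m3(deceptive, truthful, k_pos=1, k_neg=2)
print(predict(model, (0.5, 0.5)), predict(model, (5.5, 5.5)))
```

Splitting the majority class into several subsets means each module sees a more balanced positive/negative pair, which is the motivation the abstract gives for partitioning before training. The MIN step requires a sample to look positive against every negative subset before it is accepted.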
