Word Semantic Similarity Measurement Based on Naïve Bayes Model

Wang Junhua; Zuo Wanli; Yan Zhao

doi:10.7544/issn1000-1239.2015.20140383

Wang Junhua, Zuo Wanli, Yan Zhao. Word Semantic Similarity Measurement Based on Naïve Bayes Model[J]. Journal of Computer Research and Development, 2015, 52(7): 1499-1509. DOI: 10.7544/issn1000-1239.2015.20140383

Abstract

Measuring semantic similarity between words is a classical and active problem in natural language processing, and progress on it has a strong impact on applications such as word sense disambiguation, machine translation, ontology mapping, and computational linguistics. This paper proposes a novel approach that measures word semantic similarity by combining the Naïve Bayes model with a knowledge base. First, attribute variables are extracted from WordNet; next, conditional probability distributions are generated by statistics and piecewise linear interpolation; then, posterior probabilities are obtained through Bayesian inference; finally, word semantic similarity is quantified. The main contributions are definitions of distance and depth between word pairs that require little computation and discriminate well between the characteristics of word senses, and a word semantic similarity measure based on the Naïve Bayes model. The experiment is conducted on the benchmark data set R&G(65) with 5-fold cross validation. The sample Pearson correlation between test results and human judgments is 0.912, a 0.4% improvement over the best existing method and a 7%~13% improvement over classical methods. The Spearman correlation between test results and human judgments is 0.873, a 10%~20% improvement over classical methods. The computational complexity of the method is comparable to that of the classical methods, which indicates that integrating the Naïve Bayes model with a knowledge base to measure word semantic similarity is reasonable and effective.
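
The pipeline named in the abstract (attribute values, conditional probabilities from piecewise linear interpolation, a Bayesian posterior, then a similarity score) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the similarity levels, the knot positions, the probability values, and the use of the posterior expectation as the final score are all assumptions made for illustration, and the attribute values are passed in directly instead of being extracted from WordNet.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise linear interpolation through (xs, ys), clamped at the end knots."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def posterior(features, priors, tables):
    """Naive Bayes posterior over similarity levels.

    P(level | features) is proportional to
    P(level) * product of P(feature | level) over all features.
    """
    scores = {}
    for level, prior in priors.items():
        p = prior
        for name, value in features.items():
            xs, ys = tables[level][name]
            p *= interp(value, xs, ys)
        scores[level] = p
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all levels have zero likelihood for these features")
    return {level: p / total for level, p in scores.items()}


def similarity(post):
    """Assumed scoring rule: expected similarity level under the posterior."""
    return sum(level * p for level, p in post.items())


# Hypothetical model: three similarity levels on a 0-4 scale, uniform priors,
# and made-up conditional probabilities sampled at knots 0, 5 and 10.
KNOTS = [0, 5, 10]
PRIORS = {0: 1 / 3, 2: 1 / 3, 4: 1 / 3}
TABLES = {
    0: {"distance": (KNOTS, [0.05, 0.15, 0.80]), "depth": (KNOTS, [0.6, 0.3, 0.1])},
    2: {"distance": (KNOTS, [0.20, 0.60, 0.20]), "depth": (KNOTS, [0.3, 0.4, 0.3])},
    4: {"distance": (KNOTS, [0.80, 0.15, 0.05]), "depth": (KNOTS, [0.1, 0.3, 0.6])},
}

close_pair = {"distance": 1, "depth": 8}  # short path, deep in the hierarchy
far_pair = {"distance": 9, "depth": 2}    # long path, shallow in the hierarchy

print(similarity(posterior(close_pair, PRIORS, TABLES)))
print(similarity(posterior(far_pair, PRIORS, TABLES)))
```

With these toy tables, the pair with a short distance and a large depth scores higher than the pair with a long distance and a small depth. In a real setting the tables would be estimated from training word pairs, which is the role the abstract assigns to statistics and interpolation.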
