    Wang Junhua, Zuo Wanli, Yan Zhao. Word Semantic Similarity Measurement Based on Naïve Bayes Model[J]. Journal of Computer Research and Development, 2015, 52(7): 1499-1509. DOI: 10.7544/issn1000-1239.2015.20140383

    Word Semantic Similarity Measurement Based on Naïve Bayes Model

    • Measuring semantic similarity between words is a classical and active problem in natural language processing, and progress on it has a large impact on applications such as word sense disambiguation, machine translation, ontology mapping, and computational linguistics. A novel approach is proposed that measures word semantic similarity by combining a Naïve Bayes model with a knowledge base. The method proceeds in four steps: first, attribute variables are extracted from WordNet; second, conditional probability distributions are generated from statistics and piecewise linear interpolation; third, posterior probabilities are obtained through Bayesian inference; finally, word semantic similarity is quantified. The main contributions are (1) definitions of distance and depth between word pairs that require little computation and discriminate well between the characteristics of word senses, and (2) a word semantic similarity measure based on the Naïve Bayes model. The experiment is conducted on the benchmark data set R&G (65) using 5-fold cross validation. The sample Pearson correlation between test results and human judgments is 0.912, a 0.4% improvement over the existing best practice and a 7%~13% improvement over classical methods. The Spearman correlation between test results and human judgments is 0.873, a 10%~20% improvement over classical methods. The computational complexity of the method is comparable to that of the classical methods, which indicates that integrating a Naïve Bayes model with a knowledge base to measure word semantic similarity is reasonable and effective.
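
    The four steps in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: the three similarity classes, their prior, the knot values in `TABLES`, the integer-like `distance` and `depth` inputs, and the final step of scoring a pair by the posterior-weighted class value on a 0 to 4 scale. None of these numbers come from the paper.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise linear interpolation through the knots (xs[i], ys[i]).

    Values outside the knot range are clamped to the end points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical similarity classes with a representative score each (assumption).
SCORES = {"low": 0.0, "mid": 2.0, "high": 4.0}

# Hypothetical class prior, as might be counted from a training fold (assumption).
PRIOR = {"low": 0.4, "mid": 0.3, "high": 0.3}

# Hypothetical likelihood knots P(attribute value | class); illustrative only.
TABLES = {
    "distance": {
        "low": ([0, 4, 8, 16], [0.02, 0.10, 0.40, 0.80]),
        "mid": ([0, 4, 8, 16], [0.10, 0.50, 0.30, 0.10]),
        "high": ([0, 4, 8, 16], [0.80, 0.30, 0.05, 0.01]),
    },
    "depth": {
        "low": ([0, 4, 8, 12], [0.70, 0.30, 0.10, 0.05]),
        "mid": ([0, 4, 8, 12], [0.20, 0.40, 0.40, 0.30]),
        "high": ([0, 4, 8, 12], [0.05, 0.20, 0.50, 0.70]),
    },
}


def posterior(features):
    """Naive Bayes posterior over classes for a dict of attribute values."""
    joint = {}
    for cls, p in PRIOR.items():
        # Naive independence assumption: multiply per-attribute likelihoods.
        for name, value in features.items():
            xs, ys = TABLES[name][cls]
            p *= interp(value, xs, ys)
        joint[cls] = p
    total = sum(joint.values())
    return {cls: p / total for cls, p in joint.items()}


def similarity(features):
    """Posterior-weighted class score; an assumed way to quantify similarity."""
    post = posterior(features)
    return sum(SCORES[cls] * post[cls] for cls in post)


close_pair = similarity({"distance": 1, "depth": 9})
far_pair = similarity({"distance": 14, "depth": 2})
print(close_pair > far_pair)
```

    In this sketch a pair with a short distance and a deep common ancestor scores higher than a distant, shallow pair. In the setting described in the abstract, the attribute values would be derived from WordNet and the tables estimated on the training folds of R&G (65).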