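• Outstanding Sci-Tech Journal of China
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality sci-tech journal in the computing field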
Zhang Libo, Sun Yihan, Luo Tiejian. Calculate Semantic Similarity Based on Large Scale Knowledge Repository[J]. Journal of Computer Research and Development, 2017, 54(11): 2576-2585. DOI: 10.7544/issn1000-1239.2017.20160578

Calculate Semantic Similarity Based on Large Scale Knowledge Repository

More Information
  • Published Date: October 31, 2017
  • As the total body of human knowledge continues to grow, semantic analysis based on structured big data generated by humans is becoming increasingly important in applications such as recommender systems and information retrieval. Computing semantic similarity is a key problem in these fields. Previous studies achieved breakthroughs by applying large-scale knowledge repositories, represented by Wikipedia, but did not fully utilize the path information in Wikipedia. In this paper, we summarize and analyze previous Wikipedia-based algorithms for evaluating semantic similarity. On this foundation, we propose a bilateral shortest paths algorithm, which evaluates the similarity between words and between texts in a way modeled on how human beings think, so that it takes full advantage of the path information in the knowledge repository. We extract from Wikipedia the hyperlink structure among nodes whose granularity is finer than that of articles, then verify the universal connectivity of Wikipedia and evaluate the average shortest path between any two articles. In addition, we evaluate the proposed algorithm on word similarity and text similarity using public datasets, and the results indicate that the algorithm performs well. Finally, we sum up the advantages and disadvantages of the proposed algorithm and suggest directions for follow-up work.
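
The abstract does not spell out the bilateral shortest paths algorithm, so the following is only a minimal sketch of one plausible reading: a bidirectional breadth-first search over a directed hyperlink graph, followed by a toy mapping from path length to similarity. The function names (`build_graph`, `shortest_path_length`, `similarity`), the tiny link list, and the `1 / (1 + d)` scoring are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_graph(links):
    """Build forward and reverse adjacency maps from (source, target) link pairs."""
    forward, backward = defaultdict(set), defaultdict(set)
    for src, dst in links:
        forward[src].add(dst)
        backward[dst].add(src)
    return forward, backward


def shortest_path_length(forward, backward, start, goal):
    """Return the number of links on a shortest directed path, or None if none exists.

    One full BFS level is expanded per iteration, always on the side with the
    smaller frontier: forward from `start`, or backward from `goal`.
    """
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}
    frontier_f, frontier_b = {start}, {goal}
    while frontier_f and frontier_b:
        expand_forward = len(frontier_f) <= len(frontier_b)
        if expand_forward:
            frontier, dist, other, graph = frontier_f, dist_f, dist_b, forward
        else:
            frontier, dist, other, graph = frontier_b, dist_b, dist_f, backward
        next_frontier, best = set(), None
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor in other:
                    # The two searches meet here; record the joined path length.
                    candidate = dist[node] + 1 + other[neighbor]
                    best = candidate if best is None else min(best, candidate)
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    next_frontier.add(neighbor)
        if best is not None:
            return best
        if expand_forward:
            frontier_f = next_frontier
        else:
            frontier_b = next_frontier
    return None


def similarity(forward, backward, a, b):
    """Toy score (an assumption): 1 / (1 + d), d = shorter of the two directed distances."""
    lengths = [
        shortest_path_length(forward, backward, a, b),
        shortest_path_length(forward, backward, b, a),
    ]
    reachable = [d for d in lengths if d is not None]
    return 1.0 / (1.0 + min(reachable)) if reachable else 0.0


# Hypothetical hyperlinks between article nodes, for illustration only.
LINKS = [
    ("Cat", "Mammal"),
    ("Mammal", "Animal"),
    ("Animal", "Dog"),
    ("Dog", "Mammal"),
    ("Rock", "Mineral"),
]
FORWARD, BACKWARD = build_graph(LINKS)
```

Calling `similarity(FORWARD, BACKWARD, "Cat", "Dog")` on this toy graph scores the pair by the three-link path from `Cat` to `Dog`, while an unconnected pair such as `Cat` and `Rock` scores zero. Searching from both ends keeps the explored region small, which matters on a graph the size of Wikipedia's link structure.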