
    WSD Method Based on Heterogeneous Relation Graph

    • Abstract: Word sense disambiguation (WSD), one of the most important problems in natural language processing, aims to identify the intended meaning (sense) of a word in context. Traditional knowledge-based WSD methods usually rely on a single type of knowledge (semantic or co-occurrence relations) and ignore the complementarity between different types. To address this problem, this paper reconstructs the traditional graph-based WSD model and, supported by comparative experiments, proposes a WSD model based on a heterogeneous relation graph, which integrates multiple types of disambiguation knowledge into a single graph so that they can disambiguate cooperatively. Because not all types of knowledge are equally important for WSD, an automatic parameter estimation method adapted from simulated annealing is designed and implemented to estimate the weight of each relation type and thereby optimize each type's contribution to disambiguation. The proposed model is unsupervised; it makes full use of multi-source knowledge and alleviates the data sparseness and knowledge acquisition bottleneck problems. Evaluated on the SemEval-2007 multilingual Chinese-English lexical task, the method significantly outperforms the baseline method and also performs better than the best participating system in that evaluation.
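    The weight-estimation idea can be illustrated with a minimal sketch. The abstract does not specify how senses are scored on the graph or which objective the annealing optimizes, so everything below is an assumption made for illustration: senses are scored by weighted degree, the objective is a hypothetical margin between two candidate senses, and the names `score_senses`, `anneal_weights`, and the toy edges are invented here, not taken from the paper.

```python
import math
import random


def score_senses(edges, weights):
    """Score each sense node by its weighted degree (an assumed scoring rule).

    edges: iterable of (sense_a, sense_b, relation_type, strength)
    weights: dict mapping relation_type -> weight of that relation type
    """
    scores = {}
    for a, b, rel, strength in edges:
        w = weights[rel] * strength
        scores[a] = scores.get(a, 0.0) + w
        scores[b] = scores.get(b, 0.0) + w
    return scores


def anneal_weights(objective, relation_types, steps=500, t0=1.0,
                   cooling=0.99, step_size=0.1, seed=0):
    """Maximize objective(weights) over per-relation weights in [0, 1]."""
    rng = random.Random(seed)
    current = {r: 0.5 for r in relation_types}
    current_score = objective(current)
    best, best_score = dict(current), current_score
    temp = t0
    for _ in range(steps):
        # Perturb one randomly chosen relation weight, clipped to [0, 1].
        cand = dict(current)
        r = rng.choice(relation_types)
        cand[r] = min(1.0, max(0.0, cand[r] + rng.uniform(-step_size, step_size)))
        cand_score = objective(cand)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = cand, cand_score
            if current_score > best_score:
                best, best_score = dict(current), current_score
        temp *= cooling
    return best, best_score


# Toy heterogeneous graph: two candidate senses of "bank" linked to context
# senses by two relation types (all values are made up for illustration).
edges = [
    ("bank#finance", "money#currency", "semantic", 0.9),
    ("bank#river", "money#currency", "cooccurrence", 0.4),
    ("bank#finance", "loan#debt", "cooccurrence", 0.8),
]


def margin(weights):
    """Hypothetical objective: how far the intended sense leads the other."""
    scores = score_senses(edges, weights)
    return scores["bank#finance"] - scores["bank#river"]


best, best_score = anneal_weights(margin, ["semantic", "cooccurrence"])
```

    Here `best` maps each relation type to its estimated weight. A real system would replace `margin` with whatever criterion the model actually optimizes and would likely use a graph ranking algorithm rather than weighted degree.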

       
