    省略识别及恢复联合模型研究

    A Joint Model for Ellipsis Identification and Recovery

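    • Abstract: Ellipsis is very common in dialogue, and it leaves sentence constituents missing. Question-answering systems often fail to understand such elliptical utterances correctly and therefore produce wrong answers, so ellipsis recovery is essential in question-answering systems. Ellipsis recovery is usually divided into two steps: zero pronoun category recovery and zero pronoun resolution. Existing work mainly executes the two steps sequentially, which causes errors to accumulate. To overcome this problem, a joint model of zero pronoun category recovery and zero pronoun resolution is proposed, which aims to fuse the two steps of ellipsis recovery and thereby improve recovery performance. Experimental results show that, compared with existing methods, introducing the joint model significantly improves ellipsis recovery performance.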

       

      Abstract: An ellipsis is a gap in a sentence that arises from the conventional pragmatic use of grammar. Ellipsis is ubiquitous in daily conversation, especially in Chinese. A question-answering (QA) system can hardly understand elliptical sentences automatically; as a result, it may produce wrong answers and fail to interact naturally with humans. Recovering these ellipses is therefore important for building a better QA system. To recover the elided elements automatically, we divide the recovery task into two parts: zero anaphora identification and zero anaphora resolution. Previous work models these two steps separately and runs them in sequence, which suffers from error accumulation. To address this problem, we propose a joint model that performs zero anaphora identification and zero anaphora resolution simultaneously in a unified framework. We focus on Chinese dialogue text collected from broadcast interviews. The experimental results show that the proposed joint model significantly outperforms the state-of-the-art methods.

       
