Abstract:
This paper systematically explores several approaches to noun phrase anaphoricity determination for coreference resolution in both English and Chinese. First, a rule-based method is used to detect non-anaphors that are insensitive to context or follow obvious patterns. Then, both flat feature-based and structured tree kernel-based methods are used to determine the non-anaphors that are sensitive to context. Finally, a composite kernel is proposed to combine the flat features with the structured ones and further improve performance. Experimental results on both the ACE 2003 English corpus and the ACE 2005 Chinese corpus show that all the proposed methods perform well on anaphoricity determination. In addition, the anaphoricity determination module is systematically applied to coreference resolution. Experimental results also show that proper anaphoricity determination can significantly improve the performance of coreference resolution in both English and Chinese.