ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue 9: 2114-2122.

### Chinese Zero Anaphora Resolution with Markov Logic

Song Yang1, Wang Houfeng1,2

1. Institute of Computational Linguistics, Peking University, Beijing 100871
2. Key Laboratory of Computational Linguistics (Peking University), Ministry of Education, Beijing 100871
• Online: 2015-09-01

Abstract: Chinese zero anaphora resolution comprises two correlated subtasks: zero pronoun detection and zero anaphora resolution. Zero pronoun detection recognizes all zero anaphors in a given text; these are mainly null subjects or null objects, and they occur widely in Chinese, Japanese and Italian. Zero anaphora resolution determines the antecedent of each recognized zero anaphor, that is, a noun, pronoun or common noun phrase that appears in the text before the detected zero anaphor. Traditional approaches generally rely on commonly used learning features to build independent classifiers for zero pronoun detection and zero anaphora resolution, but they cannot capture the associations between the two subtasks, e.g. that a recognized zero anaphor must be resolved, or that anything to be resolved must be a zero anaphor. Our method combines the two subtasks in a unified machine learning framework based on Markov logic, which supports joint inference and joint learning. Local formulas describe zero pronoun detection and zero anaphora resolution separately, and global formulas represent the associations between the two subtasks. We find that the joint learning model, which performs inference during learning, acquires more effective feature weights than the independent learning model, which learns without inference. Experimental results on the OntoNotes 3.0 Chinese dataset show that our joint learning model achieves better results than the independent learning model and other baseline methods.
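
The interaction between local and global formulas can be illustrated with a toy sketch. This is not the authors' model or code. It assumes a single candidate gap, hypothetical local weights (`det_weight` for detection, `res_weights` for each candidate antecedent), and two hard global constraints paraphrased from the abstract: a link to an antecedent requires the gap to be a zero pronoun, and a zero pronoun must be linked to exactly one antecedent. Joint inference picks the highest-scoring world that satisfies both constraints, while the independent baseline decides each subtask separately and can produce inconsistent output.

```python
def joint_map(det_weight, res_weights):
    """Toy joint inference for one candidate gap (hypothetical setup).

    det_weight:  weight of the local formula "this gap is a zero pronoun".
    res_weights: dict mapping each candidate antecedent to the weight of
                 the local formula "this gap refers to that antecedent".

    Only worlds that satisfy the hard global constraints are scored:
      - (False, None): not a zero pronoun, no link, score 0.0
      - (True, a):     zero pronoun linked to exactly one antecedent a
    """
    best_world, best_score = (False, None), 0.0
    for antecedent, weight in res_weights.items():
        score = det_weight + weight
        if score > best_score:
            best_world, best_score = (True, antecedent), score
    return best_world, best_score


def independent(det_weight, res_weights):
    """Baseline: decide detection and resolution separately."""
    is_zero_pronoun = det_weight > 0
    linked = None
    if res_weights:
        top = max(res_weights, key=res_weights.get)
        if res_weights[top] > 0:
            linked = top
    return is_zero_pronoun, linked


# Case 1: weak detection evidence, strong resolution evidence.
case1 = (-0.5, {"NP1": 1.5, "NP2": 0.25})
# Case 2: mild detection evidence, no plausible antecedent.
case2 = (0.5, {"NP1": -2.0})

print(independent(*case1))  # link proposed for a gap rejected by detection
print(joint_map(*case1))
print(independent(*case2))  # zero pronoun detected but left unresolved
print(joint_map(*case2))
```

In both cases the independent decisions violate one of the global constraints, whereas the joint decision trades the two local weights off against each other and stays consistent. A real Markov logic system would learn the weights and run inference over all gaps and antecedents at once rather than enumerate worlds for one gap.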
