Abstract:
Phrase equivalence pairs are very useful for bilingual lexicography, machine translation, and cross-language information retrieval. This paper proposes a new phrase alignment method in which the translation head-phrase is obtained from dictionary-based word alignment, which is highly reliable, and the statistical translation boundary is determined from the translation extending reliability. All candidate translations of a source-language phrase are extracted by combining the translation head-phrase with the statistical translation boundary. A linear combination model then evaluates these candidates, and the most probable one is selected. At the same time, a greedy algorithm eliminates crossing conflicts between the translation boundaries of source-language phrases. Experimental results show that the new method achieves a precision of 82.76%, which is better than other approaches in an open test.
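
The selection step described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the feature values, the weights, the `Candidate` type, and the definition of a crossing conflict (two target spans that partially overlap without one containing the other) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate translation of one source-language phrase."""
    source_phrase: str
    target_span: tuple  # (start, end) token indices in the target sentence, end exclusive
    features: tuple     # assumed feature values, e.g. reliability scores


def linear_score(features, weights):
    # Linear combination: weighted sum of the feature values.
    return sum(w * f for w, f in zip(weights, features))


def spans_cross(a, b):
    # Assumed conflict definition: the spans overlap, but neither contains the other.
    overlap = a[0] < b[1] and b[0] < a[1]
    a_in_b = b[0] <= a[0] and a[1] <= b[1]
    b_in_a = a[0] <= b[0] and b[1] <= a[1]
    return overlap and not (a_in_b or b_in_a)


def select_translations(candidates, weights):
    # Greedy pass: visit candidates from best to worst score and keep one
    # only if its source phrase is still unassigned and its target span
    # does not cross any span already kept.
    ranked = sorted(candidates,
                    key=lambda c: linear_score(c.features, weights),
                    reverse=True)
    chosen = {}
    for cand in ranked:
        if cand.source_phrase in chosen:
            continue
        if any(spans_cross(cand.target_span, kept.target_span)
               for kept in chosen.values()):
            continue
        chosen[cand.source_phrase] = cand
    return chosen


# Toy data: two source phrases, each with two candidate target spans.
candidates = [
    Candidate("A", (0, 3), (0.9, 0.8)),
    Candidate("A", (0, 2), (0.5, 0.4)),
    Candidate("B", (2, 5), (0.7, 0.9)),  # crosses the best span for "A"
    Candidate("B", (3, 5), (0.6, 0.5)),
]
result = select_translations(candidates, weights=(0.6, 0.4))
# "A" keeps span (0, 3); "B" falls back to span (3, 5).
```

In this toy run, the highest-scoring candidate for phrase "B" is rejected because its span crosses the span already accepted for phrase "A", so the greedy pass settles on the next-best non-crossing candidate.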