Abstract:
Attribute-value extraction is an important and challenging task in information extraction that aims to automatically discover the values of attributes of named entities. In this paper, we focus on extracting these values from unstructured Chinese text. To keep models computationally tractable, most current methods of attribute-value extraction use only local features. As a result, they may not make full use of global information related to attribute values. We propose a novel approach based on global features to enhance the performance of attribute-value extraction. Two types of global features are defined to capture information beyond local features: boundary distribution features and value-name dependency features. To our knowledge, this is the first attempt to acquire attribute values using global features. We then propose a new perceptron algorithm that can use all types of global features and can learn the parameters of local and global features simultaneously. Experiments are carried out on different kinds of attributes of several entity categories. Experimental results show that both the precision and recall of the proposed approach are significantly higher than those of a CRF model and an averaged perceptron that use only local features. The proposed approach also generalizes well to the open domain.
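
The idea of learning local and global feature weights in one perceptron update can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes the value is a contiguous token span chosen from an enumerated candidate set, and every name in it (`local_features`, `global_features`, `train`, the feature templates) is hypothetical. In particular, the two global templates are simple stand-ins and are not the paper's boundary distribution or value-name dependency features.

```python
from collections import Counter


def local_features(tokens, span):
    """Hypothetical local features: tokens inside the span and its neighbours."""
    start, end = span  # end is exclusive
    feats = Counter()
    for tok in tokens[start:end]:
        feats["in=" + tok] += 1
    feats["prev=" + (tokens[start - 1] if start > 0 else "<s>")] += 1
    feats["next=" + (tokens[end] if end < len(tokens) else "</s>")] += 1
    return feats


def global_features(tokens, span, attr_name):
    """Hypothetical global features that depend on the whole sentence."""
    start, end = span
    feats = Counter()
    if attr_name in tokens:
        # Stand-in for a dependency between the attribute name and the value.
        name_pos = tokens.index(attr_name)
        feats["name_dist=" + str(min(abs(start - name_pos), 3))] += 1
        if name_pos < start:
            feats["name_before_value"] += 1
    # Stand-in for a boundary-related cue: bucketed span length.
    feats["len=" + str(min(end - start, 3))] += 1
    return feats


def features(tokens, span, attr_name):
    """Joint feature vector: local and global features share one weight space."""
    feats = local_features(tokens, span)
    feats.update(global_features(tokens, span, attr_name))  # adds counts
    return feats


def score(weights, feats):
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def candidates(tokens, max_len=3):
    """All spans of up to max_len tokens, as (start, end) with end exclusive."""
    n = len(tokens)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + max_len, n) + 1)]


def predict(weights, tokens, attr_name):
    return max(candidates(tokens),
               key=lambda span: score(weights, features(tokens, span, attr_name)))


def train(data, epochs=5):
    """Plain (unaveraged) perceptron: one update moves local and global weights."""
    weights = {}
    for _ in range(epochs):
        for tokens, attr_name, gold in data:
            pred = predict(weights, tokens, attr_name)
            if pred != gold:
                for name, value in features(tokens, gold, attr_name).items():
                    weights[name] = weights.get(name, 0.0) + value
                for name, value in features(tokens, pred, attr_name).items():
                    weights[name] = weights.get(name, 0.0) - value
    return weights


# Toy usage with made-up data; for Chinese text the tokens could be characters.
toy_tokens = ["height", ":", "180", "cm"]
toy_data = [(toy_tokens, "height", (2, 4))]
toy_weights = train(toy_data)
print(predict(toy_weights, toy_tokens, "height"))  # (2, 4)
```

Because both feature groups are merged into a single sparse vector before scoring, a mistaken prediction adjusts their weights together, which is the sense in which the parameters are learned simultaneously. Enumerating candidate spans is only practical for short inputs; it is used here to keep the sketch self-contained.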