Abstract:
Recent literature in computational terminology has shown increasing interest in identifying semantic relations between concepts, which are important for large-scale natural language applications such as question answering (QA), information retrieval (IR), and machine translation (MT). Taking a natural-language-oriented Web answer system, named NL-WAS, as the application background, this paper proposes a novel approach to generating a semantic network of concepts from a semi-structured corpus. According to the characteristics of the corpus, suitable document extraction templates are adopted for four kinds of relations between concepts, namely synonymy, hyponymy, hypernymy, and parataxis. Moreover, different window sizes are designed to calculate the degree of relatedness between concepts; by choosing a threshold from experimental results and switching the roles of the concepts, all of these kinds of relations can be obtained. Finally, the concept semantic network is optimized using suitable rules. The proposed algorithm has been implemented and applied in NL-WAS. The results show that the semantic network of concepts effectively improves the question search results of NL-WAS.
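
As an illustration of the window-based step, the following Python sketch scores pairs of concepts by how often they co-occur within a fixed token window and then keeps the pairs above a threshold. The abstract does not state the scoring formula, so the Dice-style score, the function names `relatedness` and `related_pairs`, and the default window size are assumptions made for illustration only, not the method used in NL-WAS.

```python
from collections import Counter
from itertools import combinations


def relatedness(sentences, concepts, window=3):
    """Score concept pairs by co-occurrence within `window` tokens.

    Assumption: a Dice-style score, 2 * cooc(a, b) / (freq(a) + freq(b)).
    The original paper may use a different measure.
    """
    freq = Counter()       # occurrences of each concept
    pair_freq = Counter()  # co-occurrences of each unordered concept pair
    for tokens in sentences:
        # Positions of known concepts in this tokenized sentence.
        positions = [(i, t) for i, t in enumerate(tokens) if t in concepts]
        for _, t in positions:
            freq[t] += 1
        # Count distinct concepts that appear at most `window` tokens apart.
        for (i, a), (j, b) in combinations(positions, 2):
            if a != b and j - i <= window:
                pair_freq[frozenset((a, b))] += 1
    return {
        pair: 2 * n / sum(freq[c] for c in pair)
        for pair, n in pair_freq.items()
    }


def related_pairs(scores, threshold):
    """Keep only the pairs whose score reaches the chosen threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}
```

In such a sketch, the window size and the threshold would both be tuned on experimental results, as the abstract describes; deciding which of the four relation types a retained pair belongs to is a separate step that this example does not cover.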