Abstract:
Entity ranking is an important step in related entity finding (REF). Although much work has been done on Wikipedia-based entity ranking for REF, several issues remain: the target type is acquired only semi-automatically, the target type is coarse-grained, entity-type relevancy is judged in a binary fashion, and the effect of stop words is ignored when calculating entity-relation relevancy. This paper designs a framework that ranks entities by combining three components (entity relevancy, entity-type relevancy, and entity-relation relevancy) and selects the best combination method by comparing experimental results. A novel approach is proposed to calculate entity-type relevancy: it automatically acquires the fine-grained target type and the discriminative rules of its hyponym Wikipedia categories through inductive learning, and it calculates entity-type relevancy by counting the number of categories that satisfy the discriminative rules. This paper also proposes a “cut stop words to rebuild relation” approach to calculate the entity-relation relevancy between a candidate entity and the source entity. Experimental results demonstrate that the proposed approaches effectively improve entity-ranking results and reduce computation time.
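
As a rough illustration of the three components named in the abstract, the following Python sketch shows one way they might fit together. It is not the paper's implementation. The stop-word list, the term-overlap measure used for relation relevancy, the predicate form of the discriminative rules, and the smoothed product in `combined_score` are all assumptions made for this example; the abstract states only that the best combination method is chosen experimentally.

```python
import re
from typing import Callable

# Hypothetical stop-word list; the list used in the paper is not given here.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "by", "that", "which",
              "is", "are", "to", "for", "with"}

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rebuild_relation(relation: str) -> list[str]:
    """Drop stop words from the relation text, keeping the content terms."""
    return [t for t in tokenize(relation) if t not in STOP_WORDS]

def relation_relevancy(relation: str, context: str) -> float:
    """Assumed measure: fraction of rebuilt relation terms found in the context."""
    terms = rebuild_relation(relation)
    if not terms:
        return 0.0
    context_tokens = set(tokenize(context))
    return sum(t in context_tokens for t in terms) / len(terms)

def type_relevancy(categories: list[str],
                   rules: list[Callable[[str], bool]]) -> int:
    """Count the categories that satisfy at least one discriminative rule."""
    return sum(1 for c in categories if any(rule(c) for rule in rules))

def combined_score(entity_rel: float, type_rel: float, relation_rel: float) -> float:
    """Assumed combination: product with add-one smoothing on the last two terms."""
    return entity_rel * (1 + type_rel) * (1 + relation_rel)
```

Because `type_relevancy` returns a count and not a yes/no decision, an entity with several matching categories scores higher than one with a single match, which is the graded alternative to the binary judgment criticized above.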