    Cui Shiqi, Liu Qun, Meng Yao, Yu Hao, Nishino Fumihito. New Word Detection Based on Large-Scale Corpus[J]. Journal of Computer Research and Development, 2006, 43(5): 927-932.

    New Word Detection Based on Large-Scale Corpus

    New word detection is part of unknown word detection. Because natural languages keep evolving, new words need to be detected as early as possible. This paper presents an approach to detecting new words in a large-scale corpus. The corpus, collected from the Internet, is first segmented with ICTCLAS and searched for repeated strings. Different filtering mechanisms, which draw on rich features of various new word patterns, then separate the true new words from the garbage strings. To remove the garbage strings, the system uses three garbage lexicons and a suffix lexicon, all of which it learns itself, and achieves good results. Finally, the experimental results are discussed and appear promising.
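The pipeline outlined in the abstract (segment, find repeated strings, filter out garbage strings) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the corpus has already been segmented (ICTCLAS is not called here), and the function names `repeated_strings` and `filter_candidates`, the single `garbage_tokens` set, the thresholds, and the toy corpus are all hypothetical stand-ins. The paper's three garbage lexicons and suffix lexicon are learned by the system and are not reproduced.

```python
from collections import Counter


def repeated_strings(sentences, max_len=3, min_count=2):
    """Count contiguous runs of 2..max_len tokens; keep runs seen at least min_count times."""
    counts = Counter()
    for tokens in sentences:
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def filter_candidates(candidates, garbage_tokens):
    """Drop any candidate containing a token from the (hypothetical) garbage set."""
    return {
        "".join(gram): c
        for gram, c in candidates.items()
        if not any(tok in garbage_tokens for tok in gram)
    }


# Toy, already-segmented input (hypothetical data; a real run would use segmenter output).
corpus = [
    ["我", "喜欢", "博", "客"],
    ["博", "客", "的", "作者"],
    ["我", "的", "博", "客"],
]

candidates = repeated_strings(corpus)
new_words = filter_candidates(candidates, garbage_tokens={"的", "我"})
print(new_words)  # {'博客': 3}
```

In this toy run the segmenter has split an out-of-vocabulary word into single characters, and the repeated adjacent pair is recovered as one candidate. A realistic filter would need far richer evidence than a single stop list, which is the role the abstract assigns to the learned lexicons and new word pattern features.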