    Li Dongwen, Zhong Zhenyu, Sun Yufei, Shen Junyu, Ma Zizhi, Yu Chuanyue, Zhang Yuzhi. LingLong: A High-Quality Small-Scale Chinese Pre-trained Language Model[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202330844

    LingLong: A High-Quality Small-Scale Chinese Pre-trained Language Model

    In recent years, large-scale autoregressive Chinese pre-trained language models (PLMs) have demonstrated outstanding performance on various natural language processing (NLP) tasks. However, these models are computationally expensive, and their word-based vocabularies pose significant challenges for practical applications. In addition, most of them use only unidirectional context information, which may degrade performance on many tasks, especially those requiring a nuanced understanding of context. To address these challenges, we introduce LingLong, a high-quality small-scale Chinese pre-trained language model. LingLong stands out for its modest scale: comprising only 317 million parameters, it is highly deployable and resource-efficient. We tokenize the training corpus with a character-based vocabulary to mitigate the negative impacts of unknown tokens and word segmentation errors. Moreover, we go beyond conventional unidirectional context by introducing a novel backward model, trained by reversing the input order of the training data. Combining LingLong and its backward version allows the use of bidirectional information on downstream tasks. Extensive experimental results validate the effectiveness of LingLong across a diverse set of NLP tasks. LingLong outperforms similar-sized Chinese PLMs on six downstream tasks and surpasses popular large-scale Chinese PLMs on four downstream tasks. These findings underscore the versatility and efficiency of LingLong, opening up possibilities for practical applications and advancements in the Chinese natural language processing field.
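The two core data-side ideas in the abstract, character-based tokenization and reversed input order for the backward model, can be illustrated with a minimal sketch. This is a hypothetical toy example under assumed names (`char_tokenize`, `backward_example`, and the toy vocabulary are illustrative, not the paper's actual implementation):

```python
# Hypothetical sketch of two ideas from the abstract: character-level
# tokenization and reversing token order to train a backward language model.

def char_tokenize(text, vocab):
    # Character-based vocabulary: each Chinese character maps to one token,
    # avoiding word-segmentation errors and most unknown tokens.
    return [vocab.get(ch, vocab["<unk>"]) for ch in text]

def backward_example(token_ids):
    # The backward model is trained on the same corpus with the input order
    # reversed, so it effectively predicts each token from its right context.
    return list(reversed(token_ids))

# Toy vocabulary for illustration only.
vocab = {"<unk>": 0, "玲": 1, "珑": 2, "模": 3, "型": 4}
forward = char_tokenize("玲珑模型", vocab)   # [1, 2, 3, 4]
backward = backward_example(forward)         # [4, 3, 2, 1]
```

At inference time, combining the forward model's and the backward model's predictions gives a downstream task access to both left and right context, which a single autoregressive model lacks.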