    Cong Peizhuang, Wang Feiyu, Wang Guo’an, Wang Yanshu, Zheng Ce, Yang Tong. History, Present, and Future of Low-Bit Quantization for Large Language Model: A Case Study on Complex-Domain Quantization[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202550788

    History, Present, and Future of Low-Bit Quantization for Large Language Model: A Case Study on Complex-Domain Quantization

    With the exponential growth in the parameter scale of large language models (LLMs), model deployment and inference face severe challenges in terms of memory and computational resources. Quantization, a core compression technique, significantly reduces storage requirements and computational overhead by lowering the numerical precision of weights and activations. We first review the development of quantization techniques, from classic Int8/Int4 methods to cutting-edge extremely low-bit algorithms. We then summarize the technical characteristics and performance evolution of representative methods, identifying a key challenge: traditional real-domain quantization is limited by discretization error at extremely low bit widths, making it difficult to break through its performance ceiling. To address this limitation, we systematically review a line of work on complex-domain quantization. This line of work introduces a quantization paradigm based on the complex domain, which significantly expands the model’s expressive space by using amplitude and phase as two degrees of freedom in the parameter representation. Moreover, drawing on the classic signal-processing paradigm in which stable representations are obtained by applying the Fourier transform and low-pass filtering to time-domain signals, we further propose a technical path that proceeds from real-valued models through complex-domain transformation and complex-domain quantization, ultimately achieving multiplier-free stable inference. Experimental results demonstrate that iFairy outperforms existing extremely low-bit methods on multiple benchmark datasets and effectively breaks the real-domain performance ceiling, showcasing the potential of complex-domain quantization for achieving both efficient modeling and performance preservation. Through a systematic analysis of quantization’s evolution and this case study on complex-domain quantization, we reveal development patterns and future trends, providing a reference for both theoretical research and engineering practice on efficient LLMs.
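The abstract’s two key ideas, quantizing a complex weight by its phase and performing inference without multipliers, can be illustrated with a minimal sketch. It assumes one natural 2-bit complex codebook, the fourth roots of unity {1, i, −1, −i}, consistent with the multiplier-free claim; the function names and scaling scheme here are illustrative assumptions, not the authors’ exact algorithm.

```python
import numpy as np

# Hypothetical 2-bit complex codebook: the fourth roots of unity.
CODEBOOK = np.array([1, 1j, -1, -1j], dtype=np.complex64)

def quantize_phase(w: np.ndarray) -> np.ndarray:
    """Snap each complex weight to the nearest codeword by phase.

    Magnitude is discarded: phase alone selects one of the four
    quadrant centers 0, pi/2, pi, 3*pi/2 (a 2-bit code per weight).
    """
    idx = np.round(np.angle(w) / (np.pi / 2)).astype(int) % 4
    return CODEBOOK[idx]

def mulfree_mul(q: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Elementwise product q * x without a general multiplier.

    Multiplying by {1, i, -1, -i} only negates and/or swaps the real
    and imaginary parts, which is why hardware needs no multiplier.
    (The `1j *` below merely reassembles a complex array in NumPy.)
    """
    out = np.empty_like(x)
    for c, f in [(1,   lambda v: v),
                 (-1,  lambda v: -v),
                 (1j,  lambda v: -v.imag + 1j * v.real),
                 (-1j, lambda v: v.imag - 1j * v.real)]:
        mask = q == c
        out[mask] = f(x[mask])
    return out

# Example: phases near 0 and pi/2 map to 1 and i respectively,
# and the shift-and-negate product matches an ordinary multiply.
w = np.array([0.9 + 0.1j, -0.2 + 0.8j], dtype=np.complex64)
q = quantize_phase(w)
x = np.full(4, 2 + 3j, dtype=np.complex64)
y = mulfree_mul(CODEBOOK, x)
```

The sketch also hints at why the complex codebook enlarges the expressive space relative to a real 2-bit code at the same storage cost: the four codewords are spread over two real dimensions (amplitude and phase) rather than one.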