TR2026-044

TTQ: ACTIVATION-AWARE TEST-TIME QUANTIZATION TO ACCELERATE LLM INFERENCE ON THE FLY


    •  Koike-Akino, T., Liu, J., Wang, Y., "TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference on the Fly", International Conference on Learning Representations (ICLR) Workshop, April 2026.
      BibTeX:
      @inproceedings{Koike-Akino2026apr,
        author = {Koike-Akino, Toshiaki and Liu, Jing and Wang, Ye},
        title = {{TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference on the Fly}},
        booktitle = {International Conference on Learning Representations (ICLR) Workshop},
        year = 2026,
        month = apr,
        url = {https://www.merl.com/publications/TR2026-044}
      }
  • Research Areas: Artificial Intelligence, Machine Learning

Abstract:

To tackle the huge computational demand of large foundation models, activation-aware compression techniques that require no retraining have been introduced. However, because these methods rely heavily on calibration data, domain shift issues may arise for unseen downstream tasks. To resolve this issue, we propose a test-time quantization (TTQ) framework that compresses large models on the fly at inference time. With efficient online calibration, instant activation-aware quantization can adapt to every prompt regardless of the downstream task while still achieving inference speedup. Several experiments demonstrate that TTQ can improve quantization performance over state-of-the-art baselines.
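For readers unfamiliar with the general idea, the following is a minimal sketch of activation-aware weight quantization calibrated on the current prompt alone. It is not the TTQ algorithm itself, whose details the abstract does not give. The function names `quantize_rtn` and `activation_aware_quantize`, the scaling exponent `alpha`, the group size, and the per-channel scaling rule (borrowed from AWQ-style methods) are all assumptions made for illustration.

```python
import numpy as np

def quantize_rtn(w, bits=4, group_size=32):
    """Simulated symmetric round-to-nearest quantization (hypothetical helper).

    Weights are quantized per group along the input dimension and returned
    in dequantized floating-point form.
    """
    out_dim, in_dim = w.shape
    assert in_dim % group_size == 0, "in_dim must be a multiple of group_size"
    groups = w.reshape(out_dim, in_dim // group_size, group_size)
    qmax = 2 ** (bits - 1) - 1
    # One scale per group, chosen so the largest magnitude maps to qmax.
    scale = np.abs(groups).max(axis=-1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero groups
    q = np.clip(np.round(groups / scale), -qmax - 1, qmax)
    return (q * scale).reshape(out_dim, in_dim)

def activation_aware_quantize(w, x, bits=4, group_size=32, alpha=0.5, eps=1e-8):
    """Quantize w using statistics of the prompt activations x (assumed rule).

    x has shape (tokens, in_dim). Input channels with larger mean activation
    magnitude are scaled up before rounding, so they keep more precision.
    """
    s = np.maximum(np.abs(x).mean(axis=0), eps) ** alpha  # per-input-channel scale
    return quantize_rtn(w * s, bits, group_size) / s

# Usage: calibrate on the activations of a single (random, illustrative) prompt.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 64))    # linear layer weights (out_dim, in_dim)
x = rng.standard_normal((16, 64))   # activations for 16 prompt tokens
w_hat = activation_aware_quantize(w, x)
y_hat = x @ w_hat.T                 # layer output with quantized weights
```

Because the calibration statistics come from `x` itself, no offline calibration set is involved in this sketch. Setting `alpha=0.0` disables the activation-dependent scaling and reduces it to plain round-to-nearest quantization.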

 

  • Related Publication

  •  Koike-Akino, T., Liu, J., Wang, Y., "TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference On The Fly", arXiv, March 2026.
    BibTeX:
    @article{Koike-Akino2026mar,
      author = {Koike-Akino, Toshiaki and Liu, Jing and Wang, Ye},
      title = {{TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference On The Fly}},
      journal = {arXiv},
      year = 2026,
      month = mar,
      url = {https://arxiv.org/abs/2603.19296}
    }