TR2025-111
AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent
"AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent", International Conference on Machine Learning (ICML) workshop, July 2025.
@inproceedings{Liu2025jul,
  author = {Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mansour, Hassan and Brand, Matthew},
  title = {{AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent}},
  booktitle = {International Conference on Machine Learning (ICML) workshop},
  year = 2025,
  month = jul,
  url = {https://www.merl.com/publications/TR2025-111}
}
Abstract:
To address the enormous size of Large Language Models (LLMs), model compression methods, such as quantization and pruning, are often deployed, especially on edge devices. In this work, we focus on layer-wise post-training quantization and pruning. Drawing connections between activation-aware weight pruning and sparse approximation problems, and motivated by the success of Iterative Hard Thresholding (IHT), we propose a unified method for Activation-aware Weight pruning and quantization via Projected gradient descent (AWP). Our experiments demonstrate that AWP outperforms state-of-the-art LLM pruning and quantization methods. Theoretical convergence guarantees of the proposed method for pruning are also provided.
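The recipe named in the abstract (layer-wise activation-aware pruning cast as a sparse approximation problem and solved by projected gradient descent with a hard-thresholding projection, i.e. IHT) can be illustrated with a short sketch. The snippet below is a minimal NumPy illustration of that general template, assuming the layer-wise objective min over W_hat of ||X W - X W_hat||_F^2 subject to a sparsity budget; the function name, step size, and iteration count are illustrative choices, not the paper's exact AWP algorithm, and the quantization part of AWP is omitted.

```python
import numpy as np


def awp_prune_sketch(W, X, sparsity=0.5, iters=100):
    """Illustrative IHT-style projected gradient descent for activation-aware
    layer-wise pruning (a sketch, not the paper's exact AWP algorithm).

    Approximately minimizes ||X W - X W_hat||_F^2 over W_hat while keeping
    only a (1 - sparsity) fraction of the entries nonzero.
    """
    H = X.T @ X                                # activation Gram matrix
    lr = 1.0 / np.linalg.norm(H, 2)            # step size from the spectral norm of H
    k = int(round((1.0 - sparsity) * W.size))  # number of weights to keep

    W_hat = W.copy()
    for _ in range(iters):
        grad = H @ (W_hat - W)                 # gradient of the reconstruction loss (up to a factor of 2)
        W_hat = W_hat - lr * grad              # gradient step
        if k < W_hat.size:                     # projection: hard-threshold to the k largest magnitudes
            thresh = np.partition(np.abs(W_hat).ravel(), -k)[-k]
            W_hat[np.abs(W_hat) < thresh] = 0.0
    return W_hat


# Toy usage with random calibration activations and a random dense layer.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 64))                 # calibration activations (samples x in_features)
W = rng.standard_normal((64, 128))                 # dense weights (in_features x out_features)
W_pruned = awp_prune_sketch(W, X, sparsity=0.5)
print("fraction pruned:", np.mean(W_pruned == 0))  # roughly 0.5
```

The hard-thresholding step plays the role of the projection onto the sparsity constraint set, which is the IHT connection the abstract refers to; the paper's AWP builds on this layer-wise template and additionally covers quantization.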
Related Publication (arXiv preprint):
@article{Liu2025jun,
  author = {Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mansour, Hassan and Brand, Matthew},
  title = {{AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent}},
  journal = {arXiv},
  year = 2025,
  month = jun,
  url = {https://arxiv.org/abs/2506.10205}
}