TR2025-075
LatentLLM: Attention-Aware Joint Tensor Compression
"LatentLLM: Attention-Aware Joint Tensor Compression", IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, June 2025.
@inproceedings{Koike-Akino2025jun,
  author = {Koike-Akino, Toshiaki and Chen, Xiangyu and Liu, Jing and Wang, Ye and Wang, Pu and Brand, Matthew},
  title = {{LatentLLM: Attention-Aware Joint Tensor Compression}},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop},
  year = 2025,
  month = jun,
  url = {https://www.merl.com/publications/TR2025-075}
}
Abstract:
We propose a new framework for converting a large foundation model, such as a large language model (LLM) or large multimodal model (LMM), into a reduced-dimension latent structure. Our method uses a global attention-aware joint tensor decomposition to significantly improve model efficiency. We demonstrate the benefit on several benchmarks, including multimodal reasoning tasks.
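
The abstract does not spell out the decomposition itself, but the general recipe behind such latent compression is to replace each large weight matrix with a low-rank factor pair chosen to preserve the layer's behavior on calibration activations, rather than to approximate the raw weight entries. The Python sketch below illustrates that general idea only; the function name, the Cholesky whitening step, and the ridge term are our own illustrative assumptions, not the authors' algorithm (see the arXiv preprint for the actual method).

import numpy as np

def activation_aware_lowrank(W, X, rank):
    """Illustrative sketch: factor W (d_out x d_in) into A @ B of the
    given rank, weighting the reconstruction error by calibration
    activations X (n x d_in). Not the paper's algorithm."""
    # Whitening factor: L @ L.T = X.T @ X / n (small ridge for stability).
    L = np.linalg.cholesky(X.T @ X / len(X) + 1e-6 * np.eye(X.shape[1]))
    # SVD in the whitened domain, so truncation error is measured on the
    # calibration activations instead of on the raw weight entries.
    U, s, Vt = np.linalg.svd(W @ L, full_matrices=False)
    A = U[:, :rank] * s[:rank]               # d_out x rank
    B = np.linalg.solve(L.T, Vt[:rank].T).T  # rank x d_in, i.e. Vt_r @ inv(L)
    return A, B

# Usage: replace one linear layer's weight with two smaller factors.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))   # original weight
X = rng.standard_normal((1024, 512))  # calibration activations
A, B = activation_aware_lowrank(W, X, rank=64)
print(np.linalg.norm(X @ W.T - X @ (A @ B).T) / np.linalg.norm(X @ W.T))

A joint decomposition in the paper's sense would couple such factorizations across the attention tensors of the whole model; the per-matrix version above is only the simplest instance of the idea.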
Related Publication
@article{Koike-Akino2025may,
  author = {Koike-Akino, Toshiaki and Chen, Xiangyu and Liu, Jing and Wang, Ye and Wang, Pu and Brand, Matthew},
  title = {{LatentLLM: Attention-Aware Joint Tensor Compression}},
  journal = {arXiv},
  year = 2025,
  month = may,
  url = {https://www.arxiv.org/abs/2505.18413}
}