TR2025-136

In-Context Iterative Policy Improvement for Dynamic Manipulation

- Van der Merwe, M., Jha, D.K., "In-Context Iterative Policy Improvement for Dynamic Manipulation", Conference on Robot Learning (CoRL), September 2025.
  @inproceedings{VanderMerwe2025sep,
  author = {Van der Merwe, Mark and Jha, Devesh K.},
  title = {{In-Context Iterative Policy Improvement for Dynamic Manipulation}},
  booktitle = {Conference on Robot Learning (CoRL)},
  year = 2025,
  month = sep,
  url = {https://www.merl.com/publications/TR2025-136}
  }
Research Areas:

Artificial Intelligence, Robotics

Abstract:

Attention-based architectures trained on internet-scale language data have demonstrated state-of-the-art reasoning ability for various language-based tasks, such as logic problems and textual reasoning. Additionally, these Large Language Models (LLMs) have exhibited the ability to perform few-shot prediction via in-context learning, in which input-output examples provided in the prompt are generalized to new inputs. This ability furthermore extends beyond standard language tasks, enabling few-shot learning for general patterns. In this work, we consider the application of in-context learning with pre-trained language models for dynamic manipulation. Dynamic manipulation introduces several crucial challenges, including increased dimensionality, complex dynamics, and partial observability. To address these challenges, we take an iterative approach and formulate our in-context learning problem to predict adjustments to a parametric policy based on previous interactions. We show across several tasks in simulation and on a physical robot that utilizing in-context learning outperforms alternative methods in the low-data regime. A video summary of this work and experiments can be found here.
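The iterative formulation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt layout, the names build_prompt and improve_policy, and the toy rollout and predictor are all hypothetical, and the abstract does not specify how interactions are encoded. The sketch only shows the general loop, in which previous interactions are serialized as in-context examples and a predictor (an LLM in the paper, a simple stand-in here) proposes the next adjustment to the policy parameters.

```python
from typing import Callable, List, Sequence, Tuple

# One previous interaction (hypothetical encoding):
# (policy parameters tried, observed task error, adjustment applied afterwards)
Interaction = Tuple[List[float], List[float], List[float]]


def fmt(values: Sequence[float]) -> str:
    """Serialize a vector as space-separated numbers with 3 decimals."""
    return " ".join(f"{v:.3f}" for v in values)


def build_prompt(history: Sequence[Interaction],
                 params: Sequence[float],
                 error: Sequence[float]) -> str:
    """Lay out past interactions as examples, then an open query line."""
    lines = [
        f"params: {fmt(p)} | error: {fmt(e)} | adjustment: {fmt(d)}"
        for p, e, d in history
    ]
    lines.append(f"params: {fmt(params)} | error: {fmt(error)} | adjustment:")
    return "\n".join(lines)


def improve_policy(params: Sequence[float],
                   rollout: Callable[[Sequence[float]], List[float]],
                   predict_adjustment: Callable[[str], List[float]],
                   iterations: int) -> List[float]:
    """Repeatedly execute the policy, query the predictor, and adjust."""
    current = list(params)
    history: List[Interaction] = []
    for _ in range(iterations):
        error = rollout(current)                 # run the policy once
        prompt = build_prompt(history, current, error)
        delta = predict_adjustment(prompt)       # an LLM call in practice
        history.append((list(current), list(error), list(delta)))
        current = [p + d for p, d in zip(current, delta)]
    return current


# Toy stand-ins (assumptions for illustration only).
TARGET = [1.0, 2.0]


def toy_rollout(params: Sequence[float]) -> List[float]:
    """Pretend task: the error is the offset from a fixed target."""
    return [p - t for p, t in zip(params, TARGET)]


def toy_predictor(prompt: str) -> List[float]:
    """Stand-in for the language model: step against the latest error."""
    query = prompt.splitlines()[-1]
    error_text = query.split(" | ")[1].split(": ", 1)[1]
    return [-0.5 * float(tok) for tok in error_text.split()]


if __name__ == "__main__":
    print(improve_policy([0.0, 0.0], toy_rollout, toy_predictor, iterations=3))
```

In a real system the toy predictor would be replaced by a call to a pre-trained language model, and the rollout by an execution on the simulator or robot; how the model output is parsed back into numbers is likewise left open here.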

Related Publications

Van der Merwe, M., Jha, D.K., "In-Context Policy Iteration for Dynamic Manipulation", Advances in Neural Information Processing Systems (NeurIPS) Workshop on Embodied World Models for Decision Making, December 2025.


@inproceedings{VanderMerwe2025dec,
author = {Van der Merwe, Mark and Jha, Devesh K.},
title = {{In-Context Policy Iteration for Dynamic Manipulation}},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS) Workshop on Embodied World Models for Decision Making},
year = 2025,
month = dec,
url = {https://www.merl.com/publications/TR2025-163}
}

Van der Merwe, M., Jha, D.K., "In-Context Iterative Policy Improvement for Dynamic Manipulation", arXiv, August 2025.


@article{VanderMerwe2025aug,
author = {Van der Merwe, Mark and Jha, Devesh K.},
title = {{In-Context Iterative Policy Improvement for Dynamic Manipulation}},
journal = {arXiv},
year = 2025,
month = aug,
url = {https://arxiv.org/abs/2508.15021}
}
