TR2026-096

MIRROR: Multisensory Implicit Rejection-sampled RObotic policy

- Bhaskar, A., Tokekar, P., Di Cairano, S., Schperberg, A., "MIRROR: Multisensory Implicit Rejection-sampled RObotic policy", ICML 2026 Workshop on Structured Probabilistic Inference & Generative Modeling, July 2026.
  @inproceedings{Bhaskar2026jul,
    author = {Bhaskar, Amisha and Tokekar, Pratap and {Di Cairano}, Stefano and Schperberg, Alexander},
    title = {{MIRROR: Multisensory Implicit Rejection-sampled RObotic policy}},
    booktitle = {ICML 2026 Workshop on Structured Probabilistic Inference \& Generative Modeling},
    year = 2026,
    month = jul,
    url = {https://www.merl.com/publications/TR2026-096}
  }
MERL Contacts:
- Stefano Di Cairano
- Alexander Schperberg
Research Areas:

Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Robotics

Abstract:

Robotic imitation learning typically requires models that capture multimodal action distributions while operating at real-time control rates and accommodating multiple sensing modalities. Although recent generative approaches such as diffusion models, flow matching, and Implicit Maximum Likelihood Estimation (IMLE) have achieved promising results, they often satisfy only a subset of these requirements. To address this, we introduce MIRROR, a single-pass policy based on a batch-global rejection-sampling variant of IMLE. MIRROR couples a temporal multisensory encoder (integrating RGB, depth, tactile, audio, and proprioception) with a linear-attention generator using a Performer architecture. We demonstrate the efficacy of MIRROR on a diverse real-world hardware suite, including loco-manipulation using a Unitree GO2 with a 7-DoF D1 arm and tabletop manipulation with a UR5 manipulator. Across challenging physical tasks such as pre-manipulation parking, high-precision insertion, and multi-object pick-and-place, MIRROR outperforms state-of-the-art diffusion policies by 10–25% in success rate while maintaining high-frequency (30–50 Hz) closed-loop control. We further validate our approach on large-scale simulation benchmarks, including CALVIN, MetaWorld, and Robomimic. In CALVIN (10% data split), MIRROR improves success rates by ∼25% over diffusion and ∼20% over flow matching, while simultaneously reducing trajectory jerk by 20×–50×. These results position MIRROR as a fast, accurate, and multisensory imitation policy that retains multimodal action coverage without the latency of iterative sampling.
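
For readers unfamiliar with IMLE, the following is a minimal sketch of the generic nearest-sample objective with an optional rejection step. It is not the authors' implementation: the abstract does not specify how the batch-global rejection works, so the threshold `epsilon`, the function names `sq_dist` and `imle_loss`, and the stand-in `toy_generator` are all assumptions made for illustration. The idea shown is that, for each demonstrated action, several candidate actions are generated in one pass, candidates already closer than `epsilon` to the target are discarded, and the loss is the squared distance from the target to the nearest remaining candidate.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def imle_loss(target, candidates, epsilon=0.0):
    """Nearest-sample loss for one target action (illustrative only).

    Candidates whose distance to the target is below `epsilon` are rejected
    (an assumed rejection rule, not taken from the paper). Returns
    (loss, index_of_selected_candidate); (0.0, None) if all are rejected.
    """
    eps_sq = epsilon ** 2
    kept = [
        (sq_dist(target, c), i)
        for i, c in enumerate(candidates)
        if sq_dist(target, c) >= eps_sq
    ]
    if not kept:
        return 0.0, None
    return min(kept)  # smallest distance wins; ties broken by lower index


def toy_generator(obs, z):
    """Hypothetical stand-in for a learned generator: shifts obs by latent z."""
    return [o + zi for o, zi in zip(obs, z)]


# Usage: draw several latent codes, generate candidates in a single pass,
# and score them against one demonstrated action.
rng = random.Random(0)
obs = [0.5, -0.2]
demo_action = [0.6, 0.0]
latents = [[rng.gauss(0.0, 1.0) for _ in obs] for _ in range(8)]
candidates = [toy_generator(obs, z) for z in latents]
loss, idx = imle_loss(demo_action, candidates, epsilon=0.05)
```

In a real training loop the selected candidate's distance would be backpropagated through the generator; here the loss is only computed, since the sketch uses plain Python lists rather than an autodiff library.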

Related News & Events

NEWS MERL Presents 4 Main Conference Papers and 6 Workshop Papers at ICML 2026
Date: July 6, 2026 - July 11, 2026
Where: COEX, Seoul, South Korea
MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Stefano Di Cairano; Toshiaki Koike-Akino; Christopher R. Laughman; Jing Liu; Suhas Lohit; Kuan-Chuan Peng; Alexander Schperberg; Ye Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Signal Processing
Brief
- MERL researchers are proud to present 4 main conference papers and 6 workshop papers at ICML 2026. ICML, taking place from July 6-11 in Seoul, South Korea, is a premier international conference in machine learning.
  
  Main Conference Papers with MERL Authors:
  
  1. Understanding Dynamic Compute Allocation in Recurrent Transformers by Ibraheem Muhammad Moosa, Suhas Lohit, Ye Wang, Moitreya Chatterjee, and Wenpeng Yin.
  
  2. LLawCo: Learning Laws of Cooperation for Modeling Embodied Multi-Agent Behavior by Qinhong Zhou, Chuang Gan, and Anoop Cherian.
  
  3. Memory-Distilled Selection for Noise-Robust Anomaly Detection by Sirojbek Safarov, Jaewoo Park, Yoon G. Jung, Kuan-Chuan Peng, Wonchul Kim, Seongdeok Bang, and Octavia Camps.
  
  4. Partial Ring Scan: Revisiting Scan Order in Vision State Space Models by Yi-Kuan Hsieh, Kuan-Chuan Peng, Xin Li, Ming-Ching Chang, Yu-Chee Tseng, and Jun-Wei Hsieh.
  
  Workshop Papers with MERL Authors:
  
  1. WISE: Weighted Iterative Society-of-Experts for Multimodal Multi-Agent Debate with Probabilistic Consensus by Anoop Cherian, Suhas Lohit, and Kuan-Chuan Peng. (Workshop on Scalable Learning and Optimization for Efficient Multimodal AI Agents (SCALE))
  
  2. MIRROR: Multisensory Implicit Rejection-sampled RObotic policy by Amisha Bhaskar, Pratap Tokekar, Stefano Di Cairano, and Alexander Schperberg. (Workshop on Structured Probabilistic Inference & Generative Modeling)
  
  3. Reinforced Neural Processes: Memory-Efficient Time-Series Forecasting with a World-Feedback-Trained Memory Policy by Nibraas Khan, Gordon Wichern, and Christopher R. Laughman. (Workshop on Reinforcement Learning from World Feedback (RLxF))
  
  4. Connecting Low-Rank Adapters and Policy Stability in GRPO Fine-Tuning by Antonin Rottman, Francesco Tonin, Yongtao Wu, Toshiaki Koike-Akino, and Volkan Cevher. (Workshop on Connecting Low-rank Representations in AI (CoLorAI))
  
  5. EinSort: Sorting is All We Need for Tensorizing LLM by Toshiaki Koike-Akino, Jing Liu, and Ye Wang. (Workshop on Connecting Low-rank Representations in AI (CoLorAI))
  
  6. Temper and Tilt Lead to SLOP: Reward Hacking Mitigation with Inference-Time Alignment by Ye Wang, Jing Liu, and Toshiaki Koike-Akino. (Workshop on Agents in the Wild: Safety, Security, and Beyond)
