News & Events

150 MERL Talks found.

Learn about the MERL Seminar Series.

TALK [MERL Seminar Series 2026] Jialong Wu presents talk titled World Models and Human-like Reasoning
Date & Time: Wednesday, March 25, 2026; 11:00 AM
Speaker: Jialong Wu, Tsinghua University
MERL Host: Anoop Cherian
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Abstract
- This talk introduces the background and key findings of our recent work, "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models," which answers the question of when and how visual generation enabled by unified multimodal models (UMMs) benefits reasoning. We take a world model perspective, inspired by human cognition. Specifically, humans construct mental models of the world, representing information and knowledge through two complementary channels—verbal and visual—to support reasoning, planning, and decision-making. In contrast, recent advances in large language models (LLMs) and vision–language models (VLMs) largely rely on verbal chain-of-thought reasoning, leveraging primarily symbolic and linguistic world knowledge. Unified multimodal models (UMMs) open a new paradigm by using visual generation for visual world modeling, advancing more human-like reasoning on tasks grounded in the physical world. In this work, we formalize the atomic capabilities of world models and world model-based chain-of-thought reasoning. We highlight the richer informativeness and complementary prior knowledge afforded by visual world modeling, leading to our visual superiority hypothesis for tasks grounded in the physical world. We identify and design tasks that necessitate interleaved visual-verbal CoT reasoning, constructing a new evaluation suite, VisWorld-Eval. Through controlled experiments on BAGEL, we show that interleaved CoT significantly outperforms purely verbal CoT on tasks that favor visual world modeling, strongly supporting our insights.
TALK [MERL Seminar Series 2026] Alex Gu presents talk titled Proving and Improving: Language Models for Theorem Proving and Proof Shortening in Lean
Date & Time: Wednesday, February 11, 2026; 1:00 PM
Speaker: Alex Gu, MIT
MERL Host: Pu (Perry) Wang
Research Areas: Artificial Intelligence, Machine Learning, Optimization
Abstract
- Large language models (LLMs) have made steady progress in formal mathematics, achieving near–International Mathematical Olympiad (IMO) performance. This talk presents two complementary advances toward more capable and interpretable formal proving systems. First, we introduce LeanDojo, a foundational open-source toolkit bridging ML and Lean, enabling large-scale data extraction, interactive training, and the development of ReProver, a retrieval-augmented Lean prover. Next, we turn to a critical challenge: proofs produced by LLMs are often unnecessarily long, redundant, and opaque. To mitigate this, we introduce ProofOptimizer, a system that automatically simplifies Lean proofs while preserving correctness. It combines symbolic linting, a fine-tuned 7B model, and iterative refinement, reducing proof length by up to 87% on MiniF2F and 57% on PutnamBench, even halving some IMO-level proofs. Together, these systems demonstrate how AI can make automated proofs not only possible, but also increasingly comprehensible.
TALK [MERL Seminar Series 2026] Zac Manchester presents talk titled Is locomotion really that hard… and other musings on the virtues of simplicity
Date & Time: Tuesday, January 20, 2026; 12:00 PM
Speaker: Zac Manchester, MIT
MERL Host: Pedro Miraldo
Research Areas: Computer Vision, Control, Optimization, Robotics
Abstract
- For decades, legged locomotion was a challenging research topic in robotics. In the last few years, however, both model-based and reinforcement-learning approaches have not only demonstrated impressive performance in laboratory settings, but are now regularly deployed "in the wild." One surprising feature of these successful controllers is how simple they can be. Meanwhile, Art Bryson’s timeless advice to control engineers, “Be wise – linearize,” seems to be increasingly falling out of fashion and at risk of being forgotten by the next generation of practitioners. This talk will discuss several recent works from my group that try to push the limits of how simple locomotion (and, possibly, manipulation) controllers for general-purpose robots can be from several different viewpoints, while also making connections to state-of-the-art generative AI methods like diffusion policies.
TALK [MERL Seminar Series 2026] Laixi Shi presents talk titled Robust Decision Making Without Compromising Learning Efficiency
Date & Time: Wednesday, January 14, 2026; 1:00 PM
Speaker: Laixi Shi, Johns Hopkins University
MERL Host: Dehong Liu
Research Areas: Artificial Intelligence, Control, Machine Learning
Abstract
- Decision-making artificial intelligence (AI) has revolutionized human life in areas ranging from healthcare and daily life to scientific discovery. However, current AI systems often lack reliability and are highly vulnerable to small changes in complex, interactive, and dynamic environments. My research focuses on achieving both reliability and learning efficiency simultaneously when building AI solutions. These two goals seem conflicting, as enhancing robustness against variability often leads to more complex problems that require more data and computational resources, at the cost of learning efficiency. But does it have to?
  
  In this talk, I overview my work on building reliable decision-making AI without sacrificing learning efficiency, offering insights into effective optimization problem design for reliable AI. To begin, I will focus on reinforcement learning (RL), a key framework for sequential decision-making, and demonstrate how distributional robustness can be achieved provably without paying a statistical premium (additional training data cost) compared to non-robust counterparts. Next, shifting to decision-making in strategic multi-agent systems, I will demonstrate that incorporating realistic risk preferences, a key feature of human decision-making, enables computational tractability, a benefit not present in traditional models. Finally, I will present a vision for building reliable, learning-efficient AI solutions for human-centered applications, through agentic and multi-agentic AI systems.
TALK [MERL Seminar Series 2025] Behçet Açıkmeşe presents talk titled Robust Trajectory Planning and Control
Date & Time: Wednesday, June 25, 2025; 12:00 PM
Speaker: Behçet Açıkmeşe, University of Washington
MERL Host: Avishai Weiss
Research Areas: Control, Dynamical Systems, Optimization
Abstract
- Next-generation aerospace systems – from asteroid-mining robots and spacecraft swarms to hypersonic vehicles and urban air mobility – demand autonomy that transcends current limits. These missions require spacecraft to operate safely, efficiently, and decisively in unpredictable environments, where every decision must balance performance, resource constraints, and risk. The core challenge lies in solving complex optimal control problems in real time while: i) Exploiting full system capabilities without violating safety limits, ii) Certifying algorithmic reliability for critical Guidance, Navigation, & Control (GN&C) systems, iii) Proving robustness in the presence of uncertainty. Our solution is optimization-based control. By transforming GN&C challenges into structured optimization problems and applying methods of convexification, we achieve provably robust, computationally tractable solutions.
TALK [MERL Seminar Series 2025] Andy Zou presents talk titled Red Teaming AI Agents in-the-wild: Revealing Deployment Vulnerabilities
Date & Time: Wednesday, March 26, 2025; 1:00 PM
Speaker: Andy Zou, CMU & Gray Swan AI
MERL Host: Ye Wang
Research Areas: Artificial Intelligence, Machine Learning, Information Security
Abstract
- This presentation demonstrates how red teaming uncovers critical vulnerabilities in AI agents that challenge assumptions about safe deployment. The talk discusses the risks of integrating AI into real-world applications and recommends practical safeguards to enhance resilience and ensure dependable deployment in high-risk settings.
TALK [MERL Seminar Series 2025] Dick den Hertog presents talk titled Optimizing the Path Towards Plastic-Free Oceans
Date & Time: Tuesday, March 11, 2025; 12:00 PM
Speaker: Dick den Hertog, University of Amsterdam
MERL Host: Arvind Raghunathan
Research Areas: Data Analytics, Optimization
Abstract
- Increasing ocean plastic pollution is irreversibly harming ecosystems and human economic activities. We partner with a nonprofit organization and use optimization to help clean up plastic from the oceans faster. Specifically, we optimize the route of their plastic collection system in the ocean to maximize the quantity of plastic collected over time. We formulate the problem as a longest path problem in a well-structured graph. However, because collection directly impacts future plastic density, the corresponding edge lengths are nonlinear polynomials. After analyzing the structural properties of the edge lengths, we propose a search-and-bound method, which leverages a relaxation of the problem solvable via dynamic programming and clustering, to efficiently find high-quality solutions (within 6% of optimal in practice) and develop a tailored branch-and-bound strategy to solve it to provable optimality. On one year of ocean data, our optimization-based routing approach increases the quantity of plastic collected by more than 60% compared with the current routing strategy, hence speeding up the progress toward plastic-free oceans.
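As a rough illustration of the longest-path view mentioned in this abstract, the sketch below runs a dynamic program over a time-layered graph. It is a simplified stand-in, not the speaker's search-and-bound method: the `gains` table, the `moves` adjacency map, and the assumption that edge lengths are fixed numbers (rather than nonlinear functions of earlier collection) are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


def longest_path(
    gains: List[List[float]], moves: Dict[int, List[int]]
) -> Tuple[float, List[int]]:
    """Longest path through a time-layered graph via dynamic programming.

    gains[t][j]: hypothetical plastic collected by being at position j at step t.
                 Treated as fixed here; the abstract's real edge lengths depend
                 on earlier collection, which this sketch ignores.
    moves[i]:    positions reachable from position i in one time step.
    """
    n_steps = len(gains)
    n_pos = len(gains[0])

    # best[j] = largest total gain of any route ending at position j so far.
    best = [gains[0][j] for j in range(n_pos)]
    parents: List[List[int]] = []

    for t in range(1, n_steps):
        new_best = [float("-inf")] * n_pos
        parent = [-1] * n_pos
        for i in range(n_pos):
            for j in moves[i]:
                candidate = best[i] + gains[t][j]
                if candidate > new_best[j]:
                    new_best[j] = candidate
                    parent[j] = i
        best = new_best
        parents.append(parent)

    # Walk the parent pointers backward from the best final position.
    end = max(range(n_pos), key=lambda j: best[j])
    path = [end]
    for parent in reversed(parents):
        path.append(parent[path[-1]])
    path.reverse()
    return best[end], path


# Three positions on a line; the system may stay put or move to a neighbor.
example_moves = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
example_gains = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 5.0],
]
total, route = longest_path(example_gains, example_moves)
print(total, route)
```

With fixed gains the dynamic program is exact. Once collecting plastic changes later gains, as the abstract describes, a recursion of this kind can only serve as a relaxation or bound.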
TALK [MERL Seminar Series 2025] Qing Qu presents talk titled The Emergence of Generalizability and Semantic Low-Dim Subspaces in Diffusion Models
Date & Time: Wednesday, March 5, 2025; 12:00 PM
Speaker: Qing Qu, University of Michigan
MERL Host: Pu (Perry) Wang
Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Signal Processing
Abstract
- Recent empirical studies have shown that diffusion models possess a unique reproducibility property, transitioning from memorization to generalization as the number of training samples increases. This demonstrates that diffusion models can effectively learn image distributions and generate new samples. Remarkably, these models achieve this even with a small number of training samples, despite the challenge of large image dimensions, effectively circumventing the curse of dimensionality. In this work, we provide theoretical insights into this phenomenon by leveraging two key empirical observations: (i) the low intrinsic dimensionality of image datasets and (ii) the low-rank property of the denoising autoencoder in trained diffusion models. With these setups, we rigorously demonstrate that optimizing the training loss of diffusion models is equivalent to solving the canonical subspace clustering problem across the training samples. This insight has practical implications for training and controlling diffusion models. Specifically, it enables us to precisely characterize the minimal number of samples necessary for accurately learning the low-rank data support, shedding light on the phase transition from memorization to generalization. Additionally, we empirically establish a correspondence between the subspaces and the semantic representations of image data, which enables one-step, transferable, efficient image editing. Moreover, our results have profound practical implications for training efficiency and model safety, and they also open up numerous intriguing theoretical questions for future research.
TALK [MERL Seminar Series 2025] Petar Veličković presents talk titled Amplifying Human Performance in Combinatorial Competitive Programming
Date & Time: Wednesday, February 26, 2025; 11:00 AM
Speaker: Petar Veličković, Google DeepMind
MERL Host: Anoop Cherian
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Abstract
- Recent years have seen a significant surge in complex AI systems for competitive programming, capable of performing at admirable levels against human competitors. While steady progress has been made, the highest percentiles still remain out of reach for these methods on standard competition platforms such as Codeforces. In this talk, I will describe and dive into our recent work, where we focussed on combinatorial competitive programming. In combinatorial challenges, the target is to find as-good-as-possible solutions to otherwise computationally intractable problems, over specific given inputs. We hypothesise that this scenario offers a unique testbed for human-AI synergy, as human programmers can write a backbone of a heuristic solution, after which AI can be used to optimise the scoring function used by the heuristic. We deploy our approach on previous iterations of Hash Code, a global team programming competition inspired by NP-hard software engineering problems at Google, and we leverage FunSearch to evolve our scoring functions. Our evolved solutions significantly improve the attained scores from their baseline, successfully breaking into the top percentile on all previous Hash Code online qualification rounds, and outperforming the top human teams on several. To the best of our knowledge, this is the first known AI-assisted top-tier result in competitive programming.
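The division of labor described in this abstract, a fixed human-written backbone with a swappable scoring function, can be pictured with a toy example. The sketch below is only an assumption-laden illustration: the knapsack-style problem, the `greedy_backbone` function, and both scoring functions are hypothetical, and no Hash Code task or FunSearch code is reproduced here.

```python
from typing import Callable, List, Tuple

Item = Tuple[int, int]  # (value, weight) in a hypothetical toy packing problem


def greedy_backbone(
    items: List[Item], capacity: int, score: Callable[[Item], float]
) -> int:
    """Fixed backbone: take items in descending score order while they fit."""
    total_value = 0
    remaining = capacity
    for value, weight in sorted(items, key=score, reverse=True):
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def score_by_value(item: Item) -> float:
    """Baseline scoring function: prefer the most valuable item."""
    return float(item[0])


def score_by_density(item: Item) -> float:
    """Alternative scoring function: prefer value per unit weight."""
    value, weight = item
    return value / weight


items = [(60, 10), (100, 20), (120, 30), (130, 50)]
capacity = 50

baseline = greedy_backbone(items, capacity, score_by_value)
improved = greedy_backbone(items, capacity, score_by_density)
print(baseline, improved)
```

Only the scoring function changes between the two runs and the backbone stays untouched. In the setting the abstract describes, that scoring function is the part that would be evolved automatically.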
TALK [MERL Seminar Series 2025] David Lindell presents talk titled Imaging Dynamic Scenes from Seconds to Picoseconds
Date & Time: Wednesday, January 29, 2025; 1:00 PM
Speaker: David Lindell, University of Toronto
MERL Host: Joshua Rapp
Research Areas: Computational Sensing, Computer Vision, Signal Processing
Abstract
- The observed timescales of the universe span from the exasecond scale (~1e18 seconds) down to the zeptosecond scale (~1e-21 seconds). While specialized imaging systems can capture narrow slices of this temporal spectrum in the ultra-fast regime (e.g., nanoseconds to picoseconds; 1e-9 to 1e-12 s), they cannot simultaneously capture both slow (> 1 second) and ultra-fast events (< 1 nanosecond). Further, ultra-fast imaging systems are conventionally limited to single-viewpoint capture, hindering 3D visualization at ultra-fast timescales. In this talk, I discuss (1) new computational algorithms that turn a single-photon detector into an "ultra-wideband" imaging system that captures events from seconds to picoseconds; and (2) a method for neural rendering using multi-viewpoint, ultra-fast videos captured using single-photon detectors. The latter approach enables rendering videos of propagating light from novel viewpoints, observation of viewpoint-dependent changes in light transport predicted by Einstein, recovery of material properties, and accurate 3D reconstruction from multiply scattered light. Finally, I discuss future directions in ultra-wideband imaging.
TALK [MERL Seminar Series 2024] Di Shi presents talk titled AI-assisted Power Grid Dispatch and Control: Optimization, Safety, and Real-world Demonstrations
Date & Time: Wednesday, November 20, 2024; 1:00 PM
Speaker: Di Shi, New Mexico State University
MERL Host: Hongbo Sun
Research Areas: Artificial Intelligence, Data Analytics, Optimization
Abstract
- This presentation delves into the challenges and advancements in optimizing power system operations through Grid Mind, an innovative, data-driven framework designed to enhance the integration of renewable energy sources. Utilizing advanced learning algorithms, Grid Mind excels in strategic resource allocation and control, significantly improving efficiency and reliability in power systems with high renewable energy penetration. The transformative potential of this AI-assisted technology is highlighted through real-world applications, demonstrating its effectiveness in addressing the complexities of modern power systems. In addition, critical safety considerations and practical deployment challenges are explored, emphasizing the need for robust, secure, and adaptable solutions. This talk also discusses the capabilities of Grid Mind as a distributed, learning-based system optimized for edge devices, marking a significant advancement toward sustainable, safe, and efficient power system operations in an era dominated by renewable energy.
TALK [MERL Seminar Series 2024] Samuel Clarke presents talk titled Audio for Object and Spatial Awareness
Date & Time: Wednesday, October 30, 2024; 1:00 PM
Speaker: Samuel Clarke, Stanford University
MERL Host: Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
Abstract
- Acoustic perception is invaluable to humans and robots in understanding objects and events in their environments. These sounds are dependent on properties of the source, the environment, and the receiver. Many humans possess remarkable intuition both to infer key properties of each of these three aspects from a sound and to form expectations of how these different aspects would affect the sound they hear. In order to equip robots and AI agents with similar if not stronger capabilities, our research has taken a two-fold path. First, we collect high-fidelity datasets in both controlled and uncontrolled environments which capture real sounds of objects and rooms. Second, we introduce differentiable physics-based models that can estimate acoustic properties of objects and rooms from minimal amounts of real audio data, then can predict new sounds from these objects and rooms under novel, “unseen” conditions.
TALK [MERL Seminar Series 2024] Zhaojian Li presents talk titled A Multi-Arm Robotic System for Robotic Apple Harvesting
Date & Time: Wednesday, October 2, 2024; 1:00 PM
Speaker: Zhaojian Li, Michigan State University
MERL Host: Yebin Wang
Research Areas: Artificial Intelligence, Computer Vision, Control, Robotics
Abstract
- Harvesting labor is the single largest cost in apple production in the U.S. Surging costs and a growing shortage of labor have forced the apple industry to seek automated harvesting solutions. Despite considerable progress in recent years, the existing robotic harvesting systems still fall short of performance expectations, lacking robustness and proving inefficient or overly complex for practical commercial deployment. In this talk, I will present the development and evaluation of a new dual-arm robotic apple harvesting system. This work is a result of a continuous collaboration between Michigan State University and the U.S. Department of Agriculture.
TALK [MERL Seminar Series 2024] Tom Griffiths presents talk titled Tools from cognitive science to understand the behavior of large language models
Date & Time: Wednesday, September 18, 2024; 1:00 PM
Speaker: Tom Griffiths, Princeton University
Research Areas: Artificial Intelligence, Data Analytics, Machine Learning, Human-Computer Interaction
Abstract
- Large language models have been found to have surprising capabilities, even what have been called “sparks of artificial general intelligence.” However, understanding these models involves some significant challenges: their internal structure is extremely complicated, their training data is often opaque, and getting access to the underlying mechanisms is becoming increasingly difficult. As a consequence, researchers often have to resort to studying these systems based on their behavior. This situation is, of course, one that cognitive scientists are very familiar with — human brains are complicated systems trained on opaque data and typically difficult to study mechanistically. In this talk I will summarize some of the tools of cognitive science that are useful for understanding the behavior of large language models. Specifically, I will talk about how thinking about different levels of analysis (and Bayesian inference) can help us understand some behaviors that don’t seem particularly intelligent, how tasks like similarity judgment can be used to probe internal representations, how axiom violations can reveal interesting mechanisms, and how associations can reveal biases in systems that have been trained to be unbiased.
TALK [MERL Seminar Series 2024] Chuchu Fan presents talk titled Neural Certificates and LLMs in Large-Scale Autonomy Design
Date & Time: Wednesday, May 29, 2024; 12:00 PM
Speaker: Chuchu Fan, MIT
MERL Host: Abraham P. Vinod
Research Areas: Artificial Intelligence, Control, Machine Learning
Abstract
- Learning-enabled control systems have demonstrated impressive empirical performance on challenging control problems in robotics. However, this performance often arrives with the trade-off of diminished transparency and the absence of guarantees regarding the safety and stability of the learned controllers. In recent years, new techniques have emerged to provide these guarantees by learning certificates alongside control policies — these certificates provide concise, data-driven proofs that guarantee the safety and stability of the learned control system. These methods not only allow the user to verify the safety of a learned controller but also provide supervision during training, allowing safety and stability requirements to influence the training process itself. In this talk, we present two exciting updates on neural certificates. In the first work, we explore the use of graph neural networks to learn collision-avoidance certificates that can generalize to unseen and very crowded environments. The second work presents a novel reinforcement learning approach that can produce certificate functions with the policies while addressing the instability issues in the optimization process. Finally, if time permits, I will also talk about my group's recent work using LLM and domain-specific task and motion planners to allow natural language as input for robot planning.
TALK [MERL Seminar Series 2024] Na Li presents talk titled Close the Loop: From Data to Actions in Complex Systems
Date & Time: Wednesday, April 10, 2024; 12:00 PM
Speaker: Na Li, Harvard University
MERL Host: Yebin Wang
Research Areas: Control, Dynamical Systems, Machine Learning
Abstract
- The explosive growth of machine learning and data-driven methodologies has revolutionized numerous fields. Yet, translating these successes to the domain of dynamical, physical systems remains a significant challenge, hindered by the complex and often unpredictable nature of such environments. Closing the loop from data to actions in these systems faces many difficulties, stemming from the need for sample efficiency and computational feasibility amidst intricate dynamics, along with many other requirements such as verifiability, robustness, and safety. In this talk, we bridge this gap by introducing innovative approaches that harness representation-based methods, domain knowledge, and the physical structures of systems. We present a comprehensive framework that integrates these components to develop reinforcement learning and control strategies that are not only tailored for the complexities of physical systems but also achieve efficiency, safety, and robustness with provable performance.
TALK [MERL Seminar Series 2024] Fadel Adib presents talk titled Decoding Hidden Worlds: Unprecedented Sensing and Connectivity for Climate, Robotics, & Smart Environments
Date & Time: Wednesday, April 3, 2024; 12:00 PM
Speaker: Fadel Adib, MIT & Cartesian
MERL Host: Wael H. Ali
Research Areas: Computational Sensing, Dynamical Systems, Signal Processing
Abstract
- This talk will cover a new generation of technologies that can sense, connect, and perceive the physical world in unprecedented ways. These technologies can uncover hidden worlds around us, promising transformative impact on areas spanning climate change monitoring, ocean mapping, healthcare, food security, supply chain, and even extraterrestrial exploration.
  
  The talk will cover four core technologies invented by Prof. Adib and his team. The first is an ocean internet-of-things (IoT) that uses battery-free sensors for climate change monitoring, marine life discovery, and seafood production (aquaculture). The second is a new perception technology that enables robots to sense and manipulate hidden objects. The third is a new augmented reality headset with “X-ray vision”, which extends human perception beyond line-of-sight. The fourth is a wireless sensing technology that can “see through walls” and monitor people’s vital signs (including their breathing, heart rate, and emotions), enabling smart environments that sense humans without requiring any contact with the human body.
  
  The talk will touch on the journey of these technologies from their inception at MIT to international collaborations and startups that are translating them to real-world impact in areas spanning healthcare, climate change, and supply chain.
TALK [MERL Seminar Series 2024] Sanmi Koyejo presents talk titled Are Emergent Abilities of Large Language Models a Mirage?
Date & Time: Wednesday, March 20, 2024; 1:00 PM
Speaker: Sanmi Koyejo, Stanford University
MERL Host: Jing Liu
Research Areas: Artificial Intelligence, Machine Learning
Abstract
- Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities intriguing is two-fold: their sharpness, transitioning seemingly instantaneously from not present to present, and their unpredictability, appearing at seemingly unforeseeable model scales. Here, we present an alternative explanation for emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous predictable changes in model performance. We present our alternative explanation in a simple mathematical model. Via the presented analyses, we provide evidence that alleged emergent abilities evaporate with different metrics or with better statistics, and may not be a fundamental property of scaling AI models.
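The metric-choice argument in this abstract can be pictured with a small numerical toy. The sketch below is an illustration under stated assumptions, not the speaker's code: it assumes per-token accuracy rises linearly across eight hypothetical model scales, that tokens are scored independently, and that the "discontinuous" metric is exact match on a 30-token answer.

```python
# Toy model (assumed for illustration): a smooth per-token metric versus an
# all-or-nothing exact-match metric computed from the same underlying accuracy.

N_STEPS = 8          # hypothetical number of model scales
ANSWER_LENGTH = 30   # hypothetical answer length in tokens


def per_token_accuracy(scale_step: int, n_steps: int) -> float:
    """Assumed linear improvement from 0.50 to 0.99 across model scales."""
    return 0.5 + 0.49 * scale_step / (n_steps - 1)


def exact_match(p: float, length: int) -> float:
    """Probability that all `length` tokens are right, assuming independence."""
    return p ** length


rows = []
for step in range(N_STEPS):
    p = per_token_accuracy(step, N_STEPS)
    rows.append((step, p, exact_match(p, ANSWER_LENGTH)))

for step, p, em in rows:
    print(f"scale {step}: per-token {p:.3f}  exact-match {em:.4f}")
```

Under these assumptions the per-token column climbs evenly, while the exact-match column stays near zero for most scales and then rises steeply at the largest ones, even though nothing discontinuous happens in the underlying model.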
TALK [MERL Seminar Series 2024] Stefanos Nikolaidis presents talk titled Enhancing the Efficiency and Robustness of Human-Robot Interactions
Date & Time: Friday, March 8, 2024; 1:00 PM
Speaker: Stefanos Nikolaidis, University of Southern California
MERL Host: Siddarth Jain
Research Areas: Machine Learning, Robotics, Human-Computer Interaction
Abstract
- While robots have been successfully deployed on factory floors and in warehouses, there has been limited progress in having them perform physical tasks with people at home and in the workplace. I aim to bridge the gap between their current performance in human environments and what robots are capable of doing, by making human-robot interactions efficient and robust.
  
  In the first part of my talk, I discuss enhancing the efficiency of human-robot interactions by enabling robot manipulators to infer the preference of a human teammate and proactively assist them in a collaborative task. I show how we can leverage similarities between different users and tasks to learn compact representations of user preferences and use these representations as priors for efficient inference.
  
  In the second part, I talk about enhancing the robustness of human-robot interactions by algorithmically generating diverse and realistic scenarios in simulation that reveal system failures. I propose formulating the problem of algorithmic scenario generation as a quality diversity problem and show how standard quality diversity algorithms can discover surprising and unexpected failure cases. I then discuss the development of a new class of quality diversity algorithms that significantly improve the search of the scenario space and the integration of these algorithms with generative models, which enables the generation of complex and realistic scenarios.
  
  Finally, I conclude the talk with applications in mining operations, collaborative manufacturing and assistive care.
TALK [MERL Seminar Series 2024] Melanie Mitchell presents talk titled "The Debate Over 'Understanding' in AI's Large Language Models"
Date & Time: Tuesday, February 13, 2024; 1:00 PM
Speaker: Melanie Mitchell, Santa Fe Institute
MERL Host: Suhas Lohit
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Human-Computer Interaction
Abstract
- I will survey a current, heated debate in the AI research community on whether large pre-trained language models can be said to "understand" language -- and the physical and social situations language encodes -- in any important sense. I will describe arguments that have been made for and against such understanding, and, more generally, will discuss what methods can be used to fairly evaluate understanding and intelligence in AI systems. I will conclude with key questions for the broader sciences of intelligence that have arisen in light of these discussions.
TALK [MERL Seminar Series 2024] Greta Tuckute presents talk titled Computational models of human auditory and language processing
Date & Time: Wednesday, January 31, 2024; 12:00 PM
Speaker: Greta Tuckute, MIT
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Abstract
- Advances in machine learning have led to powerful models for audio and language, proficient in tasks like speech recognition and fluent language generation. Beyond their immense utility in engineering applications, these models offer valuable tools for cognitive science and neuroscience. In this talk, I will demonstrate how these artificial neural network models can be used to understand how the human brain processes language. The first part of the talk will cover how audio neural networks serve as computational accounts for brain activity in the auditory cortex. The second part will focus on the use of large language models, such as those in the GPT family, to non-invasively control brain activity in the human language system.
TALK [MERL Seminar Series 2023] Dr. Kristina Monakhova presents talk titled Robust and Physics-informed machine learning for low light imaging
Date & Time: Tuesday, November 28, 2023; 12:00 PM
Speaker: Kristina Monakhova, MIT and Cornell
MERL Host: Joshua Rapp
Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing
Abstract
- Imaging in low light settings is extremely challenging due to low photon counts, both in photography and in microscopy. In photography, imaging under low light, high gain settings often results in highly structured, non-Gaussian sensor noise that’s hard to characterize or denoise. In this talk, we address this by developing a GAN-tuned physics-based noise model to more accurately represent camera noise at the lowest light, and highest gain settings. Using this noise model, we train a video denoiser using synthetic data and demonstrate photorealistic videography at starlight (submillilux levels of illumination) for the first time.
  
  For multiphoton microscopy, which is a form of scanning microscopy, there’s a trade-off between field of view, phototoxicity, acquisition time, and image quality, often resulting in noisy measurements. While deep learning-based methods have shown compelling denoising performance, can we trust these methods enough for critical scientific and medical applications? In the second part of this talk, I’ll introduce a learned, distribution-free uncertainty quantification technique that can both denoise and predict pixel-wise uncertainty to gauge how much we can trust our denoiser’s performance. Furthermore, we propose to leverage this learned, pixel-wise uncertainty to drive an adaptive acquisition technique that rescans only the most uncertain regions of a sample. With our sample and algorithm-informed adaptive acquisition, we demonstrate a 120X improvement in total scanning time and total light dose for multiphoton microscopy, while successfully recovering fine structures within the sample.
TALK [MERL Seminar Series 2023] Gioele Zardini presents talk titled Co-Design of Complex Systems: From Autonomy to Future Mobility
Date & Time: Tuesday, November 21, 2023; 11:00 AM
Speaker: Gioele Zardini, ETH Zürich and MIT
Research Areas: Control, Dynamical Systems
Abstract
- When designing complex systems, we need to consider multiple trade-offs at various abstraction levels and scales, and choices of single components need to be studied jointly. For instance, the design of future mobility solutions (e.g., autonomous vehicles, micromobility) and the design of the mobility systems they enable are closely coupled. Indeed, knowledge about the intended service of novel mobility solutions would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management policies. Optimally co-designing sociotechnical systems is a complex task for at least two reasons. On the one hand, the co-design of interconnected systems (e.g., large networks of cyber-physical systems) involves the simultaneous choice of components arising from heterogeneous natures (e.g., hardware vs. software parts) and fields, while satisfying systemic constraints and accounting for multiple objectives. On the other hand, components are connected via collaborative and conflicting interactions between different stakeholders (e.g., within an intermodal mobility system). In this talk, I will present a framework to co-design complex systems, leveraging a monotone theory of co-design and tools from game theory. The framework will be instantiated in the task of designing future mobility systems, all the way from the policies that a city can design, to the autonomy of vehicles that are part of an autonomous mobility-on-demand service. Through various case studies, I will show how the proposed approaches allow one to efficiently answer heterogeneous questions, unifying different modeling techniques and promoting interdisciplinarity, modularity, and compositionality. I will then discuss open challenges for compositional systems design optimization, and present my agenda to tackle them.
TALK [MERL Seminar Series 2023] Prof. Flavio Calmon presents talk titled Multiplicity in Machine Learning
Date & Time: Tuesday, November 7, 2023; 12:00 PM
Speaker: Flavio Calmon, Harvard University
MERL Host: Ye Wang
Research Areas: Artificial Intelligence, Machine Learning
Abstract
- This talk reviews the concept of predictive multiplicity in machine learning. Predictive multiplicity arises when different classifiers achieve similar average performance for a specific learning task yet produce conflicting predictions for individual samples. We discuss a metric called “Rashomon Capacity” for quantifying predictive multiplicity in multi-class classification. We also present recent findings on the multiplicity cost of differentially private training methods and group fairness interventions in machine learning.
  
  This talk is based on work published at ICML'20, NeurIPS'22, ACM FAccT'23, and NeurIPS'23.
TALK [MERL Seminar Series 2023] Dr. Tanmay Gupta presents talk titled Visual Programming - A compositional approach to building General Purpose Vision Systems
Date & Time: Tuesday, October 31, 2023; 2:00 PM
Speaker: Tanmay Gupta, Allen Institute for Artificial Intelligence
MERL Host: Moitreya Chatterjee
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Abstract
- Building General Purpose Vision Systems (GPVs) that can perform a huge variety of tasks has been a long-standing goal for the computer vision community. However, end-to-end training of these systems to handle different modalities and tasks has proven to be extremely challenging. In this talk, I will describe a lucrative neuro-symbolic alternative to the common end-to-end learning paradigm called Visual Programming. Visual Programming is a general framework that leverages the code-generation abilities of LLMs, existing neural models, and non-differentiable programs to enable powerful applications. Some of these applications remain elusive for the current generation of end-to-end trained GPVs.