News & Events

NEWS MERL co-organizes the 2023 Sound Demixing (SDX2023) Challenge and Workshop
Date: January 23, 2023 - November 4, 2023
Where: International Symposium of Music Information Retrieval (ISMR)
MERL Contacts: Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- MERL Speech & Audio team members Gordon Wichern and Jonathan Le Roux co-organized the 2023 Sound Demixing Challenge along with researchers from Sony, Moises AI, Audioshake, and Meta.
  
  The SDX2023 Challenge was hosted on the AI Crowd platform and had a prize pool of $42,000 distributed to the winning teams across two tracks: Music Demixing and Cinematic Sound Demixing. A unique aspect of this challenge was the ability to test the audio source separation models developed by challenge participants on non-public songs from Sony Music Entertainment Japan for the music demixing track, and movie soundtracks from Sony Pictures for the cinematic sound demixing track. The challenge ran from January 23rd to May 1st, 2023, and had 884 participants distributed across 68 teams submitting 2828 source separation models. The winners will be announced at the SDX2023 Workshop, which will take place as a satellite event at the International Symposium of Music Information Retrieval (ISMR) in Milan, Italy on November 4, 2023.
  
  MERL’s contribution to SDX2023 focused mainly on the cinematic demixing track. In addition to sponsoring the prizes awarded to the winning teams for that track, the baseline system and initial training data were MERL’s Cocktail Fork separation model and Divide and Remaster dataset, respectively. MERL researchers also contributed to a Town Hall kicking off the challenge, co-authored a scientific paper describing the challenge outcomes, and co-organized the SDX2023 Workshop.
NEWS MERL researchers presenting four papers and organizing the VLAR-SMART101 Workshop at ICCV 2023
Date: October 2, 2023 - October 6, 2023
Where: Paris/France
MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Michael J. Jones; Toshiaki Koike-Akino; Suhas Lohit; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Ye Wang
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief
- MERL researchers are presenting 4 papers and organizing the VLAR-SMART-101 workshop at the ICCV 2023 conference, which will be held in Paris, France October 2-6. ICCV is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.
  
  1. Conference paper: “Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis,” by Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal Patel, and Tim K. Marks
  
  Conditional generative models typically demand large annotated training sets to achieve high-quality synthesis. As a result, there has been significant interest in plug-and-play generation, i.e., using a pre-defined model to guide the generative process. In this paper, we introduce Steered Diffusion, a generalized framework for fine-grained photorealistic zero-shot conditional image generation using a diffusion model trained for unconditional generation. The key idea is to steer the image generation of the diffusion model during inference via designing a loss using a pre-trained inverse model that characterizes the conditional task. Our model shows clear qualitative and quantitative improvements over state-of-the-art diffusion-based plug-and-play models, while adding negligible computational cost.
  
  2. Conference paper: "BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus," by Valter Piedade and Pedro Miraldo
  
  We derive a dynamic Bayesian network that updates individual data points' inlier scores while iterating RANSAC. At each iteration, we apply weighted sampling using the updated scores. Our method works with or without prior data point scorings. In addition, we use the updated inlier/outlier scoring for deriving a new stopping criterion for the RANSAC loop. Our method outperforms the baselines in accuracy while needing less computational time.
  
  3. Conference paper: "Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes," by Fabien Delattre, David Dirnfeld, Phat Nguyen, Stephen Scarano, Michael J. Jones, Pedro Miraldo, and Erik Learned-Miller
  
  We present a novel approach to estimating camera rotation in crowded, real-world scenes captured using a handheld monocular video camera. Our method uses a novel generalization of the Hough transform on SO3 to efficiently find the camera rotation most compatible with the optical flow. Because the setting is not addressed well by other data sets, we provide a new dataset and benchmark, with high-accuracy and rigorously annotated ground truth on 17 video sequences. Our method is more accurate by almost 40 percent than the next best method.
  
  4. Workshop paper: "Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection" by Manish Sharma*, Moitreya Chatterjee*, Kuan-Chuan Peng, Suhas Lohit, and Michael Jones
  
  While state-of-the-art object detection methods for RGB images have reached some level of maturity, the same is not true for Infrared (IR) images. The primary bottleneck towards bridging this gap is the lack of sufficient labeled training data in the IR images. Towards addressing this issue, we present TensorFact, a novel tensor decomposition method which splits the convolution kernels of a CNN into low-rank factor matrices with fewer parameters. This compressed network is first pre-trained on RGB images and then augmented with only a few parameters. This augmented network is then trained on IR images, while freezing the weights trained on RGB. This prevents it from over-fitting, allowing it to generalize better. Experiments show that our method outperforms state-of-the-art.
  
  5. “Vision-and-Language Algorithmic Reasoning (VLAR) Workshop and SMART-101 Challenge” by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Tim K. Marks, Ram Ramrakhya, Honglu Zhou, Kevin A. Smith, Joanna Matthiesen, and Joshua B. Tenenbaum
  
  MERL researchers along with researchers from MIT, GeorgiaTech, Math Kangaroo USA, and Rutgers University are jointly organizing a workshop on vision-and-language algorithmic reasoning at ICCV 2023 and conducting a challenge based on the SMART-101 puzzles described in the paper: Are Deep Neural Networks SMARTer than Second Graders?. A focus of this workshop is to bring together outstanding faculty/researchers working at the intersections of vision, language, and cognition to provide their opinions on the recent breakthroughs in large language models and artificial general intelligence, as well as showcase their cutting edge research that could inspire the audience to search for the missing pieces in our quest towards solving the puzzle of artificial intelligence.
  
  Workshop link: https://wvlar.github.io/iccv23/
TALK [MERL Seminar Series 2023] Prof. Komei Sugiura presents talk titled The Confluence of Vision, Language, and Robotics
Date & Time: Thursday, September 28, 2023; 12:00 PM
Speaker: Komei Sugiura, Keio University
MERL Host: Chiori Hori
Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
Abstract
- Recent advances in multimodal models that fuse vision and language are revolutionizing robotics. In this lecture, I will begin by introducing recent multimodal foundational models and their applications in robotics. The second topic of this talk will address our recent work on multimodal language processing in robotics. The shortage of home care workers has become a pressing societal issue, and the use of domestic service robots (DSRs) to assist individuals with disabilities is seen as a possible solution. I will present our work on DSRs that are capable of open-vocabulary mobile manipulation, referring expression comprehension and segmentation models for everyday objects, and future captioning methods for cooking videos and DSRs.
TALK [MERL Seminar Series 2023] Prof. Faruque Hasan presents talk titled A Process Systems Engineering Perspective on Carbon Capture: Key Challenges and Opportunities
Date & Time: Tuesday, September 19, 2023; 1:00 PM
Speaker: Faruque Hasan, Texas A&M University
MERL Host: Scott A. Bortoff
Research Areas: Applied Physics, Machine Learning, Multi-Physical Modeling, Optimization
Abstract
- Carbon capture, utilization, and storage (CCUS) is a promising pathway to decarbonize fossil-based power and industrial sectors and is a bridging technology for a sustainable transition to a net-zero emission energy future. This talk aims to provide an overview of design and optimization of CCUS systems. I will also attempt to give a brief perspective on emerging interests in process systems engineering research (e.g., systems integration, multiscale modeling, strategic planning, and optimization under uncertainty). The purpose is not to cover all aspects of PSE research for CCUS but rather to foster discussion by presenting some plausible future directions and ideas.
AWARD Joint University of Padua-MERL team wins Challenge 'AI Olympics With RealAIGym'
Date: August 25, 2023
Awarded to: Alberto Dalla Libera, Niccolo' Turcato, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
MERL Contact: Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief
- A joint team consisting of members of University of Padua and MERL ranked 1st in the IJCAI2023 Challenge "Al Olympics With RealAlGym: Is Al Ready for Athletic Intelligence in the Real World?". The team was composed by MERL researcher Diego Romeres and a team from University Padua (UniPD) consisting of Alberto Dalla Libera, Ph.D., Ph.D. Candidates: Niccolò Turcato, Giulio Giacomuzzo and Prof. Ruggero Carli from University of Padua.
  
  The International Joint Conference on Artificial Intelligence (IJCAI) is a premier gathering for AI researchers and organizes several competitions. This year the competition CC7 "AI Olympics With RealAIGym: Is AI Ready for Athletic Intelligence in the Real World?" consisted of two stages: simulation and real-robot experiments on two under-actuated robotic systems. The two robotics systems were treated as separate tracks and one final winner was selected for each track based on specific performance criteria in the control tasks.
  
  The UniPD-MERL team competed and won in both tracks. The team's system made strong use of a Model-based Reinforcement Learning algorithm called (MC-PILCO) that we recently published in the journal IEEE Transaction on Robotics.
NEWS MERL presents 9 papers at 2023 IFAC World Congress
Date: July 9, 2023 - July 14, 2023
MERL Contacts: Karl Berntorp; Scott A. Bortoff; Ankush Chakrabarty; Stefano Di Cairano; Christopher R. Laughman; Diego Romeres; Abraham P. Vinod
Research Areas: Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
Brief
- MERL researchers presented 9 papers and organized 2 invited/workshop sessions at the 2023 IFAC World Congress held in Yokohama, JP.
  
  MERL's contributions covered topics including decision-making for autonomous vehicles, statistical and learning-based estimation for GNSS and energy systems, impedance control for delta robots, learning for system identification of rigid body dynamics and time-varying systems, and meta-learning for deep state-space modeling using data from similar systems. The invited session (MERL co-organizer: Ankush Chakrabarty) was on the topic of “Estimation and observer design: theory and applications” and the workshop (MERL co-organizer: Karl Berntorp) was on “Gaussian Process Learning for Systems and Control”.
NEWS MERL researchers present 3 papers on Dexterous Manipulation at RSS 23.
Date: July 11, 2023
Where: Daegu, Korea
MERL Contacts: Siddarth Jain; Devesh K. Jha; Arvind Raghunathan
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief
- MERL researchers presented 3 papers at the 19th edition of Robotics:Science and Systems Conference in Daegu, Korea. RSS is the flagship conference of the RSS foundation and is run as a single track conference presenting a limited number of high-quality papers. This year the main conference had a total of 112 papers presented. MERL researchers presented 2 papers in the main conference on planning and perception for dexterous manipulation. Another paper was presented in a workshop of learning for dexterous manipulation. More details can be found here https://roboticsconference.org.
NEWS Keynote address given by Philip Orlik at 9th annual IEEE Smartcomp conference
Date: June 26, 2023
Where: International Conference on Smart Computing (SMARTCOMP), Vanderbilt University, Nashville, Tennessee
MERL Contact: Philip V. Orlik
Research Areas: Communications, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Signal Processing
Brief
- VP & Research Director, Philip Orlik, gave a keynote titled, "Smart Technologies for Smarter Buildings" at the 9th edition of the IEEE International Conference on Smart Computing (SMARTCOMP) focusing on some of the research challenges and opportunities that arise as we seek to achieve net-zero emissions in Smart building environments.
  
  SMARTCOMP is the premier conference on smart computing. Smart computing is a multidisciplinary domain based on the synergistic influence of advances in sensor-based technologies, Internet of Things, cyber-physical systems, edge computing, big data analytics, machine learning, cognitive computing, and artificial intelligence.
AWARD MERL Intern and Researchers Win ICASSP 2023 Best Student Paper Award
Date: June 9, 2023
Awarded to: Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux
MERL Contacts: Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- Former MERL intern Darius Petermann (Ph.D. Candidate at Indiana University) has received a Best Student Paper Award at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023) for the paper "Hyperbolic Audio Source Separation", co-authored with MERL researchers Gordon Wichern and Jonathan Le Roux, and former MERL researcher Aswin Subramanian. The paper presents work performed during Darius's internship at MERL in the summer 2022. The paper introduces a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features. Additionally, the code associated with the paper is publicly available at https://github.com/merlresearch/hyper-unmix.
  
  ICASSP is the flagship conference of the IEEE Signal Processing Society (SPS). ICASSP 2023 was held in the Greek island of Rhodes from June 04 to June 10, 2023, and it was the largest ICASSP in history, with more than 4000 participants, over 6128 submitted papers and 2709 accepted papers. Darius’s paper was first recognized as one of the Top 3% of all papers accepted at the conference, before receiving one of only 5 Best Student Paper Awards during the closing ceremony.
AWARD MERL’s Paper on Wi-Fi Sensing Earns Top 3% Paper Recognition at ICASSP 2023, Selected as a Best Student Paper Award Finalist
Date: June 9, 2023
Awarded to: Cristian J. Vaca-Rubio, Pu Wang, Toshiaki Koike-Akino, Ye Wang, Petros Boufounos and Petar Popovski
MERL Contacts: Petros T. Boufounos; Toshiaki Koike-Akino; Pu (Perry) Wang; Ye Wang
Research Areas: Artificial Intelligence, Communications, Computational Sensing, Dynamical Systems, Machine Learning, Signal Processing
Brief
- A MERL Paper on Wi-Fi sensing was recognized as a Top 3% Paper among all 2709 accepted papers at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023). Co-authored by Cristian Vaca-Rubio and Petar Popovski from Aalborg University, Denmark, and MERL researchers Pu Wang, Toshiaki Koike-Akino, Ye Wang, and Petros Boufounos, the paper "MmWave Wi-Fi Trajectory Estimation with Continous-Time Neural Dynamic Learning" was also a Best Student Paper Award finalist.
  
  Performed during Cristian’s stay at MERL first as a visiting Marie Skłodowska-Curie Fellow and then as a full-time intern in 2022, this work capitalizes on standards-compliant Wi-Fi signals to perform indoor localization and sensing. The paper uses a neural dynamic learning framework to address technical issues such as low sampling rate and irregular sampling intervals.
  
  ICASSP, a flagship conference of the IEEE Signal Processing Society (SPS), was hosted on the Greek island of Rhodes from June 04 to June 10, 2023. ICASSP 2023 marked the largest ICASSP in history, boasting over 4000 participants and 6128 submitted papers, out of which 2709 were accepted.
AWARD Joint CMU-MERL team wins DCASE2023 Challenge on Automated Audio Captioning
Date: June 1, 2023
Awarded to: Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, Francois Germain, Jonathan Le Roux, Shinji Watanabe
MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- A joint team consisting of members of CMU Professor and MERL Alumn Shinji Watanabe's WavLab and members of MERL's Speech & Audio team ranked 1st out of 11 teams in the DCASE2023 Challenge's Task 6A "Automated Audio Captioning". The team was led by student Shih-Lun Wu and also featured Ph.D. candidate Xuankai Chang, Postdoctoral research associate Jee-weon Jung, Prof. Shinji Watanabe, and MERL researchers Gordon Wichern, Francois Germain, and Jonathan Le Roux.
  
  The IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), started in 2013, has been organized yearly since 2016, and gathers challenges on multiple tasks related to the detection, analysis, and generation of sound events. This year, the DCASE2023 Challenge received over 428 submissions from 123 teams across seven tasks.
  
  The CMU-MERL team competed in the Task 6A track, Automated Audio Captioning, which aims at generating informative descriptions for various sounds from nature and/or human activities. The team's system made strong use of large pretrained models, namely a BEATs transformer as part of the audio encoder stack, an Instructor Transformer encoding ground-truth captions to derive an audio-text contrastive loss on the audio encoder, and ChatGPT to produce caption mix-ups (i.e., grammatical and compact combinations of two captions) which, together with the corresponding audio mixtures, increase not only the amount but also the complexity and diversity of the training data. The team's best submission obtained a SPIDEr-FL score of 0.327 on the hidden test set, largely outperforming the 2nd best team's 0.315.
NEWS MERL researchers presenting four papers and co-organizing a workshop at CVPR 2023
Date: June 18, 2023 - June 22, 2023
Where: Vancouver/Canada
MERL Contacts: Anoop Cherian; Michael J. Jones; Suhas Lohit; Kuan-Chuan Peng
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Brief
- MERL researchers are presenting 4 papers and co-organizing a workshop at the CVPR 2023 conference, which will be held in Vancouver, Canada June 18-22. CVPR is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.
  
  1. “Are Deep Neural Networks SMARTer than Second Graders,” by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin Smith, and Joshua B. Tenenbaum
  
  We present SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group. Our experiments using SMART-101 reveal that powerful deep models are not better than random accuracy when analyzed for generalization. We also evaluate large language models (including ChatGPT) on a subset of SMART-101 and find that while these models show convincing reasoning abilities, their answers are often incorrect.
  
  Paper: https://arxiv.org/abs/2212.09993
  
  2. “EVAL: Explainable Video Anomaly Localization,” by Ashish Singh, Michael J. Jones, and Erik Learned-Miller
  
  This work presents a method for detecting unusual activities in videos by building a high-level model of activities found in nominal videos of a scene. The high-level features used in the model are human understandable and include attributes such as the object class and the directions and speeds of motion. Such high-level features allow our method to not only detect anomalous activity but also to provide explanations for why it is anomalous.
  
  Paper: https://arxiv.org/abs/2212.07900
  
  3. "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations," by Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, and Stephen Gould
  
  The rise of do-it-yourself (DIY) videos on the web has made it possible even for an unskilled person (or a skilled robot) to imitate and follow instructions to complete complex real world tasks. In this paper, we consider the novel problem of aligning instruction steps that are depicted as assembly diagrams (commonly seen in Ikea assembly manuals) with video segments from in-the-wild videos. We present a new dataset: Ikea Assembly in the Wild (IAW) and propose a contrastive learning framework for aligning instruction diagrams with video clips.
  
  Paper: https://arxiv.org/pdf/2303.13800.pdf
  
  4. "HaLP: Hallucinating Latent Positives for Skeleton-Based Self-Supervised Learning of Actions," by Anshul Shah, Aniket Roy, Ketul Shah, Shlok Kumar Mishra, David Jacobs, Anoop Cherian, and Rama Chellappa
  
  In this work, we propose a new contrastive learning approach to train models for skeleton-based action recognition without labels. Our key contribution is a simple module, HaLP: Hallucinating Latent Positives for contrastive learning. HaLP explores the latent space of poses in suitable directions to generate new positives. Our experiments using HaLP demonstrates strong empirical improvements.
  
  Paper: https://arxiv.org/abs/2304.00387
  
  The 4th Workshop on Fair, Data-Efficient, and Trusted Computer Vision
  
  MERL researcher Kuan-Chuan Peng is co-organizing the fourth Workshop on Fair, Data-Efficient, and Trusted Computer Vision (https://fadetrcv.github.io/2023/) in conjunction with CVPR 2023 on June 18, 2023. This workshop provides a focused venue for discussing and disseminating research in the areas of fairness, bias, and trust in computer vision, as well as adjacent domains such as computational social science and public policy.
NEWS Mitsubishi Electric Corporation Press Release Announces Worlds First GaN Power Amplifier Capable of Wideband Operation for 4G, 5G and Beyond 5G/6G.
Date: June 8, 2023
MERL Contacts: Toshiaki Koike-Akino; Koon Hoo Teo
Research Areas: Communications, Electronic and Photonic Devices, Machine Learning, Signal Processing
Brief
- Mitsubishi Electric Corporation announced today it has developed what is believed to be the world's first gallium nitride (GaN) power amplifier that achieves a frequency range of 3,400MHz using a single power amplifier, which the company has demonstrated can be used for 4G, 5G and Beyond 5G/6G communication systems operating at different frequencies in a single base station. The amplifier is expected to enable the radio unit (transceiver) to be shared with different communication systems and lead to more power-efficient base stations.
  
  Mitsubishi Electric Researchers, Toshiaki Koike-Akino and Koon Hoo Teo helped developed the technology and device. Technical details will be presented at the IEEE International Microwave Symposium 2023 this month.
  
  Please see the link below for the full press release from Mitsubishi Electric.
NEWS Ankush Chakrabarty co-organized three sessions at the ACC2023, and was nominated for Best Energy Systems Paper.
Date: June 30, 2023 - June 2, 2023
Where: San Diego, CA
MERL Contact: Ankush Chakrabarty
Research Areas: Applied Physics, Artificial Intelligence, Control, Data Analytics, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
Brief
- Ankush Chakrabarty (researcher, Multiphysical Systems Team) co-organized and spoke at 3 sessions at the 2023 American Control Conference in San Diego, CA. These include: (1) A tutorial session (w/ Stefano Di Cairano) on "Physics Informed Machine Learning for Modeling and Control": an effort with contributions from multiple academic institutes and US research labs; (2) An invited session on "Energy Efficiency in Smart Buildings and Cities" in which his paper (w/ Chris Laughman) on "Local Search Region Constrained Bayesian Optimization for Performance Optimization of Vapor Compression Systems" was nominated for Best Energy Systems Paper Award; and, (3) A special session on Diversity, Equity, and Inclusion to improve recruitment and retention of underrepresented groups in STEM research.
NEWS MERL researchers present 10 papers at the American Control Conference (ACC)
Date: May 31, 2023 - June 2, 2023
Where: San Diego, CA
MERL Contacts: Karl Berntorp; Ankush Chakrabarty; Vedang M. Deshpande; Stefano Di Cairano; Devesh K. Jha; Christopher R. Laughman; Arvind Raghunathan; Diego Romeres; Abraham P. Vinod; Yebin Wang; Avishai Weiss
Research Areas: Control, Machine Learning, Optimization
Brief
- MERL will present 10 papers at the American Control Conference (ACC) in San Diego, CA, with topics including autonomous-vehicle decision making and control, physics-informed machine learning, motion planning, control subject to nonconvex chance constraints, and optimal power management. Two talks are part of tutorial sessions.
  MERL will also be present at the conference as a sponsor, with a booth for discussing with researchers and students, and hosting a special session at lunch with highlights of MERL research and work philosophy.
NEWS MERL Researchers Present Thirteen Papers at the 2023 IEEE International Conference on Robotics and Automation (ICRA)
Date: May 29, 2023 - June 2, 2023
Where: 2023 IEEE International Conference on Robotics and Automation (ICRA)
MERL Contacts: Anoop Cherian; Radu Corcodel; Siddarth Jain; Devesh K. Jha; Toshiaki Koike-Akino; Tim K. Marks; Daniel N. Nikovski; Arvind Raghunathan; Diego Romeres
Research Areas: Computer Vision, Machine Learning, Optimization, Robotics
Brief
- MERL researchers will present thirteen papers, including eight main conference papers and five workshop papers, at the 2023 IEEE International Conference on Robotics and Automation (ICRA) to be held in London, UK from May 29 to June 2. ICRA is one of the largest and most prestigious conferences in the robotics community. The papers cover a broad set of topics in Robotics including estimation, manipulation, vision-based object recognition and segmentation, tactile estimation and tool manipulation, robotic food handling, robot skill learning, and model-based reinforcement learning.
  
  In addition to the paper presentations, MERL robotics researchers will also host an exhibition booth and look forward to discussing our research with visitors.
NEWS MERL researchers presented four papers and organized a special session at The 14th IEEE International Electric Machines and Drives Conference
Date: May 15, 2023 - May 18, 2023
Where: San Francisco, CA
MERL Contacts: Dehong Liu; Bingnan Wang
Research Areas: Applied Physics, Control, Electric Systems, Machine Learning, Optimization, Signal Processing
Brief
- MERL researchers Yusuke Sakamoto, Anantaram Varatharajan, and
  Bingnan Wang presented four papers at IEMDC 2023 held May 15-18 in San Francisco, CA. The topics of the four oral presentations range from electric machine design optimization, to fault detection and sensorless control. Bingnan Wang organized a special session at the conference entitled: Learning-based Electric Machine Design and Optimization. Bingnan Wang and Yusuke Sakamoto together chaired the special session, as well as a session on: Condition Monitoring, Fault Diagnosis and Prognosis.
  
  The 14th IEEE International Electric Machines and Drives Conference: IEMDC 2023, is one of the major conferences in the area of electric machines and drives. The conference was established in 1997 and has taken place every two years thereafter.
EVENT MERL Contributes to ICASSP 2023
Date: Sunday, June 4, 2023 - Saturday, June 10, 2023
Location: Rhodes Island, Greece
MERL Contacts: Petros T. Boufounos; François Germain; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Suhas Lohit; Yanting Ma; Hassan Mansour; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Signal Processing, Speech & Audio
Brief
- MERL has made numerous contributions to both the organization and technical program of ICASSP 2023, which is being held in Rhodes Island, Greece from June 4-10, 2023.
  
  Organization
  
  Petros Boufounos is serving as General Co-Chair of the conference this year, where he has been involved in all aspects of conference planning and execution.
  
  Perry Wang is the organizer of a special session on Radar-Assisted Perception (RAP), which will be held on Wednesday, June 7. The session will feature talks on signal processing and deep learning for radar perception, pose estimation, and mutual interference mitigation with speakers from both academia (Carnegie Mellon University, Virginia Tech, University of Illinois Urbana-Champaign) and industry (Mitsubishi Electric, Bosch, Waveye).
  
  Anthony Vetro is the co-organizer of the Workshop on Signal Processing for Autonomous Systems (SPAS), which will be held on Monday, June 5, and feature invited talks from leaders in both academia and industry on timely topics related to autonomous systems.
  
  Sponsorship
  
  MERL is proud to be a Silver Patron of the conference and will participate in the student job fair on Thursday, June 8. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.
  
  MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Rabab Ward, the recipient of the 2023 IEEE Fourier Award for Signal Processing, and Prof. Alexander Waibel, the recipient of the 2023 IEEE James L. Flanagan Speech and Audio Processing Award.
  
  Technical Program
  
  MERL is presenting 13 papers in the main conference on a wide range of topics including source separation and speech enhancement, radar imaging, depth estimation, motor fault detection, time series recovery, and point clouds. One workshop paper has also been accepted for presentation on self-supervised music source separation.
  
  Perry Wang has been invited to give a keynote talk on Wi-Fi sensing and related standards activities at the Workshop on Integrated Sensing and Communications (ISAC), which will be held on Sunday, June 4.
  
  Additionally, Anthony Vetro will present a Perspective Talk on Physics-Grounded Machine Learning, which is scheduled for Thursday, June 8.
  
  About ICASSP
  
  ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
TALK [MERL Seminar Series 2023] Prof. Dan Stowell presents talk titled Fine-grained wildlife sound recognition: Towards the accuracy of a naturalist
Date & Time: Tuesday, April 25, 2023; 11:00 AM
Speaker: Dan Stowell, Tilburg University / Naturalis Biodiversity Centre
MERL Host: Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Abstract
- Machine learning can be used to identify animals from their sound. This could be a valuable tool for biodiversity monitoring, and for understanding animal behaviour and communication. But to get there, we need very high accuracy at fine-grained acoustic distinctions across hundreds of categories in diverse conditions. In our group we are studying how to achieve this at continental scale. I will describe aspects of bioacoustic data that challenge even the latest deep learning workflows, and our work to address this. Methods covered include adaptive feature representations, deep embeddings and few-shot learning.
TALK [MERL Seminar Series 2023] Dr. Michael Muehlebach presents talk titled Learning and Dynamical Systems
Date & Time: Tuesday, April 11, 2023; 11:00 AM
Speaker: Michael Muehlebach, Max Planck Institute for Intelligent Systems
Research Areas: Control, Dynamical Systems, Machine Learning, Optimization, Robotics
Abstract
- The talk will be divided into two parts. The first part of the talk introduces a class of first-order methods for constrained optimization that are based on an analogy to non-smooth dynamical systems. The key underlying idea is to express constraints in terms of velocities instead of positions, which has the algorithmic consequence that optimizations over feasible sets at each iteration are replaced with optimizations over local, sparse convex approximations. This results is a simplified suite of algorithms and an expanded range of possible applications in machine learning. In the second part of my talk, I will present a robot learning algorithm for trajectory tracking. The method incorporates prior knowledge about the system dynamics and by optimizing over feedforward actions, the risk of instability during deployment is mitigated. The algorithm will be evaluated on a ping-pong playing robot that is actuated by soft pneumatic muscles.
TALK [MERL Seminar Series 2023] Prof. Zoltan Nagy presents talk titled Investigating Multi-Agent Reinforcement Learning for Grid-Interactive Smart Communities using CityLearn
Date & Time: Wednesday, March 29, 2023; 1:00 PM
Speaker: Zoltan Nagy, The University of Texas at Austin
MERL Host: Ankush Chakrabarty
Research Areas: Control, Machine Learning, Multi-Physical Modeling
Abstract
- The decarbonization of buildings presents new challenges for the reliability of the electrical grid because of the intermittency of renewable energy sources and increase in grid load brought about by end-use electrification. To restore reliability, grid-interactive efficient buildings can provide flexibility services to the grid through demand response. Residential demand response programs are hindered by the need for manual intervention by customers. To maximize the energy flexibility potential of residential buildings, an advanced control architecture is needed. Reinforcement learning is well-suited for the control of flexible resources as it can adapt to unique building characteristics compared to expert systems. Yet, factors hindering the adoption of RL in real-world applications include its large data requirements for training, control security and generalizability. This talk will cover some of our recent work addressing these challenges. We proposed the MERLIN framework and developed a digital twin of a real-world 17-building grid-interactive residential community in CityLearn. We show that 1) independent RL-controllers for batteries improve building and district level KPIs compared to a reference RBC by tailoring their policies to individual buildings, 2) despite unique occupant behaviors, transferring the RL policy of any one of the buildings to other buildings provides comparable performance while reducing the cost of training, 3) training RL-controllers on limited temporal data that does not capture full seasonality in occupant behavior has little effect on performance. Although, the zero-net-energy (ZNE) condition of the buildings could be maintained or worsened because of controlled batteries, KPIs that are typically improved by ZNE condition (electricity price and carbon emissions) are further improved when the batteries are managed by an advanced controller.
TALK [MERL Seminar Series 2023] Dr. Suraj Srinivas presents talk titled Pitfalls and Opportunities in Interpretable Machine Learning
Date & Time: Tuesday, March 14, 2023; 1:00 PM
Speaker: Suraj Srinivas, Harvard University
MERL Host: Suhas Lohit
Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
Abstract
- In this talk, I will discuss our recent research on understanding post-hoc interpretability. I will begin by introducing a characterization of post-hoc interpretability methods as local function approximators, and the implications of this viewpoint, including a no-free-lunch theorem for explanations. Next, we shall challenge the assumption that post-hoc explanations provide information about a model's discriminative capabilities p(y|x) and instead demonstrate that many common methods instead rely on a conditional generative model p(x|y). This observation underscores the importance of being cautious when using such methods in practice. Finally, I will propose to resolve this via regularization of model structure, specifically by training low curvature neural networks, resulting in improved model robustness and stable gradients.
TALK [MERL Seminar Series 2023] Prof. Shaowu Pan presents talk titled Neural Implicit Flow
Date & Time: Wednesday, March 1, 2023; 1:00 PM
Speaker: Shaowu Pan, Rensselaer Polytechnic Institute
MERL Host: Saviz Mowlavi
Research Areas: Computational Sensing, Data Analytics, Machine Learning
Abstract
- High-dimensional spatio-temporal dynamics can often be encoded in a low-dimensional subspace. Engineering applications for modeling, characterization, design, and control of such large-scale systems often rely on dimensionality reduction to make solutions computationally tractable in real-time. Common existing paradigms for dimensionality reduction include linear methods, such as the singular value decomposition (SVD), and nonlinear methods, such as variants of convolutional autoencoders (CAE). However, these encoding techniques lack the ability to efficiently represent the complexity associated with spatio-temporal data, which often requires variable geometry, non-uniform grid resolution, adaptive meshing, and/or parametric dependencies. To resolve these practical engineering challenges, we propose a general framework called Neural Implicit Flow (NIF) that enables a mesh-agnostic, low-rank representation of large-scale, parametric, spatial-temporal data. NIF consists of two modified multilayer perceptrons (MLPs): (i) ShapeNet, which isolates and represents the spatial complexity, and (ii) ParameterNet, which accounts for any other input complexity, including parametric dependencies, time, and sensor measurements. We demonstrate the utility of NIF for parametric surrogate modeling, enabling the interpretable representation and compression of complex spatio-temporal dynamics, efficient many-spatial-query tasks, and improved generalization performance for sparse reconstruction.
TALK Prof. Kevin Lynch presents talk titled Autonomous and Human-Collaborative Robotic Manipulation
Date & Time: Tuesday, February 28, 2023; 12:00 PM
Speaker: Prof. Kevin Lynch, Northwestern University
MERL Host: Diego Romeres
Research Areas: Machine Learning, Robotics
Abstract
- Research at the Center for Robotics and Biosystems at Northwestern University includes bio-inspiration, neuromechanics, human-machine systems, and swarm robotics, among other topics. In this talk I will focus on our work on manipulation, including autonomous in-hand robotic manipulation and safe, intuitive human-collaborative manipulation among one or more humans and a team of mobile manipulators.
NEWS Jonathan Le Roux gives invited talk at CMU's Language Technology Institute Colloquium
Date: December 9, 2022
Where: Pittsburg, PA
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- MERL Senior Principal Research Scientist and Speech and Audio Senior Team Leader, Jonathan Le Roux, was invited by Carnegie Mellon University's Language Technology Institute (LTI) to give an invited talk as part of the LTI Colloquium Series. The LTI Colloquium is a prestigious series of talks given by experts from across the country related to different areas of language technologies. Jonathan's talk, entitled "Towards general and flexible audio source separation", presented an overview of techniques developed at MERL towards the goal of robustly and flexibly decomposing and analyzing an acoustic scene, describing in particular the Speech and Audio Team's efforts to extend MERL's early speech separation and enhancement methods to more challenging environments, and to more general and less supervised scenarios.