- Date: May 19, 2025 - May 23, 2025
Where: IEEE ICRA
MERL Contacts: Stefano Di Cairano; Jianlin Guo; Chiori Hori; Siddarth Jain; Devesh K. Jha; Toshiaki Koike-Akino; Philip V. Orlik; Arvind Raghunathan; Diego Romeres; Yuki Shirai; Abraham P. Vinod; Yebin Wang
Research Areas: Artificial Intelligence, Computer Vision, Control, Dynamical Systems, Machine Learning, Optimization, Robotics, Human-Computer Interaction
Brief - MERL made significant contributions to both the organization and the technical program of the International Conference on Robotics and Automation (ICRA) 2025, which was held in Atlanta, Georgia, USA, from May 19th to May 23rd.
MERL was a Bronze sponsor of the conference, and MERL researchers chaired four sessions in the areas of Manipulation Planning, Human-Robot Collaboration, Diffusion Policy, and Learning for Robot Control.
MERL researchers presented four papers in the main conference on the topics of contact-implicit trajectory optimization, proactive robotic assistance in human-robot collaboration, diffusion policy with human preferences, and dynamic and model learning of robotic manipulators. In addition, five more papers were presented in the workshops: “Structured Learning for Efficient, Reliable, and Transparent Robots,” “Safely Leveraging Vision-Language Foundation Models in Robotics: Challenges and Opportunities,” “Long-term Human Motion Prediction,” and “The Future of Intelligent Manufacturing: From Innovation to Implementation.”
MERL researcher Diego Romeres delivered an invited talk titled “Dexterous Robotics: From Multimodal Sensing to Real-World Physical Interactions.”
MERL also collaborated with the University of Padua on one of the conference’s challenges: the “3rd AI Olympics with RealAIGym” (https://ai-olympics.dfki-bremen.de).
During the conference, MERL researchers received the IEEE Transactions on Automation Science and Engineering Best New Application Paper Award for their paper titled “Smart Actuation for End-Edge Industrial Control Systems.”
About ICRA
The IEEE International Conference on Robotics and Automation (ICRA) is the flagship conference of the IEEE Robotics and Automation Society and the world’s largest and most comprehensive technical conference focused on research advances and the latest technological developments in robotics. The event attracts over 7,000 participants and 143 partners and exhibitors, and receives more than 4,000 paper submissions.
-
- Date: Sunday, April 6, 2025 - Friday, April 11, 2025
Location: Hyderabad, India
MERL Contacts: Wael H. Ali; Petros T. Boufounos; Radu Corcodel; François Germain; Chiori Hori; Siddarth Jain; Devesh K. Jha; Toshiaki Koike-Akino; Jonathan Le Roux; Yanting Ma; Hassan Mansour; Yoshiki Masuyama; Joshua Rapp; Diego Romeres; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Communications, Computational Sensing, Electronic and Photonic Devices, Machine Learning, Robotics, Signal Processing, Speech & Audio
Brief - MERL has made numerous contributions to both the organization and technical program of ICASSP 2025, which is being held in Hyderabad, India from April 6-11, 2025.
Sponsorship
MERL is proud to be a Silver Patron of the conference and will participate in the student job fair on Thursday, April 10. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.
MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Björn Erik Ottersten, the recipient of the 2025 IEEE Fourier Award for Signal Processing, and Prof. Shrikanth Narayanan, the recipient of the 2025 IEEE James L. Flanagan Speech and Audio Processing Award. Both awards will be presented in person at ICASSP by Anthony Vetro, MERL President & CEO.
Technical Program
MERL is presenting 15 papers in the main conference on a wide range of topics including source separation, sound event detection, sound anomaly detection, speaker diarization, music generation, robot action generation from video, indoor airflow imaging, WiFi sensing, Doppler single-photon Lidar, optical coherence tomography, and radar imaging. Another paper on spatial audio will be presented at the Generative Data Augmentation for Real-World Signal Processing Applications (GenDA) Satellite Workshop.
MERL researchers Petros Boufounos and Hassan Mansour will present a tutorial on “Computational Methods in Radar Imaging” on the afternoon of Monday, April 7.
Petros Boufounos will also give an industry talk on Thursday, April 10 at 12pm on “A Physics-Informed Approach to Sensing.”
About ICASSP
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event has been attracting more than 4000 participants each year.
-
- Date & Time: Wednesday, January 29, 2025; 1:00 PM
Speaker: David Lindell, University of Toronto
MERL Host: Joshua Rapp
Research Areas: Computational Sensing, Computer Vision, Signal Processing
Abstract
The observed timescales of the universe span from the exasecond scale (~1e18 seconds) down to the zeptosecond scale (~1e-21 seconds). While specialized imaging systems can capture narrow slices of this temporal spectrum in the ultra-fast regime (e.g., nanoseconds to picoseconds; 1e-9 to 1e-12 s), they cannot simultaneously capture both slow (> 1 second) and ultra-fast events (< 1 nanosecond). Further, ultra-fast imaging systems are conventionally limited to single-viewpoint capture, hindering 3D visualization at ultra-fast timescales. In this talk, I discuss (1) new computational algorithms that turn a single-photon detector into an "ultra-wideband" imaging system that captures events from seconds to picoseconds; and (2) a method for neural rendering using multi-viewpoint, ultra-fast videos captured using single-photon detectors. The latter approach enables rendering videos of propagating light from novel viewpoints, observation of viewpoint-dependent changes in light transport predicted by Einstein, recovery of material properties, and accurate 3D reconstruction from multiply scattered light. Finally, I discuss future directions in ultra-wideband imaging.
-
- Date: November 14, 2024 - November 22, 2024
Where: Italian Consulate
MERL Contact: Diego Romeres
Research Area: Robotics
Brief - Prof. Zunino from the University of Genoa, with support from MERL Researcher Diego Romeres, organized a robotics workshop that introduced 6th-8th grade students from the greater Boston area to the fundamentals of robotics. The workshop provided students with hands-on experience in robotic technology using LEGO systems. Participants learned key principles of robotics, teamwork, and project planning. They worked collaboratively to design and program their robots with visual programming software and to solve challenges as field engineers.
The workshop was part of the Festival of Italian Creativity organized by the Italian Consulate to honor the naming of Boston as a Capital of Italian Creativity.
-
- Date & Time: Wednesday, October 30, 2024; 1:00 PM
Speaker: Samuel Clarke, Stanford University
MERL Host: Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
Abstract
Acoustic perception is invaluable to humans and robots in understanding objects and events in their environments. These sounds are dependent on properties of the source, the environment, and the receiver. Many humans possess remarkable intuition both to infer key properties of each of these three aspects from a sound and to form expectations of how these different aspects would affect the sound they hear. In order to equip robots and AI agents with similar if not stronger capabilities, our research has taken a two-fold path. First, we collect high-fidelity datasets in both controlled and uncontrolled environments which capture real sounds of objects and rooms. Second, we introduce differentiable physics-based models that can estimate acoustic properties of objects and rooms from minimal amounts of real audio data, then can predict new sounds from these objects and rooms under novel, “unseen” conditions.
-
- Date & Time: Tuesday, November 19, 2024; 1:30-2:10pm
Location: Virtual Event
Speaker: Prof. Na Li, Harvard University
Brief - MERL is excited to announce the featured keynote speaker for our Virtual Open House (VOH) 2024: Prof. Na Li from Harvard University.
Our VOH this year will take place on November 19th, 1:00pm - 4:30pm (EST). Prof. Li’s talk is scheduled for 1:30-2:10pm (EST). For details and agenda of the event, please visit: https://merl.com/events/voh24
Join us to learn more about who we are, what we do, and discuss our internship, post-doc, and full-time employment opportunities. To register, go to: https://mailchi.mp/merl/voh24
Title: Representation-based Learning and Control for Dynamical Systems
Abstract: The explosive growth of machine learning and data-driven methodologies has revolutionized numerous fields. Yet, the translation of these successes to the domain of dynamical physical systems remains a significant challenge. Closing the loop from data to actions in these systems faces many difficulties, stemming from the need for sample efficiency and computational feasibility, along with many other requirements such as verifiability, robustness, and safety. In this talk, we bridge this gap by introducing innovative representations to develop nonlinear stochastic control and reinforcement learning methods. Key in the representation is to represent the stochastic, nonlinear dynamics linearly onto a nonlinear feature space. We present a comprehensive framework to develop control and learning strategies which achieve efficiency, safety, robustness, and scalability with provable performance. We also show how the representation could be used to close the sim-to-real gap. Lastly, we will briefly present some concrete real-world applications, discussing how domain knowledge is applied in practice to further close the loop from data to actions.
-
- Date: Thursday, October 17, 2024
Location: Google, Cambridge, MA
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief - SANE 2024, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, was held on Thursday, October 17, 2024 at Google, in Cambridge, MA.
It was the 11th edition in the SANE series of workshops, which started in 2012 and is typically held every year alternately in Boston and New York. Since the first edition, the audience has steadily grown, with a new record of 200 participants and 53 posters in 2024.
SANE 2024 featured invited talks by leading researchers from the Northeast as well as from the international community, including Quan Wang (Google), Greta Tuckute (MIT), Mark Hamilton (MIT), Bhuvana Ramabhadran (Google), Zhiyao Duan (University of Rochester), and Chris Donahue (Carnegie Mellon University). It also featured a lively poster session with 53 posters.
SANE 2024 was co-organized by Jonathan Le Roux (MERL) and John R. Hershey (Google). SANE remained a free event thanks to generous sponsorship by Google and MERL.
Slides and videos of the talks are available from the SANE workshop website.
-
- Date & Time: Tuesday, November 19, 2024; 1:00 - 4:30pm EST
Location: Virtual Event
Brief - Join us for MERL's Virtual Open House (VOH) 2024 on November 19th. Live sessions will be held from 1:00-4:30pm EST, including an overview of recent activities by our research groups, a featured guest speaker and live interaction with our research staff through the Gather platform. Registered attendees will be able to browse our virtual booths at their convenience and connect with our research staff to learn about employment opportunities, including internship/post-doc openings as well as visiting faculty positions.
For agenda and details of the event, please visit: https://www.merl.com/events/voh24
To register for the VOH, please go to:
https://mailchi.mp/merl/voh24
-
- Date: Friday, July 26, 2024
Location: MERL Offices
Speaker: Dr. Na Li, Harvard University
MERL Contact: Elizabeth Phillips
Brief - On July 26th, MERL hosted its annual Women in Science luncheon. This event brings together MERL's female interns and employees to hear about the experiences and careers of women working in research and engineering fields. This year, we were honored to have Dr. Na Li as our guest speaker. Dr. Li, who joined MERL as a visiting faculty member this summer, is a Winokur Family Professor of Electrical Engineering and Applied Mathematics at Harvard University. Dr. Li's talk highlighted her personal journey from rural China to Harvard Professor. Her insights and experiences provided invaluable inspiration to everyone in attendance. We appreciate both Dr. Li and all who participated in making this event a success. MERL remains committed to fostering an inclusive environment that supports the growth and development of women in STEM.
#WomenInScience #WomenInSTEM #MERL #Inspiration #Engineering #Research
-
- Date: July 10, 2024 - July 12, 2024
Where: Toronto, Canada
MERL Contacts: Ankush Chakrabarty; Vedang M. Deshpande; Stefano Di Cairano; Christopher R. Laughman; Arvind Raghunathan; Abraham P. Vinod; Yebin Wang; Avishai Weiss
Research Areas: Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
Brief - MERL researchers presented 9 papers at the recently concluded American Control Conference (ACC) 2024 in Toronto, Canada. The papers covered a wide range of topics including data-driven spatial monitoring using heterogeneous robots, aircraft approach management near airports, computational fluid dynamics-based motion planning for drones facing winds, trajectory planning for coordinated monitoring using a team of drones and a ground carrier vehicle, ensemble Kalman smoothing-based model predictive control for motion planning for autonomous vehicles, system identification for lithium-ion batteries, physics-constrained deep Kalman filters for vapor compression systems, switched reference governors for constrained systems, and distributed road-map monitoring using onboard sensors.
As a sponsor of the conference, MERL maintained a booth for open discussions with researchers and students, and hosted a special session to discuss highlights of MERL research and work philosophy.
In addition, Abraham Vinod served as a panelist at the Student Networking Event at the conference. The student networking event provides an opportunity for all interested students to network with professionals working in industry, academia, and national laboratories during a structured event, and encourages their continued participation as the future leaders in the field.
-
- Date: May 22, 2024
MERL Contact: Toshiaki Koike-Akino
Research Areas: Artificial Intelligence, Machine Learning
Brief - Toshiaki Koike-Akino is invited to present a seminar talk at EPFL, Switzerland. The talk, entitled "Post-Deep Learning: Emerging Quantum AI Technology", will discuss the recent trends, challenges, and applications of quantum machine learning (QML) technologies. The seminar is organized by Prof. Volkan Cevher and Prof. Giovanni De Micheli. The event invites students, researchers, scholars and professors across EPFL departments, including the School of Engineering, Communication Science, Life Science, and the Machine Learning and AI Center.
-
- Date: Sunday, April 14, 2024 - Friday, April 19, 2024
Location: Seoul, South Korea
MERL Contacts: Petros T. Boufounos; François Germain; Chiori Hori; Toshiaki Koike-Akino; Jonathan Le Roux; Hassan Mansour; Kieran Parsons; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Robotics, Signal Processing, Speech & Audio
Brief - MERL has made numerous contributions to both the organization and technical program of ICASSP 2024, which is being held in Seoul, Korea from April 14-19, 2024.
Sponsorship and Awards
MERL is proud to be a Bronze Patron of the conference and will participate in the student job fair on Thursday, April 18. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.
MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Stéphane G. Mallat, the recipient of the 2024 IEEE Fourier Award for Signal Processing, and Prof. Keiichi Tokuda, the recipient of the 2024 IEEE James L. Flanagan Speech and Audio Processing Award.
Jonathan Le Roux, MERL Speech and Audio Senior Team Leader, will also be recognized during the Awards Ceremony for his recent elevation to IEEE Fellow.
Technical Program
MERL will present 13 papers in the main conference on a wide range of topics including automated audio captioning, speech separation, audio generative models, speech and sound synthesis, spatial audio reproduction, multimodal indoor monitoring, radar imaging, depth estimation, physics-informed machine learning, and integrated sensing and communications (ISAC). Three workshop papers have also been accepted for presentation on audio-visual speaker diarization, music source separation, and music generative models.
Perry Wang is the co-organizer of the Workshop on Signal Processing and Machine Learning Advances in Automotive Radars (SPLAR), held on Sunday, April 14. It features keynote talks from leaders in both academia and industry, peer-reviewed workshop papers, and lightning talks from ICASSP regular tracks on signal processing and machine learning for automotive radar and, more generally, radar perception.
Gordon Wichern will present an invited keynote talk on analyzing and interpreting audio deep learning models at the Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA), held on Monday, April 15. He will also appear in a panel discussion on interpretable audio AI at the workshop.
Perry Wang also co-organizes a two-part special session on Next-Generation Wi-Fi Sensing (SS-L9 and SS-L13) which will be held on Thursday afternoon, April 18. The special session includes papers on PHY-layer oriented signal processing and data-driven deep learning advances, and supports upcoming 802.11bf WLAN Sensing Standardization activities.
Petros Boufounos is participating as a mentor in ICASSP’s Micro-Mentoring Experience Program (MiME).
About ICASSP
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 3000 participants.
-
- Date & Time: Wednesday, November 15, 2023; 3:00-3:40pm (EST)
Location: Virtual Event
Speaker: Prof. Yuejie Chi, Carnegie Mellon University
MERL Contact: Bingnan Wang
Brief - MERL is excited to announce the featured keynote speaker for our Virtual Open House 2023: Prof. Yuejie Chi from Carnegie Mellon University.
Our virtual open house this year will take place on November 15, 2023, 1:00pm - 5:30pm (EST). Prof. Chi’s talk is scheduled for 3:00-3:40pm (EST). For details and agenda of the event, please visit: https://merl.com/events/voh23
Join us to learn more about who we are, what we do, and discuss our internship, post-doc, and full-time employment opportunities. To register, go to: https://mailchi.mp/merl/voh23
Title: Sample Complexity of Q-learning: from Single-agent to Federated Learning
Abstract: Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning practices. However, theoretical understandings on its non-asymptotic sample complexity remain unsatisfactory, despite significant recent efforts. In this talk, we first show a tight sample complexity bound of Q-learning in the single-agent setting, together with a matching lower bound to establish its minimax sub-optimality. We then show how federated versions of Q-learning allow collaborative learning using data collected by multiple agents without central sharing, where an importance averaging scheme is introduced to unveil the blessing of heterogeneity.
-
- Date: Thursday, October 26, 2023
Location: New York University, Brooklyn, New York, NY
MERL Contacts: Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief - SANE 2023, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, was held on Thursday, October 26, 2023 at NYU in Brooklyn, New York.
It was the 10th edition in the SANE series of workshops, which started in 2012 and is typically held every year alternately in Boston and New York. Since the first edition, the audience has steadily grown, and SANE 2023 broke SANE 2019's record with 200 participants and 51 posters.
This year's SANE took place in conjunction with the WASPAA workshop, held October 22-25 in upstate New York.
SANE 2023 featured invited talks by seven leading researchers from the Northeast and beyond: Arsha Nagrani (Google), Gaël Richard (Télécom Paris), Gordon Wichern (MERL), Kyunghyun Cho (NYU / Prescient Design), Anna Huang (Google DeepMind / MILA), Wenwu Wang (University of Surrey), and Yuan Gong (MIT). It also featured a lively poster session with 51 posters.
SANE 2023 was co-organized by Jonathan Le Roux (MERL), Juan P. Bello (NYU), and John R. Hershey (Google). SANE remained a free event thanks to generous sponsorship by NYU, MERL, Google, Adobe, Bose, Meta Reality Labs, and Amazon.
Slides and videos of the talks are available from the SANE workshop website.
-
- Date: January 23, 2023 - November 4, 2023
Where: International Society for Music Information Retrieval Conference (ISMIR)
MERL Contacts: Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief - MERL Speech & Audio team members Gordon Wichern and Jonathan Le Roux co-organized the 2023 Sound Demixing Challenge along with researchers from Sony, Moises AI, Audioshake, and Meta.
The SDX2023 Challenge was hosted on the AIcrowd platform and had a prize pool of $42,000 distributed to the winning teams across two tracks: Music Demixing and Cinematic Sound Demixing. A unique aspect of this challenge was the ability to test the audio source separation models developed by challenge participants on non-public songs from Sony Music Entertainment Japan for the music demixing track, and movie soundtracks from Sony Pictures for the cinematic sound demixing track. The challenge ran from January 23rd to May 1st, 2023, and had 884 participants distributed across 68 teams submitting 2828 source separation models. The winners will be announced at the SDX2023 Workshop, which will take place as a satellite event at the International Society for Music Information Retrieval Conference (ISMIR) in Milan, Italy on November 4, 2023.
MERL’s contribution to SDX2023 focused mainly on the cinematic demixing track. In addition to sponsoring the prizes awarded to the winning teams for that track, MERL provided the baseline system and initial training data: its Cocktail Fork separation model and Divide and Remaster dataset, respectively. MERL researchers also contributed to a Town Hall kicking off the challenge, co-authored a scientific paper describing the challenge outcomes, and co-organized the SDX2023 Workshop.
-
- Date & Time: Wednesday, November 15, 2023; 1:00 - 5:30pm EST
Location: Virtual Event
MERL Contact: Bingnan Wang
Brief - Join us for MERL's Virtual Open House (VOH) 2023 on November 15th. Live sessions will be held from 1:00-5:30pm EST, including an overview of recent activities by our research groups, a featured guest speaker and live interaction with our research staff through the Gather platform. Registered attendees will be able to browse our virtual booths at their convenience and connect with our research staff to learn about engagement opportunities, including internship/post-doc openings as well as visiting faculty positions.
For agenda and details of the event: https://www.merl.com/events/voh23
To register for the VOH, go to:
https://mailchi.mp/merl/voh23
-
- Date: August 30, 2023
Awarded to: Bingnan Wang, Hiroshi Inoue, and Makoto Kanemaru
MERL Contact: Bingnan Wang
Research Areas: Applied Physics, Data Analytics, Multi-Physical Modeling
Brief - MERL and Mitsubishi Electric's paper titled “Motor Eccentricity Fault Detection: Physics-Based and Data-Driven Approaches” was awarded one of three best paper awards at the 14th IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED 2023). MERL Senior Principal Research Scientist Bingnan Wang presented the paper and received the award at the symposium. Co-authors of the paper include Mitsubishi Electric researchers Hiroshi Inoue and Makoto Kanemaru.
SDEMPED was established as the only international symposium entirely devoted to the diagnostics of electrical machines, power electronics and drives. It is now a regular biennial event. The 14th edition, SDEMPED 2023, was held in Chania, Greece from August 28th to 31st, 2023.
-
- Date: Friday, August 4, 2023
Location: MERL's Offices, 201 Broadway, Cambridge, MA
Speaker: Carole-Jean Wu, PhD, Meta AI / FAIR
MERL Contacts: Elizabeth Phillips; Anthony Vetro
Brief - MERL hosted its annual Women in Science luncheon. Carole-Jean Wu, PhD, joined our event to lead a talk on Scaling AI Computing Sustainably. She shared key challenges across the many dimensions of AI and described how at-scale optimization can help reduce the overall carbon footprint of AI and computing. Dr. Wu is a Research Scientist and Technical Lead Manager at Meta AI / FAIR. Prior to Meta/Facebook, she was an Associate Professor at ASU.
As part of this celebration, MERL will be making a donation to Science Club for Girls in Cambridge, MA.
Science Club for Girls' mission is to foster excitement, confidence, and literacy in science, technology, engineering, and mathematics (STEM) for girls and gender-expansive youth from underrepresented communities by providing free, experiential programs and by maximizing meaningful interactions with women-in-STEM mentors.
https://www.scienceclubforgirls.org
-
- Date: June 1, 2023
Awarded to: Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, Shinji Watanabe
MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief - A joint team consisting of members of CMU Professor and MERL alumnus Shinji Watanabe's WavLab and members of MERL's Speech & Audio team ranked 1st out of 11 teams in the DCASE2023 Challenge's Task 6A "Automated Audio Captioning". The team was led by student Shih-Lun Wu and also featured Ph.D. candidate Xuankai Chang, postdoctoral research associate Jee-weon Jung, Prof. Shinji Watanabe, and MERL researchers Gordon Wichern, François Germain, and Jonathan Le Roux.
The IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), started in 2013, has been organized yearly since 2016, and gathers challenges on multiple tasks related to the detection, analysis, and generation of sound events. This year, the DCASE2023 Challenge received 428 submissions from 123 teams across seven tasks.
The CMU-MERL team competed in the Task 6A track, Automated Audio Captioning, which aims at generating informative descriptions for various sounds from nature and/or human activities. The team's system made strong use of large pretrained models, namely a BEATs transformer as part of the audio encoder stack, an Instructor Transformer encoding ground-truth captions to derive an audio-text contrastive loss on the audio encoder, and ChatGPT to produce caption mix-ups (i.e., grammatical and compact combinations of two captions) which, together with the corresponding audio mixtures, increase not only the amount but also the complexity and diversity of the training data. The team's best submission obtained a SPIDEr-FL score of 0.327 on the hidden test set, largely outperforming the 2nd best team's 0.315.
-
- Date: June 1, 2023
Where: San Diego, CA
MERL Contact: Abraham P. Vinod
Research Areas: Control, Optimization
Brief - The student networking event provides an opportunity for all interested students attending the American Control Conference 2023 to receive career advice from professionals working in industry, academia, and national laboratories during a structured event. The event aims to provide an engaging experience to students that illustrates the benefits of involvement in the control community and to encourage their continued participation as the future leaders in the field.
-
- Date: Sunday, June 4, 2023 - Saturday, June 10, 2023
Location: Rhodes Island, Greece
MERL Contacts: Petros T. Boufounos; François Germain; Toshiaki Koike-Akino; Jonathan Le Roux; Dehong Liu; Suhas Lohit; Yanting Ma; Hassan Mansour; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern
Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Signal Processing, Speech & Audio
Brief - MERL has made numerous contributions to both the organization and technical program of ICASSP 2023, which is being held in Rhodes Island, Greece from June 4-10, 2023.
Organization
Petros Boufounos is serving as General Co-Chair of the conference this year, where he has been involved in all aspects of conference planning and execution.
Perry Wang is the organizer of a special session on Radar-Assisted Perception (RAP), which will be held on Wednesday, June 7. The session will feature talks on signal processing and deep learning for radar perception, pose estimation, and mutual interference mitigation with speakers from both academia (Carnegie Mellon University, Virginia Tech, University of Illinois Urbana-Champaign) and industry (Mitsubishi Electric, Bosch, Waveye).
Anthony Vetro is the co-organizer of the Workshop on Signal Processing for Autonomous Systems (SPAS), which will be held on Monday, June 5, and feature invited talks from leaders in both academia and industry on timely topics related to autonomous systems.
Sponsorship
MERL is proud to be a Silver Patron of the conference and will participate in the student job fair on Thursday, June 8. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.
MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Rabab Ward, the recipient of the 2023 IEEE Fourier Award for Signal Processing, and Prof. Alexander Waibel, the recipient of the 2023 IEEE James L. Flanagan Speech and Audio Processing Award.
Technical Program
MERL is presenting 13 papers in the main conference on a wide range of topics including source separation and speech enhancement, radar imaging, depth estimation, motor fault detection, time series recovery, and point clouds. One workshop paper has also been accepted for presentation on self-supervised music source separation.
Perry Wang has been invited to give a keynote talk on Wi-Fi sensing and related standards activities at the Workshop on Integrated Sensing and Communications (ISAC), which will be held on Sunday, June 4.
Additionally, Anthony Vetro will present a Perspective Talk on Physics-Grounded Machine Learning, which is scheduled for Thursday, June 8.
About ICASSP
ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 2000 participants each year.
-
- Date & Time: Tuesday, April 25, 2023; 11:00 AM
Speaker: Dan Stowell, Tilburg University / Naturalis Biodiversity Centre
MERL Host: Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Abstract
Machine learning can be used to identify animals from their sound. This could be a valuable tool for biodiversity monitoring, and for understanding animal behaviour and communication. But to get there, we need very high accuracy at fine-grained acoustic distinctions across hundreds of categories in diverse conditions. In our group we are studying how to achieve this at continental scale. I will describe aspects of bioacoustic data that challenge even the latest deep learning workflows, and our work to address this. Methods covered include adaptive feature representations, deep embeddings and few-shot learning.
-
- Date: December 8, 2022
MERL Contacts: Toshiaki Koike-Akino; Pu (Perry) Wang
Research Areas: Artificial Intelligence, Communications, Computational Sensing, Machine Learning, Signal Processing
Brief - On December 8, 2022, MERL researchers Toshiaki Koike-Akino and Pu (Perry) Wang gave a 3.5-hour tutorial presentation at the IEEE Global Communications Conference (GLOBECOM). The talk, titled "Post-Deep Learning Era: Emerging Quantum Machine Learning for Sensing and Communications," addressed recent trends, challenges, and advances in sensing and communications. P. Wang presented on use cases, industry trends, signal processing, and deep learning for Wi-Fi integrated sensing and communications (ISAC), while T. Koike-Akino discussed the future of deep learning, giving a comprehensive overview of artificial intelligence (AI) technologies, natural computing, emerging quantum AI, and their diverse applications. The tutorial was conducted remotely. MERL's quantum AI technology was partly reported in the recent press release (https://us.mitsubishielectric.com/en/news/releases/global/2022/1202-a/index.html).
IEEE GLOBECOM is a highly anticipated event for researchers and industry professionals in the field of communications. Organized by the IEEE Communications Society, the flagship conference is known for its focus on driving innovation in all aspects of the field. Each year, over 3,000 scientific researchers submit proposals for program sessions at the annual conference. The theme of this year's conference was "Accelerating the Digital Transformation through Smart Communications," and the conference featured a comprehensive technical program with 13 symposia and various tutorials and workshops.
-
- Date: November 11, 2022
MERL Contact: Avishai Weiss
Research Areas: Control, Dynamical Systems, Optimization
Brief - Avishai Weiss will give an invited talk at the William Maxwell Reed Seminar Series, Mechanical and Aerospace Engineering Department, University of Kentucky on "Fail-Safe Spacecraft Rendezvous." The talk will present some recent developments at MERL on guaranteeing safe rendezvous trajectories that avoid colliding with the target in the event of thruster anomalies.
-
- Date: October 26, 2022 - October 28, 2022
Where: American Modelica Conference 2022
MERL Contacts: Scott A. Bortoff; Christopher R. Laughman
Research Area: Multi-Physical Modeling
Brief - MERL researchers made key contributions to the 2022 American Modelica Conference, held October 26-28 at the University of Texas at Dallas. Chris Laughman, Senior Team Leader, Multiphysical Systems, was the Executive Coordinator of the conference, and worked to plan and stage the event. Scott A. Bortoff, Chief Scientist, gave a keynote address entitled "Sustainable HVAC: Research Opportunities for Modelicans." The talk posed the question: What are the modeling and control research challenges that, if addressed, will drive meaningful innovation in sustainable building HVAC systems in the next 20 years? In addition, the paper "Performance Enhancements for Zero-Flow Simulation of Vapor Compression Cycles," by Principal Research Scientist Hongtao Qiao and Chris Laughman, was a finalist for the conference Best Paper Award.
-