TR2025-120

Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses

- Ick, C., Wichern, G., Masuyama, Y., Germain, F.G., Le Roux, J., "Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses", Interspeech, DOI: 10.21437/Interspeech.2025-1912, August 2025, pp. 933-937.
@inproceedings{Ick2025aug,
author = {Ick, Christopher and Wichern, Gordon and Masuyama, Yoshiki and Germain, François G and {Le Roux}, Jonathan},
title = {{Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses}},
booktitle = {Interspeech},
year = 2025,
pages = {933--937},
month = aug,
doi = {10.21437/Interspeech.2025-1912},
url = {https://www.merl.com/publications/TR2025-120}
}
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio

Abstract:

The characteristics of a sound field are intrinsically linked to the geometric and spatial properties of the environment surrounding a sound source and a listener. The physics of sound propagation is captured in a time-domain signal known as a room impulse response (RIR). Prior work using neural fields (NFs) has allowed learning spatially-continuous representations of RIRs from finite RIR measurements. However, previous NF-based methods have focused on monaural omnidirectional or at most binaural listeners, which does not precisely capture the directional characteristics of a real sound field at a single point. We propose a direction-aware neural field (DANF) that more explicitly incorporates the directional information by Ambisonic-format RIRs. While DANF inherently captures spatial relations between sources and listeners, we further propose a direction-aware loss. In addition, we investigate DANF’s ability to adapt to new rooms in various ways including low-rank adaptation.

Index Terms: spatial audio, neural acoustic field, room impulse response, Ambisonics
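
As a rough illustration of the directional information that an Ambisonic-format RIR carries and a monaural RIR does not, the sketch below builds a toy first-order Ambisonic RIR and recovers an arrival direction from it. It is not the DANF model or the paper's direction-aware loss. The channel convention (ACN order W, Y, Z, X with SN3D normalization), the pseudo-intensity estimate, and all function names and arrival values are assumptions chosen for this example.

```python
import math


def encode_foa(azimuth, elevation, gain=1.0):
    """Encode one plane-wave arrival into first-order Ambisonics.

    Assumed convention: ACN channel order (W, Y, Z, X), SN3D normalization.
    Angles are in radians.
    """
    cos_el = math.cos(elevation)
    w = gain
    y = gain * math.sin(azimuth) * cos_el
    z = gain * math.sin(elevation)
    x = gain * math.cos(azimuth) * cos_el
    return (w, y, z, x)


def toy_foa_rir(arrivals, length):
    """Build a toy 4-channel RIR from (sample, gain, azimuth, elevation) tuples."""
    rir = [[0.0] * length for _ in range(4)]
    for n, gain, az, el in arrivals:
        for ch, value in enumerate(encode_foa(az, el, gain)):
            rir[ch][n] += value
    return rir


def direction_at(rir, n):
    """Estimate the arrival direction at sample n.

    Uses the pseudo-intensity vector W * (X, Y, Z), which points toward
    the arrival direction under the convention assumed above.
    """
    w, y, z, x = (rir[ch][n] for ch in range(4))
    ix, iy, iz = w * x, w * y, w * z
    azimuth = math.atan2(iy, ix)
    elevation = math.atan2(iz, math.hypot(ix, iy))
    return azimuth, elevation


# Hypothetical arrivals: a direct path and a single reflection.
arrivals = [
    (10, 1.0, math.radians(30.0), 0.0),
    (25, 0.5, math.radians(-120.0), math.radians(20.0)),
]
rir = toy_foa_rir(arrivals, length=64)
az, el = direction_at(rir, 25)
print(round(math.degrees(az), 1), round(math.degrees(el), 1))
```

The omnidirectional W channel alone would only show that energy arrived at samples 10 and 25. The three additional channels are what make the direction of each arrival recoverable.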

Related News & Events

EVENT SANE 2025 - Speech and Audio in the Northeast
Date: Friday, November 7, 2025
Location: Google, New York, NY
MERL Contacts: Jonathan Le Roux; Yoshiki Masuyama
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- SANE 2025, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, was held on Friday, November 7, 2025 at Google, in New York, NY.
  
  It was the 12th edition in the SANE series of workshops, which started in 2012 and is typically held every year alternately in Boston and New York. Since the first edition, the audience has grown to about 200 participants and 50 posters each year, and SANE has established itself as a vibrant, must-attend event for the speech and audio community across the northeast and beyond.
  
  SANE 2025 featured invited talks by six leading researchers from the Northeast as well as from the wider community: Dan Ellis (Google DeepMind), Leibny Paola Garcia Perera (Johns Hopkins University), Yuki Mitsufuji (Sony AI), Julia Hirschberg (Columbia University), Yoshiki Masuyama (MERL), and Robin Scheibler (Google DeepMind). It also featured a lively poster session with 50 posters.
  
  MERL Speech and Audio Team's Yoshiki Masuyama presented a well-received overview of the team's recent work on "Neural Fields for Spatial Audio Modeling". His talk highlighted how neural fields are reshaping spatial audio research by enabling flexible, data-driven interpolation of head-related transfer functions and room impulse responses. He also discussed the integration of sound-propagation physics into neural field models through physics-informed neural networks, showcasing MERL’s advances at the intersection of acoustics and deep learning.
  
  SANE 2025 was co-organized by Jonathan Le Roux (MERL), Quan Wang (Google DeepMind), and John R. Hershey (Google DeepMind). SANE remained a free event thanks to generous sponsorship by Google, MERL, Apple, Bose, and Carnegie Mellon University.
  
  Slides and videos of the talks are available from the SANE workshop website and via a YouTube playlist.

Related Publication

Ick, C., Wichern, G., Masuyama, Y., Germain, F.G., Le Roux, J., "Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses", arXiv, May 2025.

@article{Ick2025may,
author = {Ick, Christopher and Wichern, Gordon and Masuyama, Yoshiki and Germain, François G and {Le Roux}, Jonathan},
title = {{Direction-Aware Neural Acoustic Fields for Few-Shot Interpolation of Ambisonic Impulse Responses}},
journal = {arXiv},
year = 2025,
month = may,
url = {https://arxiv.org/abs/2505.13617}
}

MERL Contacts: Gordon Wichern; Yoshiki Masuyama; Jonathan Le Roux