TR2026-033

Velocity Potential Neural Field for Efficient Ambisonics Impulse Response Modeling


    •  Masuyama, Y., Germain, F.G., Wichern, G., Hori, C., Le Roux, J., "Velocity Potential Neural Field for Efficient Ambisonics Impulse Response Modeling", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2026.
      BibTeX TR2026-033 PDF
      • @inproceedings{Masuyama2026may,
      • author = {Masuyama, Yoshiki and Germain, François G and Wichern, Gordon and Hori, Chiori and {Le Roux}, Jonathan},
      • title = {{Velocity Potential Neural Field for Efficient Ambisonics Impulse Response Modeling}},
      • booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
      • year = 2026,
      • month = may,
      • url = {https://www.merl.com/publications/TR2026-033}
      • }
  • MERL Contacts:
  • Research Areas:

    Artificial Intelligence, Machine Learning, Speech & Audio

Abstract:

First-order Ambisonics (FOA) is a standard spatial audio format based on spherical harmonic decomposition. Its zeroth- and first- order components capture the sound pressure and particle velocity, respectively. Recently, physics-informed neural networks have been applied to the spatial interpolation of FOA signals, regularizing the network outputs based on soft penalty terms derived from physical principles, e.g., the linearized momentum equation. In this paper, we reformulate the task so that the predicted FOA signal automatically satisfies the linearized momentum equation. Our network approximates a scalar function called velocity potential, rather than the FOA signal itself. Then, the FOA signal can be readily recovered through the partial derivatives of the velocity potential with respect to the network inputs (i.e., time and microphone position) according to physics of sound propagation. By deriving the four channels of FOA from the single-channel velocity potential, the reconstructed signal follows the physical principle at any time and position by construction. Experimental results on room impulse response reconstruction confirm the effectiveness of the proposed framework.