Julius Richter

Position:
Research / Technical Staff
Visiting Research Scientist
Education:
Ph.D. in Computer Science, University of Hamburg, Germany, 2025
Biography

Julius's research interests include generative models and multimodal learning for audio-visual understanding and restoration. During his Ph.D., he developed novel diffusion-based generative approaches for single-channel speech enhancement. Prior to joining MERL, he was a postdoctoral researcher at Meta Superintelligence Labs.
Awards
- MERL Team Wins DCASE 2026 Challenge on Anomalous Sound Detection for Machine Condition Monitoring
  Date: June 30, 2026
  Awarded to: Takuya Fujimura, Gordon Wichern, Yoshiki Masuyama, Christoph Boeddeker, Kohei Saijo, Julius Richter, Takahiro Edo, and Jonathan Le Roux
  MERL Contacts: Christoph Boeddeker; Takahiro Edo; Jonathan Le Roux; Yoshiki Masuyama; Julius Richter; Gordon Wichern
  Research Areas: Artificial Intelligence, Machine Learning, Signal Processing, Speech & Audio
  Brief
  - MERL's Speech & Audio team ranked 1st out of 51 teams in the DCASE 2026 Challenge’s Task 2, “Noise-aware Unsupervised Anomalous Sound Detection for Machine Condition Monitoring.” The team was led by MERL intern Takuya Fujimura, and also included Gordon Wichern, Yoshiki Masuyama, Christoph Boeddeker, Kohei Saijo, Julius Richter, Takahiro Edo, and Jonathan Le Roux.
    
    The IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), started in 2013, has been organized yearly since 2016, and gathers challenges on multiple tasks related to the detection, analysis, and generation of sound events. This year, the DCASE 2026 Challenge received 421 submissions from 135 teams across seven tasks.
    
    The MERL team won Task 2, Noise-aware Unsupervised Anomalous Sound Detection for Machine Condition Monitoring, which aims at building noise-robust systems that automatically detect machine failure via microphones when only normal machine operating data is available for system development. Task 2 was by far the most popular of the seven DCASE 2026 tasks, with 51 teams submitting 168 entries. The MERL team's system was built around MERL's recently proposed paradigm of noise-aware self-supervised learning, which extracts noise-robust features from two-channel recordings in which one microphone is used to capture noise. Anomaly detection is then performed in the extracted denoised feature space using advanced score normalization. The team's best submission obtained a composite score of 70.24% on five evaluation machines, well ahead of the second-best team's 65.45%.
    
    MERL also participated in Task 4, Spatial Semantic Segmentation of Sound Scenes (S5), and placed 3rd out of 10 teams in separation performance. The team's cascaded system consists of universal sound separation with source counting, source classification, and class-aware refinement, where the separation and refinement modules are built upon MERL's TF-Locoformer separation technology. Notably, the team's best submission obtained a label prediction accuracy of 76.92% on the evaluation set, well ahead of the second-best team's 65.54%.
MERL Publications
- Fujimura, T., Masuyama, Y., Wichern, G., Boeddeker, C., Richter, J., Le Roux, J., "NABEATs: Noise-Aware Audio Representation Learning", arXiv, July 2026.
  BibTeX arXiv
  @article{Fujimura2026jul,
    author = {Fujimura, Takuya and Masuyama, Yoshiki and Wichern, Gordon and Boeddeker, Christoph and Richter, Julius and {Le Roux}, Jonathan},
    title = {{NABEATs: Noise-Aware Audio Representation Learning}},
    journal = {arXiv},
    year = 2026,
    month = jul,
    url = {https://arxiv.org/abs/2607.16688}
  }
- Klement, D., Masuyama, Y., Boeddeker, C., Saijo, K., Richter, J., Wichern, G., Le Roux, J., "Technical Report for MERL’s Real-TSE Challenge Submission," Tech. Rep. TR2026-112, Mitsubishi Electric Research Laboratories, July 2026.
  BibTeX TR2026-112 PDF
  @techreport{Klement2026jul2,
    author = {Klement, Dominik and Masuyama, Yoshiki and Boeddeker, Christoph and Saijo, Kohei and Richter, Julius and Wichern, Gordon and {Le Roux}, Jonathan},
    title = {{Technical Report for MERL’s Real-TSE Challenge Submission}},
    institution = {Real-TSE Challenge},
    year = 2026,
    month = jul,
    url = {https://www.merl.com/publications/TR2026-112}
  }
- Fujimura, T., Wichern, G., Masuyama, Y., Boeddeker, C., Saijo, K., Richter, J., Edo, T., Le Roux, J., "The MERL Systems for DCASE 2026 Challenge Task 2," Tech. Rep. TR2026-100, IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), June 2026.
  BibTeX TR2026-100 PDF
  @techreport{Fujimura2026jun,
    author = {Fujimura, Takuya and Wichern, Gordon and Masuyama, Yoshiki and Boeddeker, Christoph and Saijo, Kohei and Richter, Julius and Edo, Takahiro and {Le Roux}, Jonathan},
    title = {{The MERL Systems for DCASE 2026 Challenge Task 2}},
    institution = {IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge)},
    year = 2026,
    month = jun,
    url = {https://www.merl.com/publications/TR2026-100}
  }
- Saijo, K., Masuyama, Y., Boeddeker, C., Wichern, G., Richter, J., Edo, T., Le Roux, J., "The MERL Systems for DCASE 2026 Challenge Task 4," Tech. Rep. TR2026-098, IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), June 2026.
  BibTeX TR2026-098 PDF
  @techreport{Saijo2026jun,
    author = {Saijo, Kohei and Masuyama, Yoshiki and Boeddeker, Christoph and Wichern, Gordon and Richter, Julius and Edo, Takahiro and {Le Roux}, Jonathan},
    title = {{The MERL Systems for DCASE 2026 Challenge Task 4}},
    institution = {IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge)},
    year = 2026,
    month = jun,
    url = {https://www.merl.com/publications/TR2026-098}
  }
- Richter, J., Masuyama, Y., Boeddeker, C., Edo, T., Wichern, G., Le Roux, J., "Predictive-Generative Drift Decomposition for Speech Enhancement and Separation", arXiv, May 2026.
  BibTeX arXiv
  @article{Richter2026may,
    author = {Richter, Julius and Masuyama, Yoshiki and Boeddeker, Christoph and Edo, Takahiro and Wichern, Gordon and {Le Roux}, Jonathan},
    title = {{Predictive-Generative Drift Decomposition for Speech Enhancement and Separation}},
    journal = {arXiv},
    year = 2026,
    month = may,
    url = {https://arxiv.org/abs/2605.06189}
  }
Other Publications
- Julius Richter, Danilo de Oliveira and Timo Gerkmann, "Do We Need EMA for Diffusion-Based Speech Enhancement? Toward a Magnitude-Preserving Network Architecture", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026.
  BibTeX
  @Inproceedings{Richter2026ICASSPEDM2SE,
    author = {Richter, Julius and de Oliveira, Danilo and Gerkmann, Timo},
    title = {Do We Need EMA for Diffusion-Based Speech Enhancement? Toward a Magnitude-Preserving Network Architecture},
    booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2026
  }
- Jean-Marie Lemercier, Julius Richter, Simon Welker, Eloi Moliner, Vesa Välimäki and Timo Gerkmann, "Diffusion Models for Audio Restoration", IEEE Signal Processing Magazine, Vol. 41, No. 6, pp. 72-84, 2025.
  BibTeX
  @Article{Lemercier2025SPMDiffusion,
    author = {Lemercier, Jean-Marie and Richter, Julius and Welker, Simon and Moliner, Eloi and V{\"a}lim{\"a}ki, Vesa and Gerkmann, Timo},
    title = {Diffusion Models for Audio Restoration},
    journal = {IEEE Signal Processing Magazine},
    year = 2025,
    volume = 41,
    number = 6,
    pages = {72--84}
  }
- Julius Richter, Danilo de Oliveira and Timo Gerkmann, "Investigating Training Objectives for Generative Speech Enhancement", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025.
  BibTeX
  @Inproceedings{Richter2025ICASSPObjectives,
    author = {Richter, Julius and de Oliveira, Danilo and Gerkmann, Timo},
    title = {Investigating Training Objectives for Generative Speech Enhancement},
    booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2025
  }
- Julius Richter, Till Svajda and Timo Gerkmann, "ReverbFX: A Dataset of Room Impulse Responses Derived from Reverb Effect Plugins for Singing Voice Dereverberation", Proceedings of the ITG Conference on Speech Communication, 2025.
  BibTeX
  @Inproceedings{Richter2025ITGReverbFX,
    author = {Richter, Julius and Svajda, Till and Gerkmann, Timo},
    title = {{ReverbFX}: A Dataset of Room Impulse Responses Derived from Reverb Effect Plugins for Singing Voice Dereverberation},
    booktitle = {Proceedings of the ITG Conference on Speech Communication},
    year = 2025
  }
- Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Simon Welker and Timo Gerkmann, "Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech", Proceedings of Interspeech, 2025.
  BibTeX
  @Inproceedings{deOliveira2025InterspeechLikelihood,
    author = {de Oliveira, Danilo and Richter, Julius and Lemercier, Jean-Marie and Welker, Simon and Gerkmann, Timo},
    title = {Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech},
    booktitle = {Proceedings of Interspeech},
    year = 2025
  }
- Bunlong Lay, Jean-Marie Lemercier, Julius Richter and Timo Gerkmann, "Single and Few-step Diffusion for Generative Speech Enhancement", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024.
  BibTeX
  @Inproceedings{Lay2024ICASSPFewStep,
    author = {Lay, Bunlong and Lemercier, Jean-Marie and Richter, Julius and Gerkmann, Timo},
    title = {Single and Few-step Diffusion for Generative Speech Enhancement},
    booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2024
  }
- Julius Richter, Yi-Chiao Wu, Steven Krenn, Simon Welker, Bunlong Lay, Shinji Watanabe, Alexander Richard and Timo Gerkmann, "EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation", Proceedings of Interspeech, 2024.
  BibTeX
  @Inproceedings{Richter2024InterspeechEARS,
    author = {Richter, Julius and Wu, Yi-Chiao and Krenn, Steven and Welker, Simon and Lay, Bunlong and Watanabe, Shinji and Richard, Alexander and Gerkmann, Timo},
    title = {{EARS}: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation},
    booktitle = {Proceedings of Interspeech},
    year = 2024
  }
- Julius Richter and Timo Gerkmann, "Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization", Audio Imagination Workshop at NeurIPS, 2024.
  BibTeX
  @Inproceedings{Richter2024NeurIPSAudioImagination,
    author = {Richter, Julius and Gerkmann, Timo},
    title = {Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization},
    booktitle = {Audio Imagination Workshop at NeurIPS},
    year = 2024
  }
- Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer and Timo Gerkmann, "Causal Diffusion Models for Generalized Speech Enhancement", IEEE Open Journal of Signal Processing, Vol. 5, pp. 780-789, 2024.
  BibTeX
  @Article{Richter2024OJSPCausal,
    author = {Richter, Julius and Welker, Simon and Lemercier, Jean-Marie and Lay, Bunlong and Peer, Tal and Gerkmann, Timo},
    title = {Causal Diffusion Models for Generalized Speech Enhancement},
    journal = {IEEE Open Journal of Signal Processing},
    year = 2024,
    volume = 5,
    pages = {780--789}
  }
- Bunlong Lay, Simon Welker, Julius Richter and Timo Gerkmann, "Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement", Proceedings of Interspeech, 2023.
  BibTeX
  @Inproceedings{Lay2023InterspeechPriorMismatch,
    author = {Lay, Bunlong and Welker, Simon and Richter, Julius and Gerkmann, Timo},
    title = {Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement},
    booktitle = {Proceedings of Interspeech},
    year = 2023
  }
- Jean-Marie Lemercier, Julius Richter, Simon Welker and Timo Gerkmann, "StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 31, pp. 2724-2737, 2023.
  BibTeX
  @Article{Lemercier2023TASLPStoRM,
    author = {Lemercier, Jean-Marie and Richter, Julius and Welker, Simon and Gerkmann, Timo},
    title = {{StoRM}: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation},
    journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    year = 2023,
    volume = 31,
    pages = {2724--2737}
  }
- Hector Martel, Julius Richter, Kai Li, Xiaolin Hu and Timo Gerkmann, "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model", Proceedings of Interspeech, 2023.
  BibTeX
  @Inproceedings{Martel2023InterspeechAVSep,
    author = {Martel, Hector and Richter, Julius and Li, Kai and Hu, Xiaolin and Gerkmann, Timo},
    title = {Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model},
    booktitle = {Proceedings of Interspeech},
    year = 2023
  }
- Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer and Timo Gerkmann, "Speech Signal Improvement Using Causal Generative Diffusion Models", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023.
  BibTeX
  @Inproceedings{Richter2023ICASSPSpeechImprovement,
    author = {Richter, Julius and Welker, Simon and Lemercier, Jean-Marie and Lay, Bunlong and Peer, Tal and Gerkmann, Timo},
    title = {Speech Signal Improvement Using Causal Generative Diffusion Models},
    booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2023
  }
- Julius Richter, Simone Frintrop and Timo Gerkmann, "Audio-Visual Speech Enhancement with Score-Based Generative Models", Proceedings of the ITG Conference on Speech Communication, 2023.
  BibTeX
  @Inproceedings{Richter2023ITGAVScore,
    author = {Richter, Julius and Frintrop, Simone and Gerkmann, Timo},
    title = {Audio-Visual Speech Enhancement with Score-Based Generative Models},
    booktitle = {Proceedings of the ITG Conference on Speech Communication},
    year = 2023
  }
- Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay and Timo Gerkmann, "Speech Enhancement and Dereverberation with Diffusion-Based Generative Models", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 31, pp. 2351-2364, 2023.
  BibTeX
  @Article{Richter2023TASLPDiffusion,
    author = {Richter, Julius and Welker, Simon and Lemercier, Jean-Marie and Lay, Bunlong and Gerkmann, Timo},
    title = {Speech Enhancement and Dereverberation with Diffusion-Based Generative Models},
    journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    year = 2023,
    volume = 31,
    pages = {2351--2364}
  }
- Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Tal Peer and Timo Gerkmann, "On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings", Proceedings of the ITG Conference on Speech Communication, 2023.
  BibTeX
  @Inproceedings{deOliveira2023ITGMetrics,
    author = {de Oliveira, Danilo and Richter, Julius and Lemercier, Jean-Marie and Peer, Tal and Gerkmann, Timo},
    title = {On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings},
    booktitle = {Proceedings of the ITG Conference on Speech Communication},
    year = 2023
  }
- Julius Richter, Jeanine Liebold and Timo Gerkmann, "Continuous Phoneme Recognition based on Audio-Visual Modality Fusion", Proceedings of the IEEE World Congress on Computational Intelligence, 2022.
  BibTeX
  @Inproceedings{Richter2022WCCIAVPhoneme,
    author = {Richter, Julius and Liebold, Jeanine and Gerkmann, Timo},
    title = {Continuous Phoneme Recognition based on Audio-Visual Modality Fusion},
    booktitle = {Proceedings of the IEEE World Congress on Computational Intelligence},
    year = 2022
  }
- Simon Welker, Julius Richter and Timo Gerkmann, "Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain", Proceedings of Interspeech, 2022.
  BibTeX
  @Inproceedings{Welker2022InterspeechComplex,
    author = {Welker, Simon and Richter, Julius and Gerkmann, Timo},
    title = {Speech Enhancement with Score-Based Generative Models in the Complex {STFT} Domain},
    booktitle = {Proceedings of Interspeech},
    year = 2022
  }
- Guillaume Carbajal, Julius Richter and Timo Gerkmann, "Guided Variational Autoencoder for Speech Enhancement with a Supervised Classifier", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021.
  BibTeX
  @Inproceedings{Carbajal2021ICASSPGuidedVAE,
    author = {Carbajal, Guillaume and Richter, Julius and Gerkmann, Timo},
    title = {Guided Variational Autoencoder for Speech Enhancement with a Supervised Classifier},
    booktitle = {Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
    year = 2021
  }
- Guillaume Carbajal, Julius Richter and Timo Gerkmann, "Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement", Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021.
  BibTeX
  @Inproceedings{Carbajal2021WASPAA,
    author = {Carbajal, Guillaume and Richter, Julius and Gerkmann, Timo},
    title = {Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement},
    booktitle = {Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
    year = 2021
  }
- Quan Nguyen, Julius Richter, Mikko Lauri, Timo Gerkmann and Simone Frintrop, "Improving Mix-and-Separate Training in Audio-Visual Sound Source Separation with an Object Prior", Proceedings of the International Conference on Pattern Recognition (ICPR), 2020.
  BibTeX
  @Inproceedings{Nguyen2020ICPRAVPrior,
    author = {Nguyen, Quan and Richter, Julius and Lauri, Mikko and Gerkmann, Timo and Frintrop, Simone},
    title = {Improving Mix-and-Separate Training in Audio-Visual Sound Source Separation with an Object Prior},
    booktitle = {Proceedings of the International Conference on Pattern Recognition (ICPR)},
    year = 2020
  }
- Julius Richter, Guillaume Carbajal and Timo Gerkmann, "Speech Enhancement with Stochastic Temporal Convolutional Networks", Proceedings of Interspeech, 2020.
  BibTeX
  @Inproceedings{Richter2020InterspeechTCN,
    author = {Richter, Julius and Carbajal, Guillaume and Gerkmann, Timo},
    title = {Speech Enhancement with Stochastic Temporal Convolutional Networks},
    booktitle = {Proceedings of Interspeech},
    year = 2020
  }