TR2015-030

MICbots: Collecting Large Realistic Datasets for Speech and Audio Research Using Mobile Robots

- Le Roux, J., Vincent, E., Hershey, J.R., Ellis, D.P.W., "Micbots: Collecting Large Realistic Datasets for Speech and Audio Research Using Mobile Robots", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), DOI: 10.1109/ICASSP.2015.7179050, April 2015, pp. 5635-5639.
  BibTeX:
    @inproceedings{LeRoux2015apr2,
      author = {{Le Roux}, J. and Vincent, E. and Hershey, J.R. and Ellis, D.P.W.},
      title = {{Micbots: Collecting Large Realistic Datasets for Speech and Audio Research Using Mobile Robots}},
      booktitle = {IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
      year = 2015,
      pages = {5635--5639},
      month = apr,
      publisher = {IEEE},
      doi = {10.1109/ICASSP.2015.7179050},
      url = {https://www.merl.com/publications/TR2015-030}
    }
MERL Contact:
- Jonathan Le Roux
Research Areas:

Artificial Intelligence, Speech & Audio, Robotics

Abstract:

Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. Large benchmark datasets for automatic speech recognition (ASR) have been instrumental in the advancement of speech recognition technologies. However, when it comes to robust ASR, source separation, and localization, especially using microphone arrays, the perfect dataset is out of reach, and many different data collection efforts have each made different compromises between the conflicting factors in terms of realism, ground truth, and costs. Our goal here is to escape some of the most difficult trade-offs by proposing MICbots, a low-cost method of collecting large amounts of realistic data where annotations and ground truth are readily available. Our key idea is to use freely moving robots equipped with microphones and loudspeakers, playing recorded utterances from existing (already annotated) speech datasets. We give an overview of previous data collection efforts and the trade-offs they make, and describe the benefits of using our robot-based approach. We finally explain the use of this method to collect room impulse response measurements.

Related News & Events

NEWS Multimedia Group researchers presented 8 papers at ICASSP 2015
Date: April 19, 2015 - April 24, 2015
Where: IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP)
MERL Contacts: Anthony Vetro; Hassan Mansour; Petros T. Boufounos; Jonathan Le Roux
Brief
- Multimedia Group researchers presented 8 papers at the IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP), held in Brisbane, Australia, from April 19 to 24, 2015.
