TR2012-072

Structured Discriminative Models for Speech Recognition


    •  Gales, M.; Watanabe, S.; Fosler-Lussier, E., "Structured Discriminative Models For Speech Recognition", IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 70-81, November 2012.
      @article{Gales2012nov,
        author = {Gales, M. and Watanabe, S. and Fosler-Lussier, E.},
        title = {Structured Discriminative Models For Speech Recognition},
        journal = {IEEE Signal Processing Magazine},
        year = 2012,
        volume = 29,
        number = 6,
        pages = {70--81},
        month = nov,
        url = {http://www.merl.com/publications/TR2012-072}
      }
    •  Research Areas: Multimedia, Speech & Audio


Automatic speech recognition (ASR) systems classify structured sequence data: the label sequences (sentences) must be inferred from the observation sequences (the acoustic waveform). The sequential nature of the task is one reason why generative classifiers, which combine hidden Markov model (HMM) acoustic models and N-gram language models using Bayes' rule, have become the dominant technology in ASR. In contrast, machine learning and natural language processing (NLP) research is increasingly dominated by discriminative approaches, in which the class posteriors are modelled directly. This paper describes recent work on structured discriminative models for ASR. To handle continuous, variable-length observation sequences, the approaches applied to NLP tasks must be modified. The paper discusses a variety of approaches for applying structured discriminative models to ASR, both from the current literature and as possible future directions. We concentrate on the structured models themselves, the descriptive features of observations commonly used within the models, and various options for optimizing the model parameters.
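
The contrast the abstract draws between generative and discriminative classifiers can be written out as a short sketch. The notation here is an assumption made for illustration and is not taken from the paper: O denotes the observation sequence, w a candidate sentence, \phi(O, w) a joint feature vector, and \alpha a parameter vector. The log-linear form is one common way to model the posterior directly; the paper may use other forms.

```latex
% Generative decoding: the acoustic model p(O | w) and the language model P(w)
% are combined with Bayes' rule; p(O) does not depend on w and drops out.
\begin{align}
  \hat{w} &= \arg\max_{w} P(w \mid O)
           = \arg\max_{w} \frac{p(O \mid w)\, P(w)}{p(O)}
           = \arg\max_{w} p(O \mid w)\, P(w)
\end{align}

% Discriminative decoding (assumed log-linear form): the posterior is modelled
% directly from features of the observation/label pair.
\begin{align}
  P(w \mid O; \alpha) &=
    \frac{\exp\!\big(\alpha^{\mathsf{T}} \phi(O, w)\big)}
         {\sum_{w'} \exp\!\big(\alpha^{\mathsf{T}} \phi(O, w')\big)},
  &
  \hat{w} &= \arg\max_{w} \alpha^{\mathsf{T}} \phi(O, w)
\end{align}
```

Under these assumptions, the three topics the abstract lists map onto the second form: the structure of the model determines how \phi(O, w) decomposes over the sentence, the features define \phi itself, and parameter optimization is the choice of criterion for estimating \alpha.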