Bhiksha Raj

MERL Research / Technical Staff
Research Scientist
Ph.D., Carnegie Mellon University, 2000

Phone: (617) 621 7593
Email:



Dr. Bhiksha Raj joined MERL as a Staff Scientist. He completed his Ph.D. at Carnegie Mellon University (CMU) in May 2000. Dr. Raj works mainly on algorithmic aspects of speech recognition, with special emphasis on improving the robustness of speech recognition systems to environmental noise. His latest work uses statistical information about speech for the automatic design of filter-and-sum microphone arrays. Dr. Raj has over fifty conference and journal publications and is currently publishing a book on missing-feature methods for noise-robust speech recognition.

Recent Projects:

Acoustic Doppler for Denoising Speech Signals
Acoustic Doppler Sensors for Surveillance
Audio Separation
SpokenQuery

Recent Publications:

Smaragdis, P.; Raj, B.; Shashanka, M., "Sparse and Shift-Invariant Feature Extraction from Non-Negative Data", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISSN: 1520-6149, pp. 2069-2072, March 2008 (IEEE Xplore, TR2008-013)

Wilson, K.W.; Raj, B.; Smaragdis, P.; Divakaran, A., "Speech Denoising Using Nonnegative Matrix Factorization with Priors", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISSN: 1520-6149, pp. 4029-4032, March 2008 (IEEE Xplore, TR2008-012)

Kalgaonkar, K.; Raj, B., "Ultrasonic Doppler Sensor for Speaker Recognition", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISSN: 1520-6149, pp. 4865-4868, March 2008 (IEEE Xplore, TR2008-014)

Schmidt-Nielsen, B.; Harsham, B.; Raj, B.; Forlines, C., "Speech-Based UI Design for the Automobile", Handbook of Research on User Interface Design and Evaluation for Mobile Technology, ISBN: 978-1-59904-871-0, Vol. 1, Chapter XV, pp. 237-252, February 2008 (Information Science Reference, TR2008-006)

Vishvakarma, S.K.; Raj, B.; Singh, R.; Panda, C.R.; Saxena, A.K.; Dasgupta, S., "Analytical Modeling of Threshold Voltage for Nanoscale Symmetric Double Gate (SDG) MOSFET with Ultra Thin Body (UTB)", International Workshop on Physics of Semiconductor Devices (IWPSD), ISBN: 978-1-4244-1728-5, pp. 277-280, December 2007 (IEEE Xplore)

Nandhitha, N.M.; Manoharan, N.; Rani, B.S.; Venkataraman, B.; Sundaram, P.K.; Raj, B., "Detection and Quantification of Tungsten Inclusion in Weld Thermographs for On-line Weld Monitoring by Region Growing and Morphological Image Processing Algorithms", International Conference on Computational Intelligence and Multimedia Applications, ISBN: 0-7695-3050-8, Vol. 3, pp. 513-518, December 2007 (IEEE Xplore)

Arulmozhi, N.; Manoharan, N.; Shella Rani, B.; Venkatraman, B.; Raj, B., "Isolation of Defects in Radiographic Weld Images with Wavelet Denoising Using Log-Gabor Filter", International Conference on Computational Intelligence and Multimedia Applications, ISBN: 0-7695-3050-8, Vol. 3, pp. 395-399, December 2007 (IEEE Xplore)

Smaragdis, P.; Raj, B., "Example-Driven Bandwidth Expansion", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, ISBN: 978-1-4244-1620-2, pp. 135-138, October 2007 (IEEE Xplore, TR2007-089)

Kalgaonkar, K.; Hu, R.; Raj, B., "Ultrasonic Doppler Sensor for Voice Activity Detection", IEEE Signal Processing Letters, ISSN: 1558-2361, Vol. 14, Issue 10, pp. 754-757, October 2007 (IEEE Xplore)

Kalgaonkar, K.; Raj, B., "Acoustic Doppler Sonar for Gait Recognition", IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS), ISBN: 978-1-4244-1696-7, pp. 27-32, September 2007 (IEEE Xplore)

Reddy, A.M.; Raj, B., "Soft Mask Methods for Single-Channel Speaker Separation", IEEE Transactions on Audio, Speech and Language Processing, ISSN: 1558-7916, Vol. 15, Issue 6, pp. 1766-1776, August 2007 (IEEE Xplore)

Shashanka, M.V.S.; Raj, B.; Smaragdis, P., "Sparse Overcomplete Decomposition for Single Channel Speaker Separation", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), ISSN: 1520-6149, Vol. 2, pp. II-641 - II-644, April 2007 (IEEE Xplore, TR2007-031)

Raj, B.; Singh, R.; Shashanka, M.; Smaragdis, P., "Bandwidth Expansion with a Polya Urn Model", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol. 4, pp. IV-597 - IV-600, April 2007 (IEEE Xplore, TR2007-058)

Seltzer, M.L.; Raj, B.; Stern, R.M., "Likelihood-Maximizing Beamforming for Robust Hands-Free Speech Recognition", IEEE Transactions on Speech and Audio Processing, ISSN: 1063-6676, Vol. 12, Issue 5, pp. 489-498, September 2004, awarded Best Young Author, March 2007 (IEEE Xplore, TR2004-088)

Weinberg, G.; Raj, B.; Kalgaonkar, K., "Two New Techniques for Natural Spoken User Interfaces", ACM Symposium on User Interface Software and Technology (UIST), October 2006 (UIST 2006, TR2006-098)

Raj, B.; Shashanka, M.V.S.; Smaragdis, P., "Latent Dirichlet Decomposition for Single Channel Speaker Separation", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2006 (ICASSP 2006, TR2006-064)

Raj, B.; Singh, R., "Reconstructing Spectral Vectors with Uncertain Spectrographic Masks for Robust Speech Recognition", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 27-32, November 2005 (IEEE Xplore, TR2005-160)

Hu, R.; Raj, B., "A Robust Voice Activity Detector Using an Acoustic Doppler Radar", IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp. 171-176, November 2005 (IEEE Xplore, TR2005-159)

Raj, B.; Smaragdis, P., "Latent Variable Decomposition of Spectrograms for Single Channel Speaker Separation", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 17-20, October 2005 (IEEE Xplore, TR2005-137)

Forlines, C.; Schmidt-Nielsen, B.; Raj, B.; Wittenburg, K.; Wolf, P., "A Comparison between Spoken Queries and Menu-based Interfaces for In-Car Digital Music Selection", IFIP TC13 International Conference on Human-Computer Interaction (INTERACT), September 2005 (INTERACT 2005, TR2005-020)

Bansal, D.; Raj, B.; Smaragdis, P., "Bandwidth Expansion of Narrowband Speech Using Non-Negative Matrix Factorization", Eurospeech, September 2005 (EUROSPEECH 2005, TR2005-135)

Raj, B.; Singh, R.; Smaragdis, P., "Recognizing Speech from Simultaneous Speakers", Eurospeech, September 2005 (EUROSPEECH 2005, TR2005-136)

Recent Technical Reports:

TR2007-083 Probabilistic Latent Variable Models as Non-Negative Factorizations
TR2007-062 Supervised and Semi-Supervised Separation of Sounds from Single-Channel Mixtures
TR2007-009 Shift-Invariant Probabilistic Latent Component Analysis