Publications | UCSD DSP LAB

Speech Processing

2016

N Radmanesh, BD Rao, "Frequency-based customization of multizone sound system design," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 375-379
N Radmanesh, IS Burnett, BD Rao, "A lasso-LS optimization with a frequency variable dictionary in a multizone sound system," IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24 (3), 583-593

2013

A. Masnadi-Shirazi and Bhaskar D. Rao, "An ICA-SCT-PHD Filter Approach for Tracking and Separation of Unknown Time-Varying Number of Sources," IEEE Transactions on Audio, Speech and Language Processing, Volume: 21, Issue: 4, Page(s): 828 - 841, April 2013

2012

Alireza Masnadi-Shirazi and Bhaskar D. Rao, "An ICA-Based RFS Approach for DOA Tracking of Unknown Time-Varying Number of Sources," European Signal Processing Conference, August 2012

2011

Wenyi Zhang, Alireza Masnadi-Shirazi, Bhaskar D. Rao, "Insights into the Frequency Domain ICA/IVA Approach," IEEE Asilomar Conference on Signals, Systems and Computers, CA, November 2011
Chandra Murthy, Ethan Duni and B. D. Rao, "High-Rate Vector Quantization for Noisy Channels With Applications to Wideband Speech Spectrum Compression," IEEE Transactions on Signal Processing, Vol. 59, No. 11, pages 5390-5403, November 2011
A. Masnadi-Shirazi and B.D. Rao, "Separation and tracking of multiple speakers in a reverberant environment using a multiple model particle filter glimpsing method," IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, pages: 2516 - 2519, May 22, 2011

2010

W. Zhang and B.D. Rao, "A Two Microphone-Based Approach for Source Localization of Multiple Speech Sources," IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, No. 8, pages: 1913-1928, November 2010
S.T. Shivappa, B.D. Rao, and M.M. Trivedi, "Audio-Visual Fusion and Tracking with Multilevel Iterative Decoding: Framework and Experimental Evaluation," IEEE Journal of Selected Topics in Signal Processing, Special issue on Speech Processing for Natural, Vol. 4, No. 5, pages: 882-894, October 2010
S.T. Shivappa, M.M. Trivedi, and B.D. Rao, "Audiovisual Information Fusion in Human-Computer Interfaces and Intelligent Environments: A Survey," Proceedings of the IEEE, Vol. 12, No. 6, pages: 502-509, October 2010
A. Masnadi-Shirazi, W. Zhang, and B.D. Rao, "Glimpsing IVA: A Framework for Overcomplete/Complete/Undercomplete Convolutive Source Separation," IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, No. 7, pages: 1841-1855, September 2010
S. Shivappa, B. D. Rao, and M. M. Trivedi, "Audio Visual Fusion and Tracking With Multilevel Iterative Decoding: Framework and Experimental Evaluation," IEEE Journal of Selected Topics in Signal Processing, July 15, 2010
A. Masnadi-Shirazi, W. Zhang, B. D. Rao, "Glimpsing Independent Vector Analysis: Separating More Sources Than Sensors Using Active and Inactive States," IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, TX, March 2010

2009

R.M. Hegde, J. Kurniawan, and B.D. Rao, "On the Design and Prototype Implementation of a Multimodal Situation Aware System," IEEE Transactions on Multimedia, pages: 645 - 657, Vol. 11, Issue 4, June 2009
S. T. Shivappa, M. M. Trivedi, and B. D. Rao, "Hierarchical Audio-Visual Cue Integration Framework for Activity Analysis in Intelligent Meeting Rooms," IEEE CVPR Joint Workshop for Visual and Contextual Learning and Visual Scene Understanding, pages: 107-114, June 2009
A. Masnadi-Shirazi and B.D. Rao, "Independent Vector Analysis Incorporating Active and Inactive States," IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan, April 2009
S. T. Shivappa, B. D. Rao and M. M. Trivedi, "Role of Head Pose Estimation in Speech Acquisition from Distant Microphones," IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan, April 2009
W. Zhang and B.D. Rao, "Combining Independent Component Analysis with Geometric Information and its Application to Speech Processing," IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan, April 2009
W. Zhang and B.D. Rao, "Two Microphone Based Direction of Arrival Estimation for Multiple Speech Sources using Spectral Properties of Speech," IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan, April 2009

2008

S. T. Shivappa, M. M. Trivedi and B. D. Rao, "Person Tracking With Audio-visual Cues Using the Iterative Decoding Framework," IEEE International Conference on Advanced Video and Signal Surveillance, Santa Fe, New Mexico, September 2008
E. R. Duni and B. D. Rao, "Online Training Methods for Gaussian Mixture Vector Quantizers," IEEE International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, Pages: 4785 - 4788, April 2008
S. T. Shivappa, B. D. Rao and M. M. Trivedi, "Multimodal Information Fusion Using the Iterative Decoding Algorithm and its Application to Audio-Visual Speech Recognition," IEEE International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, Pages: 2241 - 2244, April 2008
S. T. Shivappa, B. D. Rao and M. M. Trivedi, "An Iterative Decoding Algorithm for Fusion of Multimodal Information," EURASIP Journal on Advances in Signal Processing, Number: 478396, February 2008

2007

E. R. Duni and B. D. Rao, "Performance of Speaker-Dependent Wideband Speech Coding," Interspeech, Antwerp, Belgium, August 2007
R. Hegde, Y. Jin, and B. D. Rao, "Spectral Estimation of Voiced Speech Using a Family of MVDR Estimates," IEEE International Conference on Acoustics, Speech, and Signal Processing, Hawaii, Vol. 4, Pages: 1069 - 1072, April 2007
E. R. Duni and B. D. Rao, "A High-Rate Optimal Transform Coder with Gaussian Mixture Companders," IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, Issue 3, Pages: 770-783, March 2007
E. R. Duni and B. D. Rao, "High-Rate Optimized Recursive Vector Quantization Structures Using Hidden Markov Models," IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, Issue 3, Pages: 756-769, March 2007
S. Dharanipragada, U. H. Yapanel, and B. D. Rao, "Robust Feature Extraction for Continuous Speech Recognition using the MVDR Spectrum Estimation Method," IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, Issue 1, Pages: 224 - 234, January 2007

2006

W. Zhang and B. D. Rao, "Robust Adaptive Beamformer with Feasibility Constraint on the Steering Vector," European Signal Processing Conference, September 2006
C. R. Murthy, E. R. Duni and B. D. Rao, "High-Rate Analysis of Vector Quantization for Noisy Channels," IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, Vol. 4, Pages: 193 - 196, May 2006
E. R. Duni and B. D. Rao, "High-Rate Design of Transform Coders with Gaussian Mixture Companders," IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, Vol. 1, Pages: 693 - 696, May 2006
R. M. Hegde, B.S. Manoj, B. D. Rao, and R. R. Rao, "Emotion Detection from Speech Signals and its Applications in Supporting Enhanced QoS in Emergency Response," Third International Conference on Information Systems for Crisis Response and Management, Newark, USA, May 2006
W. Zhang and B. D. Rao, "Robust Broadband Beamformer With Diagonally Loaded Constraint Matrix and Its Application to Speech Recognition," IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, Vol. 1, Pages: 785 - 788, May 2006
A. D. Subramaniam, B. D. Rao, and W. R. Gardner, "Low-Complexity Source Coding Using Gaussian Mixture Models, Lattice Vector Quantization and Recursive Coding with Application to Speech Spectrum Quantization," IEEE Transactions on Speech and Audio Processing, Vol. 14, Issue 2, Pages: 524 - 532, March 2006
E. R. Duni and B. D. Rao, "High-Rate Training of Gaussian Mixture Vector Quantizers," Data Compression Conference, Page 1, March 2006
A. D. Subramaniam, B. D. Rao, and W. R. Gardner, "Iterative Joint Source-Channel Decoding of Speech Spectrum Parameters over an Additive White Gaussian Noise Channel," IEEE Transactions on Speech and Audio Processing, Vol. 14, Issue 1, Pages: 152 - 162, January 2006

2004

A. D. Subramaniam, W. R. Gardner, B. D. Rao, "Joint Source-Channel Decoding of Speech Spectrum Parameters over an AWGN Channel Using Gaussian Mixture Models," IEEE International Conference on Communications, Paris, France, Vol. 5, Pages: 2847 - 2851, June 2004
E. Duni, A. D. Subramaniam, and B. D. Rao, "Improved Quantization Structures Using Generalized HMM Modeling With Application to Wideband Speech Coding," IEEE International Conference on Acoustics, Speech, and Signal Processing, Pages: 161 - 164, May 2004

2003

A. D. Subramaniam, W. Gardner, and B. D. Rao, "Joint Source-Channel Decoding of Speech Spectrum Parameters over Erasure Channels using Gaussian Mixture Models," IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, Pages: I-120 - I-123, April 2003
A. D. Subramaniam and B. D. Rao, "PDF Optimized Parametric Vector Quantization of Speech Line Spectral Frequencies," IEEE Transactions on Speech and Audio Processing, Issue 2, Pages: 130-142, March 2003

2002

A. D. Subramaniam, W. R. Gardner and B. D. Rao, "Speech Spectrum Quantization Using Gaussian Mixture Models and Multi Dimensional Companding," IEEE Speech Coding Workshop, Ibaraki, Japan, Pages: 5 - 7, October 2002
W. R. Gardner, A. D. Subramaniam and B. D. Rao, "Comprehensive Evaluation of Theoretical Approximations for Spectral Quantization Performance," European Signal Processing Conference, Toulouse, France, September 2002
A. D. Subramaniam, W. R. Gardner and B. D. Rao, "Low Complexity Recursive Coding of Spectrum Parameters," IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 1, Pages: 637 - 640, May 2002

2001

A. D. Subramaniam and B. D. Rao, "Speech LSF Quantization with Rate Independent Complexity, Bit Scalability and Learning," IEEE International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, Utah, Pages: 705-708, May 2001
S. Dharanipragada and B. D. Rao, "MVDR Based Feature Extraction for Robust Speech Recognition," IEEE International Conference on Acoustics, Speech and Signal Processing, Salt Lake City, Utah, Pages: 309-312, May 2001
A. D. Subramaniam and B. D. Rao, "Source Coding with Minimal and Rate-Independent Search and Memory Complexity," Data Compression Conference, Pages: 518-524, March 2001

2000

A. D. Subramaniam and B. D. Rao, "PDF Optimized Parametric Vector Quantization of Speech Line Spectral Frequencies," IEEE Asilomar Conference on Signals, Systems and Computers, Monterey, California, Vol. 2, Pages: 1475 - 1479, November 2000
A. D. Subramaniam and B. D. Rao, "PDF Optimized Parametric Vector Quantization of Speech Line Spectral Frequencies," IEEE Workshop on Speech Coding, Delavan, WI, Pages: 87-89, September 2000
M. N. Murthi and B. D. Rao, "All-Pole Modeling of Voiced Speech Based on the Minimum Variance Distortionless Response Spectrum," IEEE Transactions on Speech and Audio Processing, Pages: 221-239, May 2000

1999

M. N. Murthi and B. D. Rao, "MVDR Based All-Pole Modeling: Properties, Enhancements, and Comparison," IEEE Workshop on Speech Coding, Pages: 31 - 33, June 1999
M. N. Murthi and B. D. Rao, "MVDR Spectrum and Speech Modeling: A Tutorial," Seventh Edition of the DSPtidende, published by the Danish Society for Applied Digital Signal Processing, May 1999
M. N. Murthi and B. D. Rao, "MVDR Based All-Pole Models for Spectral Coding of Speech," IEEE International Conference on Acoustics, Speech and Signal Processing, Phoenix, AZ, Vol. 2, Pages: 669 - 672, March 1999

1997

M. N. Murthi and B. D. Rao, "All-Pole Modeling of Speech Based on the Minimum Variance Distortionless Response Spectrum," IEEE Asilomar Conference on Signals, Systems and Computers, Monterey, CA, Vol. 2, Pages: 1061-1065, November 1997
M. N. Murthi and B. D. Rao, "Minimum Variance Distortionless Response (MVDR) Modeling of Voiced Speech," IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, Germany, Vol. 3, Pages: 1687 - 1690, April 1997
M. N. Murthi and B. D. Rao, "All-Pole Model Parameter Estimation for Voiced Speech," IEEE Workshop on Speech Coding for Telecommunications Proceedings, Pages: 17-18, 1997
W. R. Gardner and B. D. Rao, "Noncausal All-Pole Modeling of Voiced Speech," IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 1, Pages: 1-10, January 1997

1995

W. R. Gardner and B. D. Rao, "Theoretical Analysis of the High-Rate Vector Quantization of LPC Parameters," IEEE Transactions on Speech and Audio Processing, Vol. 3, Issue: 5, Pages: 367-381, September 1995
W. R. Gardner and B. D. Rao, "Optimal Distortion Measures for the High Rate Vector Quantization of LPC Parameters," IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, Michigan, Vol. 1, Pages: 752 - 755, May 1995
W. Y. Huang and B. D. Rao, "Channel and Noise Compensation for Text Dependent Speaker Verification over Telephone," IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, Michigan, Vol. 1, Pages: 337 - 340, May 1995

1994

W. R. Gardner and B. D. Rao, "Analysis of High Rate LPC Vector Quantizers Designed by Minimizing Suboptimal Error Measures," IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, Vol. 2, Pages: 1232 - 1236, October 1994
W. R. Gardner and B. D. Rao, "Mixed-Phase AR Models for Voiced Speech and Perceptual Cost Functions," Proc. of the International Conference on Acoustics, Speech and Signal Processing, Adelaide, Australia, Vol. 1, Pages: 205 - 208, April 1994

1992

W. R. Gardner and B. D. Rao, "Non-Causal Linear Prediction of Voiced Speech," IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, Pages: 1100-1104, October 1992

1988

S. Dharanipragada, R. A. Gopinath and B. D. Rao, "Techniques for Capturing Temporal Variations in Speech Signals with Fixed-Rate Processing," IEEE International Conference on Speech and Language Processing, Sydney, Australia, November 1988