Inventi Impact: Audio, Speech & Music Processing
ROBUST TIME DELAY ESTIMATION FOR SPEECH SIGNALS USING INFORMATION THEORY: A COMPARISON STUDY
Fei Wen, Qun Wan
Time delay estimation (TDE) is a fundamental subsystem for a speaker localization and tracking system. Most of the
traditional TDE methods are based on second-order statistics (SOS) under a Gaussian assumption for the source. This
article addresses the TDE problem using two information-theoretic measures, joint entropy and mutual information
(MI), which can be considered to indirectly include higher-order statistics (HOS). The TDE solutions using the two
measures are presented for both Gaussian and Laplacian models. We show that, for stationary signals, the two
measures are equivalent for TDE. However, for non-stationary signals (e.g., noisy speech signals), maximizing MI
gives a more consistent estimate than minimizing joint entropy. Moreover, an existing idea of using modified MI to
embed information about reverberation is generalized to the multiple-microphone case. In the experimental
results for speech signals, this scheme with the Gaussian model shows the most robust performance in various noisy
and reverberant environments.
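
As a rough illustration of the MI criterion under the Gaussian model, the sketch below uses the standard result that two jointly Gaussian scalar signals with correlation coefficient rho have mutual information -0.5 ln(1 - rho^2), and picks the lag that maximizes it. This is a simplified two-microphone version without the reverberation embedding described above, not the authors' implementation. The function names (`gaussian_mi`, `estimate_delay_mi`), the synthetic white Laplacian source, the noise level, and the 7-sample delay are all assumptions made for the example.

```python
import numpy as np

def gaussian_mi(x1, x2):
    """Mutual information (in nats) of two signals under a jointly Gaussian model."""
    rho = np.corrcoef(x1, x2)[0, 1]
    # Clip rho^2 slightly below 1 so the logarithm stays finite for identical inputs.
    rho2 = min(rho * rho, 1.0 - 1e-12)
    return -0.5 * np.log(1.0 - rho2)

def estimate_delay_mi(x1, x2, max_lag):
    """Return the lag (in samples) of x2 relative to x1 that maximizes the Gaussian MI."""
    n = len(x1)
    best_lag, best_mi = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # Align the overlapping parts of the two channels for this candidate lag.
        if lag >= 0:
            a, b = x1[:n - lag], x2[lag:]
        else:
            a, b = x1[-lag:], x2[:n + lag]
        mi = gaussian_mi(a, b)
        if mi > best_mi:
            best_lag, best_mi = lag, mi
    return best_lag

# Synthetic test data (hypothetical): a white Laplacian source observed at two
# microphones with additive Gaussian noise; the second channel lags by 7 samples.
rng = np.random.default_rng(0)
s = rng.laplace(size=4000)
true_delay = 7
x1 = s + 0.1 * rng.standard_normal(s.size)
x2 = np.roll(s, true_delay) + 0.1 * rng.standard_normal(s.size)

estimated = estimate_delay_mi(x1, x2, max_lag=20)
print("estimated delay (samples):", estimated)
```

In this two-channel Gaussian case, maximizing MI amounts to maximizing the magnitude of the correlation coefficient over lags. A frame-based variant for real speech would apply the same search per analysis frame.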
CC Compliant Citation: Wen and Wan: Robust time delay estimation for speech signals using information theory: A comparison
study. EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:3. doi:10.1186/1687-4722-2011-3.