Effect of ideal ratio mask using different early and late reverberation partition methods on speech recognition performance
-
Graphical Abstract
-
Abstract
In real-world conditions, noise and reverberation degrade the performance of speech recognition systems. Reverberation in an enclosed space comprises the direct sound, early reflections, and late reverberation, each of which affects speech recognition differently. We focus on different methods of partitioning early and late reverberation. The early reflections are taken as the target signal and used to compute different ideal ratio masks, whose effects on speech recognition performance are then evaluated. On this basis, we estimate the masks with a Bidirectional Long Short-Term Memory (BLSTM) network and test their impact on speech recognition performance. The experimental results show that the ideal ratio masks reduce the word error rate by about 2.8% when Abel's method is used to separate early reflections from late reverberation. The BLSTM method underestimates the ideal ratio masks and fails to improve the performance of the speech recognition system.
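As a rough illustration of the mask computation described above, the following sketch splits a room impulse response (RIR) at a fixed time after the direct-path peak, treats the early part as the target, and forms a ratio mask against the late reverberation. Several points are assumptions rather than details taken from this work: the fixed 50 ms boundary stands in for the partition methods under study (it is not Abel's method), the square-root power-ratio form of the mask is one common definition, additive noise is omitted from the interference term, and the function names `split_rir`, `stft_power`, and `ideal_ratio_mask` are hypothetical.

```python
import numpy as np


def split_rir(rir, fs, boundary_ms=50.0):
    """Split an RIR into an early part (direct sound + early reflections)
    and a late part. The boundary is counted from the direct-path peak;
    the 50 ms default is an assumed placeholder value."""
    peak = int(np.argmax(np.abs(rir)))
    cut = min(len(rir), peak + int(round(boundary_ms * 1e-3 * fs)))
    early = np.zeros_like(rir)
    late = np.zeros_like(rir)
    early[:cut] = rir[:cut]
    late[cut:] = rir[cut:]
    return early, late


def stft_power(x, n_fft=512, hop=128):
    """Power spectrogram (frames x bins) using a Hann window.
    Requires len(x) >= n_fft."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def ideal_ratio_mask(clean, rir, fs, boundary_ms=50.0, eps=1e-12):
    """Ratio mask with early reflections as target and late
    reverberation as interference (noise-free case, assumed)."""
    early_rir, late_rir = split_rir(rir, fs, boundary_ms)
    target = np.convolve(clean, early_rir)        # early-reverberant speech
    interference = np.convolve(clean, late_rir)   # late-reverberant speech
    p_target = stft_power(target)
    p_interf = stft_power(interference)
    # Square-root power ratio; values lie in [0, 1].
    return np.sqrt(p_target / (p_target + p_interf + eps))
```

Changing `boundary_ms`, or replacing `split_rir` with a different partition rule, yields a different mask for the same utterance and RIR, which is the comparison the abstract describes.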
-
-