Reverberation suppression combining a convolutional neural network with a reverberation-time attention mechanism

孙兴伟; 李军锋; 颜永红

doi:10.15949/j.cnki.0371-0025.2021.06.043


Speech dereverberation method with convolutional neural network and reverberation time attention

Abstract

A reverberation suppression algorithm is proposed that combines a convolutional neural network (CNN) encoder-decoder model with a reverberation-time attention mechanism. The encoder-decoder model performs the reverberation suppression, while the reverberation-time attention mechanism counters the effect that changes in the reverberant environment have on suppression performance. In the encoder, convolution kernels of different sizes are applied to the magnitude spectrum of the reverberant speech to obtain encoded features that carry multi-scale context information. An attention module then selectively weights these encoded features according to the reverberation time to produce a weighted feature. Finally, the decoder uses the weighted feature to reconstruct the magnitude spectrum of the dereverberated speech. In simulated and real reverberant environments, the proposed algorithm improves the speech-to-reverberation modulation energy ratio by 0.36 dB and 0.66 dB, respectively, over the baseline system. The experimental results show that the algorithm adapts to changing reverberant environments and is more robust than the baseline system in real reverberant environments.
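The abstract describes three steps: multi-scale convolutional encoding, reverberation-time-dependent weighting, and decoding. The first two can be illustrated with a minimal pure-Python sketch. It is not the authors' network. Averaging kernels stand in for learned filters, the attention scores are assumed to be a simple linear function of a reverberation-time estimate `t60`, and the names `multi_scale_encode`, `attention_weights`, `slopes`, and `biases` are hypothetical. The decoder is omitted.

```python
import math


def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; output has the same length as x."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def multi_scale_encode(spectrum, kernel_sizes=(3, 5, 7)):
    """Return one feature sequence per kernel size.

    Assumption: averaging kernels replace the learned convolution filters.
    """
    return [conv1d_same(spectrum, [1.0 / k] * k) for k in kernel_sizes]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(t60, slopes, biases):
    """Map a reverberation-time estimate to one weight per encoder branch.

    Assumption: scores are linear in t60; a trained attention module
    would learn this mapping instead.
    """
    return softmax([a * t60 + b for a, b in zip(slopes, biases)])


def weighted_feature(features, weights):
    """Weighted sum of the branch features, element by element."""
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


# Toy magnitude values along one axis (illustrative numbers only).
spectrum = [0.2, 0.9, 0.4, 0.1, 0.7, 0.3, 0.8, 0.5]
features = multi_scale_encode(spectrum)
weights = attention_weights(t60=0.8, slopes=(-1.0, 0.0, 1.0), biases=(0.0, 0.0, 0.0))
fused = weighted_feature(features, weights)
```

With the illustrative slopes chosen here, a longer reverberation time shifts weight toward the branch with the widest kernel, which mirrors the idea of using more context when reverberation tails are longer. That particular relationship is an assumption of this sketch, not a result stated in the abstract.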
