Mask estimation method in the spherical harmonic domain used by adaptive beamforming for speech enhancement

KE Yuxuan; LI Jian; PENG Renhua; ZHENG Chengshi; LI Xiaodong

doi:10.15949/j.cnki.0371-0025.2021.01.007

KE Yuxuan, LI Jian, PENG Renhua, ZHENG Chengshi, LI Xiaodong. Mask estimation method in the spherical harmonic domain used by adaptive beamforming for speech enhancement[J]. ACTA ACUSTICA, 2021, 46(1): 67-80. DOI: 10.15949/j.cnki.0371-0025.2021.01.007

Abstract

A mask estimation method for adaptive beamforming with spherical microphone arrays is proposed. The method first extracts a low-dimensional spatial vector containing spatial information from the spherical harmonic coefficients of the received signals, and then employs a Complex Gaussian Mixture Model (CGMM) or a deep learning network to estimate the mask. Finally, the estimated mask is used to design the Minimum Variance Distortionless Response (MVDR) beamformer so that directional interference can be suppressed. Simulation results show that the computational complexity of the proposed method is an order of magnitude lower than that of the conventional method, which operates in the microphone domain, and that the corresponding MVDR beamformer achieves much better performance in terms of Perceptual Evaluation of Speech Quality (PESQ), segmental Signal-to-Noise Ratio (segSNR), and Short-Time Objective Intelligibility (STOI) in most acoustic scenarios, especially when the Signal-to-Noise Ratio (SNR) is relatively low. The maximal improvements in these three objective metrics are about 1.31, 4.54 dB, and 35%, respectively. In addition, experiments conducted in a real acoustic environment indicate that the proposed method achieves more noise reduction than the conventional method without degrading speech intelligibility.
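
The abstract does not give the beamformer equations, so the following is only a minimal sketch of how a time-frequency mask is commonly turned into MVDR weights, not the authors' implementation. It assumes the widely used reference-channel form w = (Phi_n^-1 Phi_s) u / tr(Phi_n^-1 Phi_s), where Phi_s and Phi_n are mask-weighted covariance matrices. The function names (`masked_covariance`, `mvdr_weights`, `apply_beamformer`), the array layout (frequency, frame, channel), and the diagonal loading value are all hypothetical choices made for illustration. The "channels" may be microphone signals or spherical harmonic coefficients; the code does not distinguish between them.

```python
import numpy as np


def masked_covariance(Y, mask, eps=1e-10):
    """Mask-weighted spatial covariance per frequency bin.

    Y    : complex array, shape (F, T, C) = (frequency, frame, channel)
    mask : real array in [0, 1], shape (F, T)
    Returns an array of shape (F, C, C).
    """
    # Sum of mask-weighted outer products y y^H over frames.
    num = np.einsum("ft,ftc,ftd->fcd", mask, Y, Y.conj())
    # Normalise by the total mask weight in each frequency bin.
    return num / (mask.sum(axis=1)[:, None, None] + eps)


def mvdr_weights(phi_s, phi_n, ref=0, diag_load=1e-6):
    """Reference-channel MVDR weights (assumed formulation, see lead-in).

    phi_s, phi_n : speech / noise covariances, shape (F, C, C)
    ref          : index of the reference channel (selection vector u)
    Returns weights of shape (F, C).
    """
    C = phi_n.shape[-1]
    # Diagonal loading scaled by the average noise power, for stability.
    load = diag_load * np.trace(phi_n, axis1=1, axis2=2).real / C
    phi_n = phi_n + load[:, None, None] * np.eye(C)
    # Solve Phi_n X = Phi_s for each frequency bin (X = Phi_n^-1 Phi_s).
    numer = np.linalg.solve(phi_n, phi_s)
    trace = np.trace(numer, axis1=1, axis2=2)
    # Pick the column for the reference channel and normalise by the trace.
    return numer[:, :, ref] / trace[:, None]


def apply_beamformer(w, Y):
    """Apply w^H y to every time-frequency bin; returns shape (F, T)."""
    return np.einsum("fc,ftc->ft", w.conj(), Y)
```

In use, a speech mask `m` estimated by any method (for example a CGMM or a neural network) would give `phi_s = masked_covariance(Y, m)` and `phi_n = masked_covariance(Y, 1.0 - m)`, followed by `apply_beamformer(mvdr_weights(phi_s, phi_n), Y)`. Taking the noise mask as `1.0 - m` is itself an assumption; some systems estimate the two masks separately.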
