Abstract:
A monaural speech enhancement algorithm for open-vocabulary keyword spotting is proposed. The algorithm stores the phoneme information of the keywords in a text encoding matrix in advance and adds an attention-based phoneme bias module to a conventional speech enhancement model. This module uses the intermediate features of the speech enhancement model to retrieve the phoneme information for the current frame from the text encoding matrix and integrates it into the model's subsequent computation, so that the model achieves better enhancement performance on the specified keywords. Experimental results in different noise environments show that the proposed method suppresses noise in the keyword segments more effectively and better recovers speech details. Moreover, the proposed method achieves a 14.3% relative improvement in open-vocabulary keyword spotting over conventional speech enhancement methods and a 7.6% relative improvement over other text-dependent speech enhancement methods.
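
The phoneme bias module described above can be illustrated with a minimal NumPy sketch of frame-wise attention over a text encoding matrix. This is not the authors' implementation: the function name `phoneme_bias`, the projection matrices `w_q`, `w_k`, `w_v`, the scaled dot-product scoring, the additive fusion, and all dimensions are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def phoneme_bias(features, text_encoding, w_q, w_k, w_v):
    """Hypothetical attention-based phoneme bias.

    features:      (T, D) intermediate features of the enhancement model
    text_encoding: (N, E) one row per keyword phoneme, stored in advance
    w_q: (D, A), w_k: (E, A), w_v: (E, D) projection matrices (assumed)
    """
    queries = features @ w_q                # (T, A), one query per frame
    keys = text_encoding @ w_k              # (N, A), one key per phoneme
    values = text_encoding @ w_v            # (N, D), phoneme content
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])  # (T, N)
    weights = softmax(scores, axis=-1)      # each frame attends over phonemes
    bias = weights @ values                 # (T, D) phoneme info per frame
    # Assumption: fuse by addition; the source does not specify the fusion.
    return features + bias, weights


# Toy dimensions, chosen arbitrarily for illustration.
rng = np.random.default_rng(0)
T, D, N, E, A = 5, 8, 4, 6, 8
features = rng.standard_normal((T, D))
text_encoding = rng.standard_normal((N, E))
w_q = rng.standard_normal((D, A))
w_k = rng.standard_normal((E, A))
w_v = rng.standard_normal((E, D))

fused, weights = phoneme_bias(features, text_encoding, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In this sketch, each row of `weights` is a distribution over the stored keyword phonemes for one frame, and `fused` has the same shape as `features`, so it could be passed to the remaining layers of an enhancement model unchanged.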