Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Speech reconstruction from Mel-frequency cepstral coefficients via L1/2 sparse constraint

  • Abstract: A method is proposed for reconstructing the time-domain speech signal from Mel-frequency cepstral coefficients (MFCCs) using an L1/2 sparse constraint. Estimating the speech amplitude spectrum from MFCCs is an underdetermined problem, and existing methods either use a minimum mean square error estimate of the amplitude spectrum or impose a sparse constraint on it through L1 regularization. Because L1/2 regularization promotes sparsity more strongly than L1 regularization, the proposed method applies an L1/2 regularization constraint when estimating the amplitude spectrum from MFCCs. The phase spectrum is then estimated from the resulting sparse amplitude spectrum, and the time-domain speech signal is reconstructed from the estimated spectrum. Experimental results show that speech reconstructed by the proposed method has higher quality than speech reconstructed by the minimum mean square error method. In reconstruction experiments under noisy conditions, the proposed method also yields better speech quality than the L1-regularized amplitude spectrum estimation method, indicating better noise robustness.
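The abstract does not give the solver, so the following is only a minimal sketch of how an L1/2-regularized estimate of an underdetermined system y ≈ A x might be computed. It assumes the iterative half-thresholding scheme commonly used for L1/2 penalties, which may differ from the authors' actual algorithm. The names `half_threshold`, `l_half_solve`, `A` (standing in for a Mel filterbank matrix), `y` (Mel-band energies recovered from the MFCCs), `lam`, and `n_iter` are all hypothetical, and the sketch does not enforce non-negativity of the amplitude spectrum or estimate the phase.

```python
import numpy as np


def half_threshold(t, lam):
    """Half-thresholding operator associated with the penalty lam * sum(|x|**0.5).

    Entries whose magnitude is below the threshold are set to zero;
    the remaining entries are shrunk toward zero.
    """
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    big = np.abs(t) > thresh
    # For entries above the threshold the arccos argument lies in [0, sqrt(2)/2).
    phi = np.arccos((lam / 8.0) * (np.abs(t[big]) / 3.0) ** (-1.5))
    out[big] = (2.0 / 3.0) * t[big] * (
        1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0)
    )
    return out


def l_half_solve(A, y, lam=1e-3, n_iter=500):
    """Approximately minimize ||y - A x||^2 + lam * sum(|x|**0.5).

    Each iteration takes a gradient step on the data term and then
    applies the half-thresholding operator (assumed scheme, see lead-in).
    """
    mu = 0.99 / np.linalg.norm(A, 2) ** 2  # step size below 1 / ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = half_threshold(x + mu * (A.T @ (y - A @ x)), lam * mu)
    return x
```

In a reconstruction pipeline of the kind the abstract describes, `y` would come from inverting the cepstral transform of each MFCC frame, and the solver would be run once per frame; a separate phase-estimation step would still be needed before synthesizing the time-domain signal.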

     
