Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Time-frequency speech presence probability estimation based on a sequential hidden Markov model for speech enhancement

  • Abstract: Speech presence probability (SPP) estimation is one of the core techniques of speech enhancement. Traditional SPP estimators are heuristic: they are not unified within a single theoretical framework and therefore cannot guarantee an optimal estimate. We propose an SPP estimation method based on a sequential hidden Markov model (SHMM). In each sub-band, an SHMM describes the time series of the log-power spectral envelope, which is treated as a dynamic first-order Markov chain that transits between a speech state and a noise state. The emission probability of each state is modeled by a single Gaussian function, and the SPP is the posterior probability of the speech state given the observed log-power sequence. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order recursive process in which the model parameters are updated frame by frame under the maximum likelihood criterion. Experiments show that the temporal correlation captured by the SHMM plays a key role in SPP estimation and that the SHMM outperforms conventional heuristic estimators. In speech enhancement experiments, the constrained SHMM outperforms the classical improved minima controlled recursive averaging (IMCRA) algorithm in terms of segmental SNR (SegSNR) and log-spectral distortion (LSD).
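The per-sub-band recursion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the update equations, so the class name `TwoStateSubbandHMM`, the fixed transition probability `stay`, the forgetting factor `rate`, the variance floor `min_var`, and the posterior-weighted exponential update are all assumptions chosen for illustration. The sketch only shows the general idea of a two-state Gaussian HMM whose filtered speech-state posterior serves as the SPP and whose Gaussian parameters are adapted frame by frame.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class TwoStateSubbandHMM:
    """Hypothetical two-state HMM (0 = noise, 1 = speech) for one sub-band."""

    def __init__(self, noise_mean, speech_mean, var=4.0, stay=0.9,
                 rate=0.05, min_var=0.25):
        self.means = [noise_mean, speech_mean]
        self.vars = [var, var]
        # trans[i][j] = P(state_t = j | state_{t-1} = i); kept fixed here (assumption).
        self.trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
        self.post = [0.5, 0.5]  # filtered state posterior from the previous frame
        self.rate = rate        # assumed forgetting factor of the recursive update
        self.min_var = min_var  # assumed floor that keeps the variances positive

    def step(self, log_power):
        """Consume one log-power observation and return the speech posterior."""
        # Predict: push the previous posterior through the first-order Markov chain.
        pred = [sum(self.post[i] * self.trans[i][j] for i in range(2))
                for j in range(2)]
        # Correct: weight by the Gaussian emission likelihoods and normalize.
        joint = [pred[j] * gaussian_pdf(log_power, self.means[j], self.vars[j])
                 for j in range(2)]
        total = sum(joint)
        # If both likelihoods underflow to zero, fall back to the prediction.
        self.post = [p / total for p in joint] if total > 0.0 else pred
        # First-order recursive update of each Gaussian, weighted by its posterior.
        for j in range(2):
            gain = self.rate * self.post[j]
            delta = log_power - self.means[j]
            self.means[j] += gain * delta
            self.vars[j] = max((1.0 - gain) * self.vars[j] + gain * delta * delta,
                               self.min_var)
        return self.post[1]


def estimate_spp(log_power_frames, noise_mean=0.0, speech_mean=20.0):
    """Run the sketch over one sub-band's log-power sequence."""
    model = TwoStateSubbandHMM(noise_mean, speech_mean)
    return [model.step(x) for x in log_power_frames]
```

In a full enhancement system one such model would run independently in every sub-band of the time-frequency representation, and the returned probabilities would drive the noise estimate and the spectral gain. The initial means here are placeholders; a practical system would need a principled initialization.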

     
