Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

CAI Shang, JIN Xin, GAO Shengxiang, PAN Jielin, YAN Yonghong. Sub-band power normalized perceptual linear predictive coefficients for robust automatic speech recognition[J]. ACTA ACUSTICA, 2012, 37(6): 667-672. DOI: 10.15949/j.cnki.0371-0025.2012.06.014

Sub-band power normalized perceptual linear predictive coefficients for robust automatic speech recognition

  • To improve the noise robustness of perceptual linear predictive (PLP) coefficients, a feature called sub-band power normalized perceptual linear predictive (SPNPLP) coefficients, based on power bias subtraction, is presented. PLP captures the most useful information in speech and fits well with the assumptions used in hidden Markov models. Automatic speech recognition (ASR) systems using PLP achieve satisfactory performance in benign environments. Nevertheless, ASR performance drops dramatically in noisy environments. In this work, power bias subtraction, which suppresses background excitation, is introduced to normalize the sub-band power of PLP, and SPNPLP is proposed to increase the robustness of ASR against additive background noise. Recognition performance is evaluated on an isolated-word recognition task with 501 items and on a large vocabulary continuous speech recognition (LVCSR) task. The average improvements over standard PLP are 11.26 and 9.2, respectively, on these two tasks. The experimental results show that the proposed SPNPLP is consistently more robust than PLP.

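The abstract names power bias subtraction on sub-band power but does not give the procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the sub-band power is available as a frames-by-bands array, and it assumes the bias in each band is a fixed fraction of that band's mean power; the function name `power_bias_subtraction` and the parameters `bias_fraction` and `floor_fraction` are hypothetical, and the paper may estimate the bias quite differently.

```python
import numpy as np


def power_bias_subtraction(subband_power, bias_fraction=0.1, floor_fraction=1e-3):
    """Subtract a per-band power bias and floor the result.

    subband_power: array of shape (frames, bands) with non-negative power values.
    bias_fraction: ASSUMPTION, bias as a fixed fraction of each band's mean power.
    floor_fraction: ASSUMPTION, floor as a fraction of each band's mean power,
                    which keeps the output positive for a later compression stage.
    """
    power = np.asarray(subband_power, dtype=float)

    # Mean power of each band over all frames, shape (1, bands).
    mean_power = power.mean(axis=0, keepdims=True)

    # Per-band bias (a crude stand-in for the background excitation level)
    # and per-band floor.
    bias = bias_fraction * mean_power
    floor = floor_fraction * mean_power

    # Remove the bias, then clip from below at the floor.
    return np.maximum(power - bias, floor)


if __name__ == "__main__":
    # Two frames, two bands of toy sub-band power.
    toy_power = np.array([[1.0, 4.0],
                          [3.0, 0.0]])
    print(power_bias_subtraction(toy_power))
```

In a PLP-style front end, a step like this would sit between the critical-band power integration and the later compression and linear-prediction stages; that placement is likewise an assumption here.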