Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

LI Xueli, DING Hui, XU Boling. Entropy-based initial/final segmentation for Chinese whispered speech[J]. ACTA ACUSTICA, 2005, 30(1): 69-75. DOI: 10.15949/j.cnki.0371-0025.2005.01.011

Entropy-based initial/final segmentation for Chinese whispered speech

More Information
  • Received Date: April 03, 2003
  • Revised Date: July 16, 2003
  • Available Online: July 22, 2022
  • Abstract: Initial/final (IF) segmentation of whispered speech is a pre-processing step for whispered speech recognition and for reconstructing normal speech from whispers. However, because whispered initials and finals are all unvoiced, they are difficult to segment with the methods used for normal speech. Based on an analysis of the characteristics of Chinese whispered speech, a new segmentation method is proposed. The speech endpoints are detected with an entropy function, and the initial/final boundary is determined from the initial duration, the symmetric relative entropy, and the normalized spectral center of gravity. In a test with 380 Chinese whispered syllables at 2-10 dB SNR, the correct segmentation rates are 87.9% for the female data and 90.3% for the male data. The method is more accurate than the frequency-domain method, the clustering method, and the spectral flatness method. The experiments show that the algorithm can serve as pre-processing for whispered speech recognition and conversion, and that it gives the reconstructed speech a more natural quality.
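The abstract names three frame-level quantities (an entropy function, a symmetric relative entropy, and a normalized spectral center of gravity) without giving their formulas. The sketch below uses common textbook definitions as an assumption: Shannon entropy of the normalized power spectrum, the symmetrized Kullback-Leibler divergence between two frames, and the power-weighted mean bin index scaled to [0, 1]. The function names and the `EPS` floor are hypothetical, and the paper's actual definitions and decision rule (including the use of the initial duration) may differ.

```python
import math

EPS = 1e-12  # assumed floor that keeps log() defined for empty bins


def to_distribution(power):
    """Normalize a non-negative power spectrum so its bins sum to 1."""
    floored = [max(x, 0.0) + EPS for x in power]
    total = sum(floored)
    return [x / total for x in floored]


def spectral_entropy(power):
    """Shannon entropy (in nats) of one frame's normalized spectrum."""
    p = to_distribution(power)
    return -sum(x * math.log(x) for x in p)


def symmetric_relative_entropy(power_a, power_b):
    """Symmetrized KL divergence, D(a||b) + D(b||a), between two frames."""
    p = to_distribution(power_a)
    q = to_distribution(power_b)
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def normalized_spectral_centroid(power):
    """Power-weighted mean bin index, scaled to the range [0, 1]."""
    p = to_distribution(power)
    n = len(p)
    if n < 2:
        return 0.0
    return sum(k * x for k, x in enumerate(p)) / (n - 1)


if __name__ == "__main__":
    noise_like = [1.0] * 8                            # flat spectrum
    low_heavy = [8.0, 4.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.1]
    print(spectral_entropy(noise_like), spectral_entropy(low_heavy))
    print(symmetric_relative_entropy(noise_like, low_heavy))
    print(normalized_spectral_centroid(low_heavy))
```

Under these assumed definitions, a flat noise-like frame has the highest entropy, which is what makes an entropy threshold usable for endpoint detection, while a jump in the symmetric relative entropy between adjacent frames or a shift in the centroid marks a change in spectral shape such as an initial/final boundary.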
