A new acoustic modeling approach using inter-syllable context-dependent units for Putonghua continuous speech recognition
Abstract: Modeling coarticulation in continuous Putonghua speech is important for improving the performance of automatic speech recognition systems. Drawing on the particular characteristics of spoken Putonghua, this paper proposes a new recognition unit that refines the acoustic model by taking coarticulation between neighboring syllables into account, yielding inter-syllable context-dependent units. The contextual influences between syllables are then classified using Putonghua phonetic knowledge, so that state parameters can be shared across different units. This sharing reduces the complexity of the acoustic model and keeps it trainable. The main difference from traditional parameter-sharing techniques is that the clustering here relies entirely on phonetic knowledge rather than on data-driven acoustic clustering. Recognition experiments show that the phonetically classified inter-syllable context-dependent units clearly improve recognition performance: the system's top-choice error rate is reduced by 17%.
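The idea of classifying inter-syllable context by phonetic knowledge can be illustrated with a minimal sketch. The abstract does not list the classes the authors used, so everything below is an assumption made for illustration: the right context is the place of articulation of the next syllable's initial, the left context is the ending type of the previous syllable's final, and the names `INITIAL_CLASSES`, `initial_of`, `left_class`, `right_class`, and `unit_names` are hypothetical. Syllables are toneless pinyin strings.

```python
# Hypothetical phonetic classes (an assumption, not the paper's actual inventory):
# Putonghua initials grouped by place of articulation.
INITIAL_CLASSES = {
    "bilabial": ["b", "p", "m"],
    "labiodental": ["f"],
    "alveolar": ["d", "t", "n", "l"],
    "dental_sibilant": ["z", "c", "s"],
    "retroflex": ["zh", "ch", "sh", "r"],
    "palatal": ["j", "q", "x"],
    "velar": ["g", "k", "h"],
}
INITIAL_TO_CLASS = {
    ini: cls for cls, inis in INITIAL_CLASSES.items() for ini in inis
}


def initial_of(syllable):
    """Return the pinyin initial, or "" for a zero-initial syllable (e.g. "an", "yi")."""
    for two in ("zh", "ch", "sh"):
        if syllable.startswith(two):
            return two
    return syllable[0] if syllable[0] in INITIAL_TO_CLASS else ""


def right_class(next_syllable):
    """Class of the following context: utterance boundary, zero initial, or place of articulation."""
    if next_syllable is None:
        return "sil"
    return INITIAL_TO_CLASS.get(initial_of(next_syllable), "zero")


def left_class(prev_syllable):
    """Class of the preceding context, based on how the previous final ends (assumed scheme)."""
    if prev_syllable is None:
        return "sil"
    if prev_syllable.endswith("ng"):
        return "ng"
    if prev_syllable.endswith("n"):
        return "n"
    return "vowel"


def unit_names(syllables):
    """Map a syllable sequence to context-dependent unit names 'left-syllable+right'."""
    names = []
    for i, syl in enumerate(syllables):
        prev_syl = syllables[i - 1] if i > 0 else None
        next_syl = syllables[i + 1] if i + 1 < len(syllables) else None
        names.append(f"{left_class(prev_syl)}-{syl}+{right_class(next_syl)}")
    return names


print(unit_names(["zhong", "guo", "ren"]))
# ['sil-zhong+velar', 'ng-guo+retroflex', 'vowel-ren+sil']
```

Under this assumed scheme, contexts that fall into the same phonetic class map to the same unit name, which is what makes parameter sharing possible: "guo" followed by "ke" and "guo" followed by "gao" both become `sil-guo+velar`, so their states could be tied without any data-driven clustering.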