Text-independent speaker recognition using normalization compensation transformation
Abstract: In noisy environments, and especially when the Gaussian mixture model (GMM), the model most commonly used for speaker recognition, is mismatched, the statistical properties of the frame likelihood probabilities it outputs need to be compensated. Based on the acoustic characteristics of speaker recognition, this paper proposes a nonlinear transformation method, the normalization compensation transformation. Theoretical analysis and experimental results show that, compared with the commonly used maximum-likelihood (ML) transformation, the proposed transformation improves the system recognition rate by up to 3.7% and reduces the misrecognition rate by up to 45.1%. The results indicate that the normalization compensation transformation largely overcomes a limitation of text-independent speaker recognition systems: when the speaker's individual characteristics keep changing, when speech and noise cannot be separated well or the noise-reduction algorithm damages the speech, and when the model does not match well, the likelihood probabilities (scores) output by the model require compensation. They also indicate that processing the likelihood probabilities output by the model is an effective way to reduce the influence of noise and interference and to raise the speaker recognition rate.
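The abstract does not give the formula of the normalization compensation transformation, so the following is only a minimal sketch of the general idea it describes: processing per-frame GMM likelihoods before they are accumulated into a speaker score. It contrasts plain ML scoring (summing frame log-likelihoods) with a per-frame normalization across candidate speakers. That normalization is an assumption chosen for illustration, not the paper's transformation, and all names here (`frame_log_likelihood`, `ml_scores`, `normalized_scores`, the toy models and frames) are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def frame_log_likelihood(x, gmm):
    """log p(x | GMM); gmm is a list of (weight, mean, var) components."""
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm])

def ml_scores(frames, models):
    """ML scoring: sum the frame log-likelihoods for each speaker model."""
    return [sum(frame_log_likelihood(x, g) for x in frames) for g in models]

def normalized_scores(frames, models):
    """Illustrative assumption (not the paper's formula): normalize each
    frame's likelihoods across speakers, then accumulate the results."""
    totals = [0.0] * len(models)
    for x in frames:
        ll = [frame_log_likelihood(x, g) for g in models]
        norm = logsumexp(ll)
        for s, value in enumerate(ll):
            # Each frame contributes a bounded value in [0, 1] per speaker,
            # so a single badly mismatched frame cannot dominate the total.
            totals[s] += math.exp(value - norm)
    return totals

# Hypothetical toy data: two single-component speaker models, three 2-D frames.
models = [
    [(1.0, [0.0, 0.0], [1.0, 1.0])],  # speaker 0
    [(1.0, [3.0, 3.0], [1.0, 1.0])],  # speaker 1
]
frames = [[0.1, -0.2], [0.3, 0.2], [2.9, 3.1]]

ml = ml_scores(frames, models)
norm = normalized_scores(frames, models)
best_ml = max(range(len(ml)), key=ml.__getitem__)
best_norm = max(range(len(norm)), key=norm.__getitem__)
```

In this sketch the third frame lies far from speaker 0's model. Under ML scoring it subtracts a large amount from that speaker's summed log-likelihood, while under the normalized scoring its contribution is bounded. That bounded per-frame contribution is the kind of nonlinear effect that frame-level compensation relies on.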