Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Transformer feature focusing modeling method for sound source localization in reverberant environments

    Abstract: Traditional sound source localization algorithms are susceptible to multipath reflections in reverberant environments, which degrades their localization accuracy and robustness. To address this, a Transformer feature focusing modeling method for sound source localization in reverberant environments is proposed. The method takes the normalized beamforming power map as input and uses the multi-head attention mechanism of the Transformer to model features at different spatial positions. This strengthens the global and local feature representations associated with the true source location and thereby suppresses the spurious sources that reflections produce under reverberant conditions. To verify the effectiveness of the method, simulation datasets covering different source positions and reverberation conditions are generated with the image source method, and the true source positions serve as supervision labels for training and testing. The results show that, compared with traditional dereverberation methods and earlier deep learning methods, the proposed method achieves higher localization accuracy and better robustness in reverberant environments, and it maintains good localization performance even under highly reverberant conditions. The method also effectively reduces sidelobe levels and weakens the influence of spurious sources. Experiments in an enclosed space further validate its effectiveness in practical scenarios.
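To make the network input concrete, the following is a minimal sketch of a normalized beamforming power map, assuming a narrowband delay-and-sum beamformer with phase-only steering and a free-field point source. The array geometry, scan grid, frequency, and the function names `simulate_pressures` and `normalized_power_map` are hypothetical choices for illustration; the abstract does not specify the beamformer or the array used.

```python
import cmath
import math


def simulate_pressures(mics, sources, k):
    """Narrowband complex pressure at each microphone from point sources.

    `sources` is a list of (position, amplitude) pairs. This is an assumed
    free-field monopole model, not the paper's simulation code.
    """
    pressures = []
    for m in mics:
        p = 0j
        for pos, amp in sources:
            r = math.dist(m, pos)
            p += amp * cmath.exp(-1j * k * r) / r
        pressures.append(p)
    return pressures


def normalized_power_map(mics, grid, pressures, k):
    """Delay-and-sum power at each grid point, scaled so the peak equals 1."""
    power = []
    for g in grid:
        # Phase-only steering: undo the propagation phase from g to each mic.
        out = sum(cmath.exp(1j * k * math.dist(m, g)) * p
                  for m, p in zip(mics, pressures))
        power.append(abs(out) ** 2 / len(mics) ** 2)
    peak = max(power)
    return [v / peak for v in power]


# Hypothetical setup: 4 x 4 planar array (0.15 m spacing) in the plane z = 0.
c, f = 343.0, 1000.0          # speed of sound (m/s), analysis frequency (Hz)
k = 2 * math.pi * f / c       # wavenumber
mics = [(0.15 * ix - 0.225, 0.15 * iy - 0.225, 0.0)
        for ix in range(4) for iy in range(4)]

# 11 x 11 scan grid on the plane z = 1 m.
grid = [(0.1 * ix - 0.5, 0.1 * iy - 0.5, 1.0)
        for ix in range(11) for iy in range(11)]

src_idx = 40                  # the source sits exactly on this grid point
sources = [(grid[src_idx], 1.0)]

bmap = normalized_power_map(mics, grid, simulate_pressures(mics, sources, k), k)
peak_idx = bmap.index(max(bmap))
```

In this free-field case the map peaks at the source grid point. A first-order wall reflection could be approximated by appending the mirror image of the source across that wall, with a reduced amplitude, to `sources`; the resulting map may then show additional lobes of the kind the abstract describes as spurious sources.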

     
