[1] |
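Li Y N, Zhang X W, Jia C, Chen L, Zeng L. Unsupervised real-time single-channel speech enhancement algorithm under a sparse low-rank noise model. Acta Acustica, 2015; 40(4):607-614 (in Chinese)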
|
[2] |
Wang Y X, Wang D L. Towards scaling up classification-based speech separation. IEEE Trans. Audio Speech Lang. Process., 2013; 21(7):1381-1390
|
[3] |
Xu Y, Du J, Dai L et al. A regression approach to speech enhancement based on deep neural networks. IEEE/ACM Trans. Audio Speech Lang. Process., 2015; 23(1):7-19
|
[4] |
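Zhang X Y, Zhang T Q, Ge W Y, Bai Y L. Single-channel speech enhancement algorithm combining deep neural networks and convex optimization. Acta Acustica, 2021; 46(3):471-480 (in Chinese)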
|
[5] |
Park S R, Lee J W. A fully convolutional neural network for speech enhancement. Interspeech 2017, Stockholm, Sweden, International Speech Communication Association, 2017:1993-1997
|
[6] |
Rethage D, Pons J, Serra X. A wavenet for speech denoising. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Calgary, AB, Canada, IEEE, 2018:5069-5073
|
[7] |
Weninger F, Hershey J R, Le Roux J et al. Discriminatively trained recurrent neural networks for single-channel speech separation. 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, USA, IEEE, 2014:577-581
|
[8] |
Macartney C, Weyde T. Improved speech enhancement with the wave-u-net. arXiv:1811.11307, 2018
|
[9] |
Yin D, Luo C, Xiong Z et al. Phasen:A phase-and-harmonics-aware speech enhancement network. Proceedings of the AAAI Conference on Artificial Intelligence, 2020; 34(5):9458-9465
|
[10] |
Defossez A, Synnaeve G, Adi Y. Real time speech enhancement in the waveform domain. arXiv:2006.12847, 2020
|
[11] |
Weninger F, Erdogan H, Watanabe S et al. Speech enhancement with lstm recurrent neural networks and its application to noise-robust asr. Latent Variable Analysis and Signal Separation, Cham, Springer International Publishing, 2015:91-99
|
[12] |
Pascual S, Bonafonte A, Serrà J. Segan:Speech enhancement generative adversarial network. Interspeech 2017, Stockholm, Sweden, International Speech Communication Association, 2017:3642-3646
|
[13] |
Shamma S, Fritz J. Adaptive auditory computations. Curr. Opin. Neurobiol., 2014; 25:164-168
|
[14] |
Zion Golumbic E, Ding N, Bickel S et al. Mechanisms underlying selective neuronal tracking of attended speech at a "cocktail party". Neuron, 2013; 77(5):980-991
|
[15] |
Pandey A, Wang D. Densely connected neural network with dilated convolutions for real-time speech enhancement in the time domain. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:6629-6633
|
[16] |
Kim J, El-Khamy M, Lee J. T-gsa:Transformer with gaussian-weighted self-attention for speech enhancement. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:6649-6653
|
[17] |
Wang K, He B, Zhu W-P. Tstnn:Two-stage transformer based neural network for speech enhancement in the time domain. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Toronto, ON, Canada, IEEE, 2021:7098-7102
|
[18] |
Li P, Jiang Z, Yin S et al. Pagan:A phase-adapted generative adversarial networks for speech enhancement. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:6234-6238
|
[19] |
Jia H, Wang W, Mei S. Combining adaptive sparse nmf feature extraction and soft mask to optimize dnn for speech enhancement. Appl. Acoust., 2021; 171:107666
|
[20] |
Liao C-F, Tsao Y, Lee H-Y et al. Noise adaptive speech enhancement using domain adversarial training. Interspeech 2019, Graz, Austria, International Speech Communication Association, 2019:3148-3152
|
[21] |
Hou J, Zhao S. A real-time speech enhancement algorithm based on convolutional recurrent network and wiener filter. 2021 IEEE 6th International Conference on Computer and Communication Systems (ICCCS), Chengdu, China, IEEE, 2021:683-688
|
[22] |
Shu X, Zhou Y, Cao Y. A progressive enhancement method for noisy and reverberant speech. 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Shanghai, China, IEEE, 2018:1-5
|
[23] |
Hao X, Su X, Wen S et al. Masking and inpainting:A two-stage speech enhancement approach for low snr and non-stationary noise. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:6959-6963
|
[24] |
Chiluveru S R, Tripathy M. Nonstationary noise reduction in low snr speech signals with wavelet coefficient feature. 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, IEEE, 2020:647-653
|
[25] |
Zhao B, Li Q, Lv Q et al. A spectrum adaptive segmentation empirical wavelet transform for noisy and nonstationary signal processing. IEEE Access, 2021; 9:106375-106386
|
[26] |
Zão L, Coelho R. On the estimation of fundamental frequency from nonstationary noisy speech signals based on the hilbert-huang transform. IEEE Signal Process. Lett., 2018; 25(2):248-252
|
[27] |
Medina C, Coelho R, Zão L. Impulsive noise detection for speech enhancement in hht domain. IEEE/ACM Trans. Audio Speech Lang. Process., 2021; 29:2244-2253
|
[28] |
Xu Z, Jiang T, Li C et al. An attention-augmented fully convolutional neural network for monaural speech enhancement. 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, IEEE, 2021:1-5
|
[29] |
Pandey A, Wang D. Dense cnn with self-attention for time-domain speech enhancement. IEEE/ACM Trans. Audio Speech Lang. Process., 2021; 29:1270-1279
|
[30] |
Hwang J W, Park R H, Park H M. Efficient audio-visual speech enhancement using deep u-net with early fusion of audio and video information and rnn attention blocks. IEEE Access, 2021; 9:137584-137598
|
[31] |
Stoller D, Ewert S, Dixon S. Wave-u-net:A multi-scale neural network for end-to-end audio source separation. 19th International Society for Music Information Retrieval Conference (ISMIR 2018)
|
[32] |
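Chang X X, Zhang Y, Yang L et al. Speech enhancement method incorporating a multi-head self-attention mechanism. Journal of Xidian University, 2020; 47(1):104-110 (in Chinese)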
|
[33] |
Liu G, Gong K, Liang X et al. Cp-gan:Context pyramid generative adversarial network for speech enhancement. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:6624-6628
|
[34] |
Dauphin Y N, Fan A, Auli M et al. Language modeling with gated convolutional networks. Proceedings of the 34th International Conference on Machine Learning - Volume 70, Sydney, NSW, Australia, JMLR.org, 2017:933-941
|
[35] |
Chen J, Mao Q, Liu D. Dual-path transformer network:Direct context-aware modeling for end-to-end monaural speech separation. Interspeech 2020, Shanghai, China, International Speech Communication Association, 2020:2642-2646
|
[36] |
Luo Y, Chen Z, Yoshioka T. Dual-path rnn:Efficient long sequence modeling for time-domain single-channel speech separation. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Barcelona, Spain, IEEE, 2020:46-50
|
[37] |
Vaswani A, Shazeer N, Parmar N et al. Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, California, USA, Curran Associates Inc., 2017:6000-6010
|
[38] |
Martin-Doñas J M, Gomez A M, Gonzalez J A et al. A deep learning loss function based on the perceptual evaluation of the speech quality. IEEE Signal Process. Lett., 2018; 25(11):1680-1684
|
[39] |
Valentini-Botinhao C, Wang X, Takaki S et al. Investigating rnn-based speech enhancement methods for noise-robust text-to-speech. 9th ISCA Speech Synthesis Workshop, 2016:146-152
|
[40] |
Veaux C, Yamagishi J, King S. The voice bank corpus:Design, collection and data analysis of a large regional accent speech database. 2013 International Conference Oriental COCOSDA Held Jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), Gurgaon, India, IEEE, 2013:1-4
|
[41] |
Thiemann J, Ito N, Vincent E. The diverse environments multi-channel acoustic noise database (demand):A database of multichannel environmental noise recordings. J. Acoust. Soc. Am., 2013; 133(5):3591
|
[42] |
Reddy C K A, Dubey H, Gopal V et al. Icassp 2021 deep noise suppression challenge. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Toronto, ON, Canada, IEEE, 2021:6623-6627
|
[43] |
Xu Y, Wang C L. Speech signal enhancement technology and its applications. Beijing: Science Press, 2013:195-229 (in Chinese)
|
[44] |
Hu Y, Loizou P C. Evaluation of objective quality measures for speech enhancement. IEEE Trans. Audio Speech Lang. Process., 2008; 16(1):229-238
|
[45] |
https://github.com/IMLHF/Speech-Enhancement-Measures
|
[46] |
Takahashi N, Agrawal P, Goswami N et al. Phasenet:Discretized phase modeling with deep neural networks for audio source separation. Interspeech 2018, Hyderabad, India, International Speech Communication Association, 2018:2713-2717
|
[47] |
Soni M H, Shah N, Patil H A. Time-frequency masking-based speech enhancement using generative adversarial network. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Calgary, AB, Canada, IEEE, 2018:5039-5043
|
[48] |
Fu S-W, Liao C-F, Tsao Y et al. Metricgan:Generative adversarial networks based black-box metric scores optimization for speech enhancement. International Conference on Machine Learning, PMLR, 2019:2031-2041
|
[49] |
Choi H-S, Kim J-H, Huh J et al. Phase-aware speech enhancement with deep complex u-net. ICLR 2019 Conference, New Orleans, Louisiana, United States, 2019
|