Non-negative Tensor Factorization for Speech Enhancement
- 10.2991/icaita-16.2016.5
- non-negative tensor factorization (NTF); speech enhancement; sound source separation
This paper proposes a speech enhancement algorithm based on non-negative tensor factorization (NTF). Adjacent time-frequency matrices in the spectrogram are grouped to form a tensor, which serves as the basic input to the algorithm. NTF is then applied to perform sound source separation between speech and noise. The proposed strategy benefits from both short-time spectral analysis and long-term information. From the standpoint of auditory theory and linguistics, the long-term information preserves the temporal dynamics and intrinsic structure of speech, which are important for the continuity and integrity of hearing. We collected several types of real-life noise and conducted experiments on the TIMIT database. Experimental results show that both the segmental signal-to-noise ratio (SSNR) and the perceptual evaluation of speech quality (PESQ) score improved significantly.
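The two steps named in the abstract, grouping adjacent spectrogram frames into a tensor and factorizing it under non-negativity constraints, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the factorization model or cost function, so the sketch assumes a rank-R CP (PARAFAC) model fitted with multiplicative updates under a Euclidean cost. The names `frames_to_tensor`, `ntf_cp`, `reconstruct`, `width`, and `rank` are hypothetical, and the step that assigns components to speech or noise is not shown.

```python
import numpy as np


def frames_to_tensor(spec, width):
    """Group `width` adjacent frames of a (freq, time) magnitude
    spectrogram into slices of a (freq, width, n_slices) tensor."""
    n_freq, n_frames = spec.shape
    n_slices = n_frames // width  # trailing frames that do not fill a slice are dropped
    trimmed = spec[:, : n_slices * width]
    return trimmed.reshape(n_freq, n_slices, width).transpose(0, 2, 1)


def reconstruct(a, b, c):
    """Rebuild the tensor from CP factors: x[i,j,k] = sum_r a[i,r] b[j,r] c[k,r]."""
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def ntf_cp(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP factorization via multiplicative updates
    (Euclidean cost; an assumed choice, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    # Non-negative random initialisation, one factor matrix per tensor mode.
    a, b, c = (rng.random((dim, rank)) + eps for dim in x.shape)
    for _ in range(n_iter):
        # Each update multiplies by a ratio of non-negative terms,
        # so the factors stay non-negative throughout.
        a *= np.einsum("ijk,jr,kr->ir", x, b, c) / (a @ ((b.T @ b) * (c.T @ c)) + eps)
        b *= np.einsum("ijk,ir,kr->jr", x, a, c) / (b @ ((a.T @ a) * (c.T @ c)) + eps)
        c *= np.einsum("ijk,ir,jr->kr", x, a, b) / (c @ ((a.T @ a) * (b.T @ b)) + eps)
    return a, b, c
```

Under these assumptions, a magnitude spectrogram `spec` of shape (frequency, time) would be processed as `a, b, c = ntf_cp(frames_to_tensor(spec, width=8), rank=20)`, where the values 8 and 20 are arbitrary placeholders. The columns of `a` then act as spectral bases, the columns of `b` as short temporal patterns within a slice, and the columns of `c` as activations across slices, which is one way the long-term structure mentioned above could enter the model.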
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Liang He
AU - Weiqiang Zhang
AU - Mengnan Shi
PY - 2016/01
DA - 2016/01
TI - Non-negative Tensor Factorization for Speech Enhancement
BT - Proceedings of the 2016 International Conference on Artificial Intelligence: Technologies and Applications
PB - Atlantis Press
SP - 18
EP - 22
SN - 1951-6851
UR - https://doi.org/10.2991/icaita-16.2016.5
DO - 10.2991/icaita-16.2016.5
ID - He2016/01
ER -