Proceedings of the 2nd International Conference on Information, Electronics and Computer

Auditory feature for monaural speech segregation

Authors
Yi Jiang, Runsheng Liu, Yuanyuan Zu
Corresponding Author
Yi Jiang
Available Online March 2014.
DOI
https://doi.org/10.2991/icieac-14.2014.16
Keywords
gammatone frequency cepstral coefficients (GFCC); monaural speech segregation; binary classification; time-frequency (T-F) unit
Abstract
Monaural speech segregation is a very challenging problem in speech signal processing. Applying the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in signal-to-noise ratio (SNR) and intelligibility. In this paper, we use gammatone frequency cepstral coefficients (GFCC), an auditory feature computed at the time-frequency (T-F) unit level, to estimate the ideal binary mask for monaural speech segregation. The paper reports a successful attempt to use GFCC as the segregation cue with a deep neural network (DNN) classifier. Results show that robust performance can be achieved across noisy and reverberant conditions.
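To make the estimation target concrete, the following is a minimal sketch of how an ideal binary mask is commonly defined: a T-F unit is labeled 1 when its local SNR exceeds a threshold and 0 otherwise. It is an illustration under stated assumptions, not the authors' implementation. The function names, the energy-matrix inputs, and the 0 dB threshold (`lc_db`) are all hypothetical, and the GFCC extraction and DNN classifier described in the abstract are not shown.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask with one entry per T-F unit.

    speech_energy, noise_energy: 2-D lists indexed [channel][frame] holding
    the premixed target and interference energy in each T-F unit
    (hypothetical inputs, assumed to come from a gammatone filterbank).
    lc_db: local SNR criterion in dB (assumed to be 0 dB here).
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)          # no target energy: noise-dominant
            elif n <= 0.0:
                row.append(1)          # no interference: target-dominant
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Keep mixture energy only in the units the mask labels as target."""
    return [[m * e for m, e in zip(m_row, e_row)]
            for m_row, e_row in zip(mask, mixture_energy)]


if __name__ == "__main__":
    speech = [[4.0, 0.5], [1.0, 9.0]]   # toy values, 2 channels x 2 frames
    noise = [[1.0, 2.0], [1.0, 3.0]]
    mask = ideal_binary_mask(speech, noise)
    mixture = [[s + n for s, n in zip(sr, nr)]
               for sr, nr in zip(speech, noise)]
    print(mask)
    print(apply_mask(mixture, mask))
```

In a classification setting such as the one the abstract describes, a mask of this kind would serve as the per-unit training label, with the classifier predicting each 0/1 entry from features of the mixture alone.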
Open Access
This is an open access article distributed under the CC BY-NC license.

Proceedings
2nd International Conference on Information, Electronics and Computer
Part of series
Advances in Intelligent Systems Research
Publication Date
March 2014
ISBN
978-90-78677-99-4
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Yi Jiang
AU  - Runsheng Liu
AU  - Yuanyuan Zu
PY  - 2014/03
DA  - 2014/03
TI  - Auditory feature for monaural speech segregation
BT  - 2nd International Conference on Information, Electronics and Computer
PB  - Atlantis Press
SN  - 1951-6851
UR  - https://doi.org/10.2991/icieac-14.2014.16
DO  - https://doi.org/10.2991/icieac-14.2014.16
ID  - Jiang2014/03
ER  -