Proceedings of the 2017 International Seminar on Artificial Intelligence, Networking and Information Technology (ANIT 2017)

D-vector based speaker verification system using Raw Waveform CNN

Authors
Jeeweon Jung, Heesoo Heo, Ilho Yang, Sunghyun Yoon, Hyejin Shim, Hajin Yu
Corresponding Author
Jeeweon Jung
Available Online December 2017.
DOI
https://doi.org/10.2991/anit-17.2018.21
Keywords
d-vector, speaker verification, raw-audio-CNN
Abstract
In this paper, we propose a d-vector based speaker verification system in which a raw-audio-CNN, rather than a conventional multi-layer perceptron, is used as the d-vector extractor. Because the raw-audio-CNN takes raw waveform signals as input, traditional acoustic feature extraction methods such as mel-frequency cepstral coefficients and mel-filterbank features are no longer needed. Experimental results show that the raw-audio-CNN can successfully perform the functions carried out by traditional acoustic feature extraction methods and that it outperforms traditional d-vector systems that use a standard multi-layer perceptron with acoustic features.
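The abstract does not give the network architecture or the scoring method, so the following is only a minimal sketch of a generic d-vector pipeline operating directly on raw samples, not the authors' system. It assumes the common d-vector recipe: frame-level hidden activations are averaged into one utterance-level vector, and two vectors are compared by cosine similarity. The function names, the frame length and hop, and the tiny convolution-plus-projection "extractor" with random weights are all hypothetical stand-ins for a trained raw-audio-CNN.

```python
import numpy as np


def frame_waveform(wave, frame_len=400, hop=160):
    """Cut a 1-D raw waveform into overlapping frames; no MFCC or filterbank step."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return wave[idx]  # shape: (n_frames, frame_len)


def toy_extractor(frames, kernels, proj):
    """Hypothetical stand-in for a trained raw-audio-CNN (assumption, not the paper's model).

    One 1-D convolution over raw samples, ReLU, global max pooling over time,
    then a linear projection with tanh to give a frame-level embedding.
    """
    k = kernels.shape[1]
    # windows: (n_frames, n_positions, k)
    windows = np.lib.stride_tricks.sliding_window_view(frames, k, axis=1)
    conv = np.einsum("fpk,ck->fpc", windows, kernels)  # (n_frames, n_positions, n_channels)
    pooled = np.maximum(conv, 0.0).max(axis=1)          # (n_frames, n_channels)
    return np.tanh(pooled @ proj)                       # (n_frames, emb_dim)


def extract_dvector(wave, kernels, proj):
    """Average frame-level embeddings into one utterance-level vector, then L2-normalize."""
    emb = toy_extractor(frame_waveform(wave), kernels, proj)
    d = emb.mean(axis=0)
    return d / (np.linalg.norm(d) + 1e-12)


def cosine_score(d_enroll, d_test):
    """Cosine similarity between two unit-length d-vectors; higher means more similar."""
    return float(np.dot(d_enroll, d_test))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kernels = rng.standard_normal((8, 32))   # random weights, for illustration only
    proj = rng.standard_normal((8, 16))
    enroll = extract_dvector(rng.standard_normal(16000), kernels, proj)
    test = extract_dvector(rng.standard_normal(16000), kernels, proj)
    print("score:", cosine_score(enroll, test))
```

In a real system the extractor would be a network trained for speaker classification, and the verification decision would compare the score against a threshold tuned on held-out trials; both of those are outside what the abstract states.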
Open Access
This is an open access article distributed under the CC BY-NC license.

Cite this article

TY  - CONF
AU  - Jeeweon Jung
AU  - Heesoo Heo
AU  - Ilho Yang
AU  - Sunghyun Yoon
AU  - Hyejin Shim
AU  - Hajin Yu
PY  - 2017/12
DA  - 2017/12
TI  - D-vector based speaker verification system using Raw Waveform CNN
BT  - 2017 International Seminar on Artificial Intelligence, Networking and Information Technology (ANIT 2017)
PB  - Atlantis Press
UR  - https://doi.org/10.2991/anit-17.2018.21
DO  - https://doi.org/10.2991/anit-17.2018.21
ID  - Jung2017/12
ER  -