Proceedings of the 2015 International Conference on Education, Management and Computing Technology

Query-by-example spoken term detection based on phonetic posteriorgram

Authors
Beili Song, Weiqiang Zhang, Meng Cai, Jia Liu, Michael T. Johnson
Corresponding Author
Beili Song
Available Online June 2015.
DOI
10.2991/icemct-15.2015.256
Keywords
query-by-example; spoken term detection; softmax output features; dynamic time warping.
Abstract

Spoken term detection in low-resource situations is a challenging problem because traditional large vocabulary continuous speech recognition (LVCSR) approaches are often unusable. This paper introduces a method that uses deep neural network (DNN) softmax outputs as input features in a query-by-example (QBE) spoken term detection (STD) system. Matches between queries and test utterances are located using a modified dynamic time warping (DTW) search approach. Subsystems are built with unsupervised Gaussian mixture model (GMM) and DNN monophone models trained on Chinese and English and are evaluated on the SWS 2013 multilingual database of low-resource languages. The score-level fusion of these different subsystems is shown to improve performance significantly over the baseline results.
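The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' DTW modification, so the code below uses a generic subsequence DTW (free start and end in the utterance) and is not the paper's method. The frame distance (negative log of the inner product of two posterior vectors), the normalization by query length, the weighted-sum fusion, and all function names are assumptions chosen for illustration.

```python
import math

def frame_distance(p, q, eps=1e-10):
    """Assumed distance: negative log inner product of two posterior vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))

def subsequence_dtw(query, utterance):
    """Return (normalized_cost, start_frame, end_frame) of the best match.

    The whole query is aligned, but the match may begin and end at any
    frame of the utterance. Inputs are lists of posterior vectors.
    """
    n, m = len(query), len(utterance)
    inf = float("inf")
    # cost[i][j]: accumulated cost of aligning query[0..i] with a path ending at utterance[j]
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: utterance frame at which that path began
    start = [[0] * m for _ in range(n)]
    for j in range(m):
        # Free start: the first query frame may match any utterance frame.
        cost[0][j] = frame_distance(query[0], utterance[j])
        start[0][j] = j
    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + frame_distance(query[i], utterance[0])
        start[i][0] = 0
        for j in range(1, m):
            best_cost, best_start = min(
                (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal step
                (cost[i - 1][j], start[i - 1][j]),          # query advances
                (cost[i][j - 1], start[i][j - 1]),          # utterance advances
            )
            cost[i][j] = best_cost + frame_distance(query[i], utterance[j])
            start[i][j] = best_start
    # Free end: pick the utterance frame with the lowest cost in the last row.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end] / n, start[n - 1][end], end

def fuse_scores(scores, weights):
    """Assumed score-level fusion: weighted sum of per-subsystem scores."""
    return sum(w * s for w, s in zip(weights, scores))
```

In this sketch a lower cost means a better match. Each subsystem (for example, one per posteriorgram type) would produce its own cost for a query and utterance pair, and `fuse_scores` combines them; in practice the scores would likely need normalization before fusion, which is omitted here.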

Copyright
© 2015, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2015 International Conference on Education, Management and Computing Technology
Series
Advances in Social Science, Education and Humanities Research
Publication Date
June 2015
ISBN
978-94-62520-82-0
ISSN
2352-5398

Cite this article

TY  - CONF
AU  - Beili Song
AU  - Weiqiang Zhang
AU  - Meng Cai
AU  - Jia Liu
AU  - Michael T. Johnson
PY  - 2015/06
DA  - 2015/06
TI  - Query-by-example spoken term detection based on phonetic posteriorgram
BT  - Proceedings of the 2015 International Conference on Education, Management and Computing Technology
PB  - Atlantis Press
SP  - 1251
EP  - 1256
SN  - 2352-5398
UR  - https://doi.org/10.2991/icemct-15.2015.256
DO  - 10.2991/icemct-15.2015.256
ID  - Song2015/06
ER  -