Proceedings of the AASRI Winter International Conference on Engineering and Technology (AASRI-WIET 2013)

A Telephone Speech Corpus of China’s Minority languages for Automatic Language Identification

Authors
Xiuhua Zeng, Jian Yang, Libo Zuo, Yonghua Xu
Corresponding Author
Xiuhua Zeng
Available Online December 2013.
DOI
10.2991/wiet-13.2013.47
Keywords
Language identification; Telephone speech; Corpus; Minority languages
Abstract

Research in language identification requires corpora of multilingual speech data to capture the distinguishing information within and across languages. Over the past few decades, many statistical approaches to language identification have been developed using two common public-domain corpora, which consist of telephone speech from about 26 languages and dialects. However, China's minority languages have not so far been used as target languages in published work. In this work, we select nine typical minority languages of China, together with Mandarin, to construct a telephone speech corpus. The minority languages are Naxi, Miao, Bai, Dai, Yi, Zhuang, Uygur, Mongolian and Tibetan, each representing its own minority nationality. The corpus can be used to study, develop, evaluate and compare algorithms for identifying minority languages. Moreover, it should encourage linguistic researchers to pay more attention to the long history and splendid culture of China's national minorities.

Copyright
© 2013, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the AASRI Winter International Conference on Engineering and Technology (AASRI-WIET 2013)
Series
Advances in Intelligent Systems Research
Publication Date
December 2013
ISBN
10.2991/wiet-13.2013.47
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Xiuhua Zeng
AU  - Jian Yang
AU  - Libo Zuo
AU  - Yonghua Xu
PY  - 2013/12
DA  - 2013/12
TI  - A Telephone Speech Corpus of China’s Minority languages for Automatic Language Identification
BT  - Proceedings of the AASRI Winter International Conference on Engineering and Technology (AASRI-WIET 2013)
PB  - Atlantis Press
SP  - 198
EP  - 201
SN  - 1951-6851
UR  - https://doi.org/10.2991/wiet-13.2013.47
DO  - 10.2991/wiet-13.2013.47
ID  - Zeng2013/12
ER  -