A Tool for Cutting Large Speech Corpora: HCI4CS
Lin Guo, Yang Bai, Jie Su, Wenlin Pan, Tianjun Zhang
Available Online January 2016.
- https://doi.org/10.2991/icaita-16.2016.79
- Primi speech; human-computer interaction; speech segmentation
- Large speech corpora can be cut in two ways: traditional manual segmentation and automatic machine segmentation. Manual segmentation makes the quality of the result easy to control, but its shortcomings are obvious: it is inefficient and costly. Automatic segmentation is highly efficient, but the tedious work of finding cutting errors cannot be omitted. This paper therefore presents a human-computer interaction tool for cutting speech corpora (HCI4CS), which provides a segmentation algorithm and parameters to control it, supports correcting errors in the automatic segmentation results, and generates label files for the HTK toolkit. The study used one thousand Primi speech recordings. Using HCI4CS, a person with little prior knowledge of speech corpus segmentation can achieve nearly one hundred percent accuracy.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
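The workflow summarized in the abstract (automatic segmentation under user-controlled parameters, manual correction, then HTK label output) can be illustrated with a minimal sketch. The abstract does not say which segmentation algorithm HCI4CS uses, so the short-time energy threshold below is an assumed stand-in, and every function and parameter name (`segment_by_energy`, `correct_segment`, `to_htk_label_lines`, `frame_ms`, `threshold`, `min_silence_frames`) is hypothetical. The only fixed detail is that HTK label files give start and end times in units of 100 ns.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec)


def segment_by_energy(samples: Sequence[float], sample_rate: int,
                      frame_ms: float = 10.0, threshold: float = 0.01,
                      min_silence_frames: int = 3) -> List[Segment]:
    """Assumed stand-in algorithm: mark frames whose mean energy exceeds
    `threshold`, and close a segment after `min_silence_frames` quiet frames."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    segments: List[Segment] = []
    start = None   # index of the first frame of the open segment
    silence = 0    # consecutive quiet frames seen inside the open segment
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                end = i - silence + 1  # first quiet frame after the speech
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start, silence = None, 0
    if start is not None:  # signal ended while a segment was still open
        end = n_frames - silence
        segments.append((start * frame_len / sample_rate,
                         end * frame_len / sample_rate))
    return segments


def correct_segment(segments: List[Segment], index: int,
                    new_start: float, new_end: float) -> List[Segment]:
    """Manual correction step: replace one boundary pair chosen by the user."""
    fixed = list(segments)
    fixed[index] = (new_start, new_end)
    return fixed


def to_htk_label_lines(segments: List[Segment], label: str = "speech") -> List[str]:
    """Format segments as HTK label lines: start end label, times in 100 ns units."""
    return [f"{round(s * 1e7)} {round(e * 1e7)} {label}" for s, e in segments]
```

In a tool of this kind, the user would adjust `threshold` and `min_silence_frames`, inspect the proposed boundaries, fix any wrong ones with something like `correct_segment`, and only then write the label lines to a file.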
Cite this article
TY - CONF
AU - Lin Guo
AU - Yang Bai
AU - Jie Su
AU - Wenlin Pan
AU - Tianjun Zhang
PY - 2016/01
DA - 2016/01
TI - A Tool for Cutting Large Speech Corpora: HCI4CS
BT - Proceedings of the 2016 International Conference on Artificial Intelligence: Technologies and Applications
PB - Atlantis Press
SP - 320
EP - 324
SN - 1951-6851
UR - https://doi.org/10.2991/icaita-16.2016.79
DO - https://doi.org/10.2991/icaita-16.2016.79
ID - Guo2016/01
ER -