Proceedings of the 2016 2nd Workshop on Advanced Research and Technology in Industry Applications

Research On Prosody Conversion of Affective Speech Based on LIBSVM and PAD Three Dimensional Emotion Model

Authors
Xiaoyong Lu, Tao Pan
Corresponding Author
Xiaoyong Lu
Available Online May 2016.
DOI
10.2991/wartia-16.2016.1
Keywords
PAD emotion model, five-scale tone model, LIBSVM (Library for Support Vector Machines), support vector regression, generalized regression neural network, prosody conversion.
Abstract

This paper proposes a framework for prosody conversion of emotional speech based on a LIBSVM support vector regression (SVR) model and the PAD three-dimensional emotion model. We design an emotional speech corpus covering 11 kinds of emotional utterances, and each utterance is labeled with its emotional information as PAD values. A five-scale tone model is employed to model the pitch contour of emotional speech at the syllable level. A LIBSVM SVR-based prosody conversion model is proposed to transform the pitch contour, duration and pause duration of emotional speech according to the PAD values of the emotion and the context information of the text. Speech is then re-synthesized with the STRAIGHT algorithm by modifying the pitch contour, duration and pause duration, and the results are compared with those obtained by a generalized regression neural network. Experimental results show that the modified speech achieves an average Emotional Mean Opinion Score (EMOS) of 3.8.
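
The abstract does not state how the five-scale tone model maps pitch onto its scale. As a rough illustration only, the sketch below assumes the commonly used log-frequency normalization, in which an F0 value is placed on a 0 to 5 scale relative to a speaker's pitch range and then quantized to levels 1 to 5. The function names and the example pitch range are hypothetical and are not taken from the paper.

```python
import math


def five_scale_value(f0_hz, f0_min_hz, f0_max_hz):
    """Map an F0 value (Hz) to a continuous 0-5 value on a log-frequency scale.

    Assumption: T = 5 * (lg f0 - lg f_min) / (lg f_max - lg f_min),
    a common five-degree normalization, not necessarily the paper's formula.
    """
    if not 0 < f0_min_hz < f0_max_hz:
        raise ValueError("expected 0 < f0_min_hz < f0_max_hz")
    # Clip to the speaker's range so the result always stays within [0, 5].
    clipped = min(max(f0_hz, f0_min_hz), f0_max_hz)
    span = math.log10(f0_max_hz) - math.log10(f0_min_hz)
    return 5.0 * (math.log10(clipped) - math.log10(f0_min_hz)) / span


def tone_level(f0_hz, f0_min_hz, f0_max_hz):
    """Quantize the continuous value to an integer tone level from 1 to 5."""
    t = five_scale_value(f0_hz, f0_min_hz, f0_max_hz)
    return min(5, int(t) + 1)


def from_five_scale(t, f0_min_hz, f0_max_hz):
    """Invert the mapping: recover F0 in Hz from a continuous 0-5 value."""
    return f0_min_hz * (f0_max_hz / f0_min_hz) ** (t / 5.0)


if __name__ == "__main__":
    # Hypothetical speaker range of 100-400 Hz.
    for f0 in (100.0, 200.0, 400.0):
        print(f0, five_scale_value(f0, 100.0, 400.0), tone_level(f0, 100.0, 400.0))
```

Under this assumption, a regression model would predict scale values per syllable, and the inverse mapping would turn them back into F0 targets for re-synthesis.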

Copyright
© 2016, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2016 2nd Workshop on Advanced Research and Technology in Industry Applications
Series
Advances in Engineering Research
Publication Date
May 2016
ISBN
10.2991/wartia-16.2016.1
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Xiaoyong Lu
AU  - Tao Pan
PY  - 2016/05
DA  - 2016/05
TI  - Research On Prosody Conversion of Affective Speech Based on LIBSVM and PAD Three Dimensional Emotion Model
BT  - Proceedings of the 2016 2nd Workshop on Advanced Research and Technology in Industry Applications
PB  - Atlantis Press
SP  - 1
EP  - 7
SN  - 2352-5401
UR  - https://doi.org/10.2991/wartia-16.2016.1
DO  - 10.2991/wartia-16.2016.1
ID  - Lu2016/05
ER  -