Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)

Speech Emotion Recognition Using MFCC Audio Features: A Comparative Machine Learning Approach

Authors
Emran Mahmud1, *, Md Mahmud Murshid1, Arpita Barua1, Md Shakil Parvez1, Md Sadekur Rahman1
1Department of Computer Science & Engineering, Daffodil International University, Dhaka, Bangladesh
*Corresponding author. Email: mahmud15-6111@diu.edu.bd
Corresponding Author
Emran Mahmud
Available Online 8 June 2026.
DOI
10.2991/978-94-6239-664-7_85
Keywords
Emotion recognition; MFCC; machine learning; deep learning; ensemble learning; voice analysis
Abstract

Recognizing emotion from speech allows machines to understand and respond to human emotions, which matters for applications such as virtual assistants, healthcare diagnosis, and customer care automation. In this paper, we propose an effective and scalable emotion recognition system built on audio feature extraction and machine learning algorithms. We use Mel Frequency Cepstral Coefficients (MFCCs) to extract speech features from audio recordings, providing an efficient representation of vocal emotion. We trained and tested eight machine learning models on a labeled dataset: KNN, Logistic Regression, Decision Tree, Random Forest, XGBoost, LightGBM, MLP, and CNN. The proposed system attained its highest accuracy, 97.94%, with a Soft Voting Ensemble, surpassing every individual model. The experimental results indicate that combining different classifiers enhances emotion classification performance. Furthermore, Random Forest, LightGBM, and KNN each perform well individually, with an accuracy of around 96.91%, indicating the strength of MFCC features. This study contributes to affective computing by providing a complete, data-driven approach to recognizing emotions in speech using both conventional and deep learning methods.
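
The soft voting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `soft_vote` and `predict`, the three emotion labels, and the probability values are all hypothetical, and equal model weights are assumed unless weights are supplied. The idea is that each classifier outputs a probability per class, the probabilities are averaged across classifiers, and the class with the highest average is chosen.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the (weighted) mean class probabilities for one sample."""
    if not model_probs:
        raise ValueError("need probabilities from at least one model")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same set of classes")
    if weights is None:
        weights = [1.0] * len(model_probs)  # assumption: equal weights
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]


def predict(
    model_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the label whose averaged probability is largest."""
    averaged = soft_vote(model_probs, weights)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best]


# Hypothetical labels and probabilities, not taken from the paper.
LABELS = ["angry", "happy", "sad"]
SAMPLE = [
    [0.90, 0.05, 0.05],  # model A is confident in "angry"
    [0.40, 0.50, 0.10],  # model B leans slightly towards "happy"
    [0.35, 0.45, 0.20],  # model C leans slightly towards "happy"
]

if __name__ == "__main__":
    print(soft_vote(SAMPLE))
    print(predict(SAMPLE, LABELS))
```

In this made-up sample a majority (hard) vote would pick "happy", since two of the three models rank it first, while soft voting picks "angry" because one model's high confidence outweighs the two narrow preferences. In practice a library implementation such as scikit-learn's `VotingClassifier` with `voting="soft"` would normally be used instead of hand-written averaging.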

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
8 June 2026
ISBN
978-94-6239-664-7
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Emran Mahmud
AU  - Md Mahmud Murshid
AU  - Arpita Barua
AU  - Md Shakil Parvez
AU  - Md Sadekur Rahman
PY  - 2026
DA  - 2026/06/08
TI  - Speech Emotion Recognition Using MFCC Audio Features: A Comparative Machine Learning Approach
BT  - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB  - Atlantis Press
SP  - 1261
EP  - 1276
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-664-7_85
DO  - 10.2991/978-94-6239-664-7_85
ID  - Mahmud2026
ER  -