Speech Emotion Recognition Using MFCC Audio Features: A Comparative Machine Learning Approach
- DOI
- 10.2991/978-94-6239-664-7_85
- Keywords
- Emotion recognition; MFCC; machine learning; deep learning; ensemble learning; voice analysis
- Abstract
Emotion recognition from speech allows machines to understand and respond to human emotions, which makes it important for applications such as virtual assistants, healthcare diagnosis, and customer care automation. In this paper, we propose an effective and scalable emotion recognition system based on audio feature extraction and machine learning algorithms. We use Mel Frequency Cepstral Coefficients (MFCCs) to extract speech features from audio recordings, providing an efficient representation of vocal emotion. We trained and tested several machine learning models on a labeled dataset: KNN, Logistic Regression, Decision Tree, Random Forest, XGBoost, LightGBM, MLP, and CNN. The proposed system attained its highest accuracy, 97.94%, with a Soft Voting Ensemble, surpassing every individual model and demonstrating the strength of ensemble methods. The experimental results confirm that combining different classifiers significantly improves emotion classification performance. Furthermore, Random Forest, LightGBM, and KNN each reach an accuracy of about 96.91% on their own, indicating the strength of MFCC features. This study contributes to affective computing by providing a complete, data-driven approach to recognizing emotions in speech using both classical and deep learning methods.
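As a rough illustration of the soft voting idea mentioned in the abstract, the sketch below averages the per-class probabilities produced by several classifiers and picks the class with the highest mean, then contrasts that with simple majority (hard) voting. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the emotion labels, the equal default weights, and the probability values are all hypothetical, since the abstract does not specify them.

```python
from collections import Counter
from typing import List, Optional, Sequence


def average_probabilities(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted mean of per-class probabilities across models (soft voting)."""
    if not model_probs:
        raise ValueError("need probabilities from at least one model")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same number of classes")
    if weights is None:
        # Assumption: every model gets equal weight.
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")
    total = float(sum(weights))
    return [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]


def soft_vote_label(
    model_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Return the label whose averaged probability is highest."""
    avg = average_probabilities(model_probs, weights)
    best = max(range(len(avg)), key=lambda c: avg[c])
    return labels[best]


def hard_vote_label(
    model_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> str:
    """Return the label predicted by the most models (majority voting)."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in model_probs]
    winner, _ = Counter(votes).most_common(1)[0]
    return labels[winner]


if __name__ == "__main__":
    # Hypothetical labels and probabilities for a single utterance.
    labels = ["angry", "happy", "sad"]
    probs = [
        [0.90, 0.05, 0.05],  # e.g. a confident model
        [0.40, 0.50, 0.10],  # e.g. a less certain model
        [0.40, 0.45, 0.15],  # e.g. another less certain model
    ]
    print("soft vote:", soft_vote_label(probs, labels))
    print("hard vote:", hard_vote_label(probs, labels))
```

With these made-up numbers, two of the three models narrowly prefer "happy", so majority voting returns "happy", while soft voting returns "angry" because the first model's high confidence dominates the average. Letting confident predictions outweigh marginal ones is the usual motivation for soft voting over hard voting.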
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Emran Mahmud
AU  - Md Mahmud Murshid
AU  - Arpita Barua
AU  - Md Shakil Parvez
AU  - Md Sadekur Rahman
PY  - 2026
DA  - 2026/06/08
TI  - Speech Emotion Recognition Using MFCC Audio Features: A Comparative Machine Learning Approach
BT  - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB  - Atlantis Press
SP  - 1261
EP  - 1276
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-664-7_85
DO  - 10.2991/978-94-6239-664-7_85
ID  - Mahmud2026
ER  -