Secure Deepfake Audio Detection with a Soft-Voting Ensemble of PGD-Hardened Heterogeneous Models
- DOI
- 10.2991/978-94-6239-664-7_64
- Keywords
- Deepfake Audio Detection; Ensemble Learning; Adversarial Robustness; Projected Gradient Descent (PGD); Soft Voting; Mel-Spectrograms; Audio Classification
- Abstract
This study introduces a method for detecting deepfake audio that combines multiple deep learning models into a single system. The approach integrates two ResNet models and one CNN model, merging their predictions with a soft-voting strategy to improve overall accuracy and stability. To defend against adversarial attacks (small input perturbations crafted to fool the detector), we employ adversarial training with the Projected Gradient Descent (PGD) method. This training pushes the models toward more robust features and makes the system significantly harder to bypass. In testing, our method achieved an accuracy of 89.00% and an F1-score of 90.04%, a 3.83% improvement over the strongest individual model. The system also resisted PGD attacks, with an attack success rate of only 0.16%. By combining diverse model architectures with proactive defenses, this research offers a practical solution for deepfake audio detection, contributing to greater security and authenticity in digital communications.
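The two mechanisms named in the abstract, soft voting and PGD, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The equal voting weights, the toy logistic model standing in for the ResNet/CNN detectors, and the names `soft_vote`, `pgd_linf`, `eps`, `alpha`, and `steps` are all assumptions made for illustration, since the abstract does not give these details.

```python
import math


def soft_vote(prob_rows, weights=None):
    """Average per-model class probabilities; return (avg_probs, predicted_class).

    prob_rows: one probability vector per model, e.g. [p_real, p_fake].
    weights:   optional per-model weights (assumed equal if omitted).
    """
    n_models = len(prob_rows)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(prob_rows[0])
    avg = [
        sum(w * row[c] for w, row in zip(weights, prob_rows))
        for c in range(n_classes)
    ]
    pred = max(range(n_classes), key=lambda c: avg[c])
    return avg, pred


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(x, y, w, b, eps, alpha, steps):
    """L-infinity PGD against a toy logistic model p(fake | x) = sigmoid(w.x + b).

    Each step ascends the binary cross-entropy loss along the sign of its
    input gradient, then projects back into the eps-ball around the clean x.
    """
    x_adv = list(x)
    for _ in range(steps):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
        # For binary cross-entropy, d(loss)/dx = (p - y) * w.
        grad = [(p - y) * wi for wi in w]
        # Signed-gradient ascent step.
        x_adv = [xi + alpha * sign(g) for xi, g in zip(x_adv, grad)]
        # Projection: clip each coordinate to [x - eps, x + eps].
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv
```

As a usage example, `soft_vote([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])` averages three models' `[real, fake]` probabilities and picks the class with the highest mean. In PGD adversarial training, perturbed inputs such as those returned by `pgd_linf` would be generated during training and included in the data the models learn from.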
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Aisha Tasnim Aishy
AU - Abdur Rahman Wahid
AU - Rafshia Mahbuba Ayshe
AU - M. Shahriar Mahmud Rafi
AU - Mohammed Maruf Hossen
AU - Fairuz Nowshin Tohfa
PY - 2026
DA - 2026/06/08
TI - Secure Deepfake Audio Detection with a Soft-Voting Ensemble of PGD-Hardened Heterogeneous Models
BT - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB - Atlantis Press
SP - 932
EP - 946
SN - 1951-6851
UR - https://doi.org/10.2991/978-94-6239-664-7_64
DO - 10.2991/978-94-6239-664-7_64
ID - Aishy2026
ER -