Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)

Secure Deepfake Audio Detection with a Soft-Voting Ensemble of PGD-Hardened Heterogeneous Models

Authors
Aisha Tasnim Aishy¹, Abdur Rahman Wahid¹, Rafshia Mahbuba Ayshe¹, M. Shahriar Mahmud Rafi¹*, Mohammed Maruf Hossen¹, Fairuz Nowshin Tohfa¹
¹ East Delta University, Chittagong, Bangladesh
*Corresponding author. Email: shahriarrafi30@gmail.com
Available Online 8 June 2026.
DOI
10.2991/978-94-6239-664-7_64
Keywords
Deepfake Audio Detection; Ensemble Learning; Adversarial Robustness; Projected Gradient Descent (PGD); Soft Voting; Mel-Spectrograms; Audio Classification
Abstract

This study introduces a dependable method for detecting deepfake audio by combining multiple deep learning models into a single, unified system. The approach integrates two ResNet models and one CNN model, using a soft-voting strategy to merge their predictions and achieve higher overall accuracy and stability. To defend against adversarial attacks (small input perturbations crafted to fool the system), we employ adversarial training with the Projected Gradient Descent (PGD) method. This process strengthens the models by helping them learn more robust features, making the system significantly harder to bypass. In testing, our method achieved an accuracy of 89.00% and an F1-score of 90.04%, representing a 3.83% improvement over the strongest individual model. Moreover, the system demonstrated strong resistance to PGD attacks, with an attack success rate of only 0.16%. By combining diverse model architectures and incorporating proactive defenses, this research offers a practical and trustworthy solution for deepfake audio detection, contributing to greater security and authenticity in digital communications.
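
As an illustration of the soft-voting step mentioned in the abstract, the following minimal Python sketch averages the per-class probabilities of three models and picks the class with the highest mean. It is not the authors' implementation: the function name `soft_vote`, the equal default weights, the `[real, fake]` class order, and the probability values are all assumptions made for this example, since the abstract does not specify them.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Average per-class probabilities across models and return the argmax.

    model_probs[i][c] is model i's probability for class c.
    Returns (predicted_class_index, averaged_probabilities).
    """
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same number of classes")
    if weights is None:
        # Assumption: every model gets the same weight.
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")

    total = sum(weights)
    # Weighted mean of each class probability across the models.
    avg = [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=avg.__getitem__)
    return predicted, avg


# Hypothetical outputs for one audio clip, ordered as [real, fake].
resnet_a = [0.55, 0.45]
resnet_b = [0.55, 0.45]
cnn = [0.05, 0.95]

label, avg = soft_vote([resnet_a, resnet_b, cnn])
print(label, [round(p, 4) for p in avg])
```

In this made-up case two of the three models lean slightly toward "real", so a hard majority vote would output "real", while soft voting outputs "fake" (index 1) because the third model is far more confident. That sensitivity to confidence is the general motivation for averaging probabilities instead of counting votes.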

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
8 June 2026
ISBN
978-94-6239-664-7
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Aisha Tasnim Aishy
AU  - Abdur Rahman Wahid
AU  - Rafshia Mahbuba Ayshe
AU  - M. Shahriar Mahmud Rafi
AU  - Mohammed Maruf Hossen
AU  - Fairuz Nowshin Tohfa
PY  - 2026
DA  - 2026/06/08
TI  - Secure Deepfake Audio Detection with a Soft-Voting Ensemble of PGD-Hardened Heterogeneous Models
BT  - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB  - Atlantis Press
SP  - 932
EP  - 946
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-664-7_64
DO  - 10.2991/978-94-6239-664-7_64
ID  - Aishy2026
ER  -