Proceedings of the First International Conference on Advances in Forensics and Cyber Technologies (ICFACT 2025)

An Explainable and Robust Machine Learning Framework for Polymorphic Malware Detection

Authors
U. Abhiram Patel1, *, P. Lakshmi Narayan1, Supriya Goel1, K. V. Pradeepthi1
1Department of Computer Science, C.R.Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS), Hyderabad, India
*Corresponding author. Email: abhirampatel@crraoaimscs.res.in
Corresponding Author
U. Abhiram Patel
Available Online 5 May 2026.
DOI
10.2991/978-94-6239-610-4_41
Keywords
Malware Analysis; Polymorphic Malware; Machine Learning; Feature Extraction; Explainability; SHAP
Abstract

Polymorphic malware analysis has become a critical problem in cybersecurity, and applying machine learning to it is proving useful as traditional signature-based methods fail. Because malware evolves continuously, the machine learning algorithms used must generalize well. In this paper, we use three datasets of real-world malware samples: the Microsoft Malware dataset, DikeDataset, and the Malware Opcodes-Virus Share dataset. The Random Forest algorithm achieves 82% accuracy and XGBoost achieves 85% accuracy. Performance was evaluated using various cross-validation techniques. The SHAP explainability algorithm was also applied to understand which features contribute most to model performance. Our proposed algorithm generalizes well and provides good accuracy when tested on unseen data.
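
As an illustration of the attribution idea behind SHAP, the following minimal sketch computes exact Shapley values by brute force for a toy scoring function. It is not the authors' pipeline: the feature names (opcode_entropy, section_count, import_count), the toy_score function and its weights are all hypothetical, and the SHAP library itself relies on faster model-specific or approximate estimators rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features.

    value_fn takes a frozenset of "present" feature names and returns a score.
    """
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            # Shapley weight for a coalition of size k that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result


def toy_score(present):
    """Hypothetical stand-in for a model score; weights are made up."""
    score = 0.0
    if "opcode_entropy" in present:
        score += 0.5
    if "section_count" in present:
        score += 0.2
    # Interaction term: only counts when both features are present
    if "opcode_entropy" in present and "import_count" in present:
        score += 0.2
    return score


FEATURES = ["opcode_entropy", "section_count", "import_count"]
phi = shapley_values(toy_score, FEATURES)

for name in FEATURES:
    print(f"{name}: {phi[name]:.3f}")
```

In this toy setting the interaction term is split evenly between the two features involved, and the attributions sum to the difference between the score with all features present and the score with none, which is the property that makes Shapley-based explanations additive.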

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the First International Conference on Advances in Forensics and Cyber Technologies (ICFACT 2025)
Series
Advances in Computer Science Research
Publication Date
5 May 2026
ISBN
978-94-6239-610-4
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - U. Abhiram Patel
AU  - P. Lakshmi Narayan
AU  - Supriya Goel
AU  - K. V. Pradeepthi
PY  - 2026
DA  - 2026/05/05
TI  - An Explainable and Robust Machine Learning Framework for Polymorphic Malware Detection
BT  - Proceedings of the First International Conference on Advances in Forensics and Cyber Technologies (ICFACT 2025)
PB  - Atlantis Press
SP  - 477
EP  - 487
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6239-610-4_41
DO  - 10.2991/978-94-6239-610-4_41
ID  - Patel2026
ER  -