Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)

BiLSTM-Based Smishing Detection for Bangla SMS

Authors
Anmay Paul Arpan1, Rajoshree Ghatak1, Md. Mahmudul Hasan1, Anuj Roy1, Md Azijul Haque1, Sadman Sadik Khan1, *
1Daffodil International University, Dhaka, Bangladesh
*Corresponding author. Email: sadman15-13696@diu.edu.bd
Corresponding Author
Sadman Sadik Khan
Available Online 8 June 2026.
DOI
10.2991/978-94-6239-664-7_35
Keywords
Bangla SMS classification; Natural Language Processing (NLP); Bidirectional Long Short-Term Memory (BiLSTM); Smishing detection; SMS-based phishing; Morphologically rich languages; Low-resource language processing
Abstract

Bangla is a morphologically rich, diglossic language, which makes it difficult for Natural Language Processing (NLP), particularly for security applications such as smishing (SMS-based phishing) detection. This paper proposes a Bidirectional Long Short-Term Memory (BiLSTM) model that classifies Bangla SMS messages as normal, promotional, or smishing, using an evenly divided dataset of 2,772 messages. After preprocessing with tokenization, normalization, and padding, the model was trained with the Adam optimizer, a class-weighted loss, and early stopping. In the experiments, the BiLSTM achieved an overall accuracy of 95%, and precision, recall, and F1-score each averaged 0.95. Normal and promotional messages were classified well (F1 = 0.95 and 0.98, respectively), whereas smishing messages attained a precision of 0.98 but a lower recall of 0.89, because some were misclassified as normal. ROC analysis confirmed the model's robustness, with AUC values of 1.00 for the normal and promotional classes and 0.99 for smishing. These results establish a benchmark for Bangla smishing detection and indicate the need for more advanced techniques to further reduce false negatives.
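The abstract gives precision and recall for the smishing class but not its F1-score. As a rough illustration only, the sketch below applies the standard harmonic-mean definition of F1 to the reported, already rounded values, so the result is approximate and is not a figure taken from the paper. The function names `f1_score` and `miss_rate` are hypothetical helpers introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def miss_rate(recall: float) -> float:
    """Fraction of true positives the classifier fails to flag (1 - recall)."""
    return 1.0 - recall


# Smishing-class values as reported in the abstract (rounded to two decimals).
smishing_precision = 0.98
smishing_recall = 0.89

# Derived, approximate quantities; not reported in the abstract itself.
smishing_f1 = f1_score(smishing_precision, smishing_recall)
smishing_missed = miss_rate(smishing_recall)

print(f"Approximate smishing F1: {smishing_f1:.2f}")
print(f"Approximate share of smishing messages missed: {smishing_missed:.2f}")
```

Under these assumptions, the gap between precision and recall indicates that about one in nine smishing messages goes undetected, which is the false-negative problem the abstract highlights.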

Copyright
© 2026 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
8 June 2026
ISBN
978-94-6239-664-7
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Anmay Paul Arpan
AU  - Rajoshree Ghatak
AU  - Md. Mahmudul Hasan
AU  - Anuj Roy
AU  - Md Azijul Haque
AU  - Sadman Sadik Khan
PY  - 2026
DA  - 2026/06/08
TI  - BiLSTM-Based Smishing Detection for Bangla SMS
BT  - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB  - Atlantis Press
SP  - 504
EP  - 515
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-664-7_35
DO  - 10.2991/978-94-6239-664-7_35
ID  - Arpan2026
ER  -