Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)

Enhancing Spam Filter Using Naive Bayes and Count Vectorizer

Authors
Jiachen Liang¹,*
¹Department of Computer Science, Pennsylvania State University, 201 Old Main, State College, 16802, USA
*Corresponding author. Email: jpl6373@psu.edu
Corresponding Author
Jiachen Liang
Available Online 27 November 2023.
DOI
10.2991/978-94-6463-300-9_58
Keywords
spam filter; Naive Bayes; Count Vectorizer; SVM
Abstract

This study examines advances in email spam filtering, a key component of email security. Because unsolicited spam remains a persistent problem, effective filtering methods are essential. Current approaches include IP address filtering, rule-based filtering, and Naive Bayes classifiers, but these often fail to keep pace with continuously evolving spamming techniques. To address these shortcomings, the study proposes an enhanced spam filtering architecture based on machine learning, comprising expanded spam data collection, data processing, feature extraction via Term Frequency-Inverse Document Frequency (TF-IDF), and classification with models such as Naive Bayes and Support Vector Machines (SVM). A comparative analysis of these classifiers shows that the linear SVM model performs best at spam detection, achieving high accuracy while maintaining balanced precision and recall for both spam and non-spam emails. These findings indicate promising progress in email security. The study concludes by recommending continued research and the incorporation of advanced techniques to improve the accuracy and usability of spam filtering systems.
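As a rough illustration of the kind of classifier the abstract and title refer to, the following is a minimal sketch of a multinomial Naive Bayes spam filter over raw word counts (the count-vectorizer style of feature named in the title). It is not the paper's implementation: the toy messages, the whitespace tokenizer, the function names, and the smoothing value alpha=1.0 are all assumptions made for this example, and the TF-IDF weighting and SVM models mentioned in the abstract are not covered here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on bag-of-words counts.

    alpha is the Laplace smoothing constant (assumed value, not from the paper).
    """
    docs_per_class = Counter(labels)
    word_counts = {label: Counter() for label in docs_per_class}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))

    # The vocabulary is every word seen in any training document.
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)

    return {
        "alpha": alpha,
        "vocab": vocab,
        "word_counts": word_counts,
        "totals": {c: sum(word_counts[c].values()) for c in docs_per_class},
        "log_prior": {
            c: math.log(n / len(labels)) for c, n in docs_per_class.items()
        },
    }


def predict(model, text):
    """Return the class with the highest log posterior score for the text."""
    alpha, vocab = model["alpha"], model["vocab"]
    scores = {}
    for label, log_prior in model["log_prior"].items():
        denom = model["totals"][label] + alpha * len(vocab)
        score = log_prior
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                count = model["word_counts"][label][word]
                score += math.log((count + alpha) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this example only.
train_docs = [
    "win free prize now",
    "free cash offer click now",
    "meeting agenda for monday",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "free prize offer"))
print(predict(model, "monday meeting report"))
```

On this toy data the first message is labeled spam and the second ham. A real filter would need a proper tokenizer, a much larger labeled corpus, and a held-out evaluation of precision and recall, as the abstract describes.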

Copyright
© 2023 The Author(s)
Open Access
This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series
Advances in Computer Science Research
Publication Date
27 November 2023
ISBN
978-94-6463-300-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Jiachen Liang
PY  - 2023
DA  - 2023/11/27
TI  - Enhancing Spam Filter Using Naive Bayes and Count Vectorizer
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 564
EP  - 573
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_58
DO  - 10.2991/978-94-6463-300-9_58
ID  - Liang2023
ER  -