Enhancing Spam Filter Using Naive Bayes and Count Vectorizer

Jiachen Liang

doi:10.2991/978-94-6463-300-9_58

Authors

Jiachen Liang¹,*

¹Department of Computer Science, Pennsylvania State University, 201 Old Main, 16802, State College, USA

*Corresponding author. Email: jpl6373@psu.edu

Available Online 27 November 2023.

DOI: 10.2991/978-94-6463-300-9_58
Keywords: spam filter; Naive Bayes; Count Vectorizer; SVM
Abstract: This study examines improvements to email spam filtering, a core component of email security. Because unsolicited spam remains a persistent problem, effective filtering methods are essential. Current approaches include IP address filtering, rule-based filtering, and Naive Bayes classifiers, but these often fail to keep pace with evolving spamming techniques. To address these shortcomings, the study proposes an enhanced, machine-learning-based spam filtering architecture comprising expanded spam data collection, data processing, feature extraction with Term Frequency-Inverse Document Frequency (TF-IDF), and classification with models such as Naive Bayes and Support Vector Machines (SVM). A comparative analysis of these classifiers shows that the linear SVM performs best at spam detection, achieving higher accuracy with balanced precision and recall for both spam and non-spam emails. These results indicate promising progress in email security. The study concludes by recommending continued research and the adoption of more advanced techniques to improve the accuracy and usability of spam filtering systems.
Copyright: © 2023 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
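
The abstract describes a pipeline of text feature extraction followed by a Naive Bayes classifier. As a rough illustration only, the following minimal sketch shows one way a count-based bag-of-words representation can feed a multinomial Naive Bayes model with Laplace smoothing. It is not the author's implementation: it uses raw word counts rather than TF-IDF, omits the SVM comparison, and the class name `MultinomialNaiveBayes`, the helper functions, the smoothing value `alpha=1.0`, and the four toy training messages are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_vectorize(text):
    """Return bag-of-words counts for one message (a minimal count vectorizer)."""
    return Counter(tokenize(text))


class MultinomialNaiveBayes:
    """Hypothetical minimal classifier, written only for illustration."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # number of messages per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(count_vectorize(text))
        # The vocabulary is every word seen in any training message.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        """Unnormalized log posterior of each class for one message."""
        total_docs = sum(self.doc_counts.values())
        features = count_vectorize(text)
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)  # log prior
            for word, n in features.items():
                if word in self.vocab:  # skip words never seen in training
                    likelihood = (self.word_counts[c][word] + self.alpha) / denom
                    score += n * math.log(likelihood)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy data invented for this sketch; not taken from the paper.
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "lunch on monday with the team",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)
prediction_spam = model.predict("claim your free prize")
prediction_ham = model.predict("team meeting on monday")
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that appears in only one class from driving the other class's probability to zero.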

Volume Title: Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
Series: Advances in Computer Science Research
Publication Date: 27 November 2023
ISBN: 978-94-6463-300-9
ISSN: 2352-538X

Cite this article

TY  - CONF
AU  - Jiachen Liang
PY  - 2023
DA  - 2023/11/27
TI  - Enhancing Spam Filter Using Naive Bayes and Count Vectorizer
BT  - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB  - Atlantis Press
SP  - 564
EP  - 573
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-300-9_58
DO  - 10.2991/978-94-6463-300-9_58
ID  - Liang2023
ER  -