Neural Computing and Applications, 2024 (SCI-Expanded)
Spam emails are sent to recipients for advertising and phishing purposes. In either case, they disturb recipients and reduce communication quality. Addressing this issue requires classifying emails on servers as either spam or ham. Numerous methods have been proposed for this classification task. Among them, logistic regression (LR) stands out for its simplicity, speed, and ease of implementation. However, LR suffers from low detection rates caused by the gradient descent algorithm used in its training phase. To overcome this limitation, we propose a novel method based on the clonal selection algorithm (CSA), which is known for its success in optimization problems owing to its local and global search capabilities. Despite its effective optimization performance, CSA suffers from limited robustness and slow training. Therefore, the CSA and artificial bee colony (ABC) algorithms are hybridized to improve CSA’s robustness and parallelized to reduce the training time significantly. This hybrid method optimizes the weights of LR by minimizing the cost at the output of LR. The empirical results indicate that the proposed method, named CSA–ABC–LR, yields better classification performance than the state-of-the-art models reported in previous studies, achieving an accuracy of 99.13% on the Enron-1 dataset, 99.22% on the CSDMC2010 dataset, and 94.49% on the Spambase dataset.
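The idea of replacing gradient descent with a population-based search over the LR weights can be illustrated with a minimal sketch. It is not the CSA–ABC–LR method itself: it uses a simplified clonal-selection-style loop only, with no ABC hybridization and no parallelization, and it assumes the cost being minimized is the mean cross-entropy of the LR output. The function names and all parameters (`pop_size`, `n_select`, `n_clones`, `n_random`, `generations`, the mutation scale) are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cost(weights, X, y):
    # Mean cross-entropy of the LR output; weights[0] is the bias term.
    eps = 1e-12
    total = 0.0
    for features, label in zip(X, y):
        z = weights[0] + sum(w * f for w, f in zip(weights[1:], features))
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(X)


def clonal_search(X, y, pop_size=20, n_select=5, n_clones=4,
                  n_random=5, generations=60, seed=0):
    # Simplified clonal-selection-style search (assumed, illustrative settings).
    rng = random.Random(seed)
    dim = len(X[0]) + 1  # one weight per feature plus a bias

    def fitness(w):
        return cost(w, X, y)

    def random_antibody():
        return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    population = [random_antibody() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        elites = population[:n_select]
        # Keep the elites unchanged so the best cost never gets worse.
        candidates = [list(w) for w in elites]
        for rank, parent in enumerate(elites):
            # Lower-ranked parents receive larger mutations (local search).
            step = 0.5 * (rank + 1) / n_select
            for _ in range(n_clones):
                candidates.append([w + rng.gauss(0.0, step) for w in parent])
        # Fresh random antibodies keep some global exploration.
        candidates.extend(random_antibody() for _ in range(n_random))
        candidates.sort(key=fitness)
        population = candidates[:pop_size]
    return min(population, key=fitness)
```

Calling `clonal_search(X, y)` with `X` as a list of numeric feature vectors and `y` as a list of 0/1 labels returns a weight vector whose first entry is the bias. Because the search needs only cost evaluations and no gradients, each candidate can be scored independently, which is what makes this family of methods amenable to parallelization.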