Spam filtering using a logistic regression model trained by an artificial bee colony algorithm


Dedeturk B. K., AKAY B.

APPLIED SOFT COMPUTING, vol.91, 2020 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 91
  • Publication Date: 2020
  • DOI Number: 10.1016/j.asoc.2020.106229
  • Title of Journal: APPLIED SOFT COMPUTING
  • Keywords: Spam filtering, Artificial bee colony, Naive Bayes, Support vector machines, Logistic regression, Turkish emails, Negative selection algorithm

Abstract

Email spam is a serious problem that annoys recipients and wastes their time. Machine-learning methods are prevalent in spam detection systems owing to their efficiency in classifying mail as solicited or unsolicited. However, existing spam detection techniques usually suffer from low detection rates and cannot efficiently handle high-dimensional data. Therefore, we propose a novel spam detection method that combines the artificial bee colony algorithm with a logistic regression classification model. The empirical results on three publicly available datasets (Enron, CSDMC2010, and TurkishEmail) show that the proposed model can handle high-dimensional data thanks to its highly effective local and global search abilities. We compare the proposed model's spam detection performance with that of support vector machine, logistic regression, and naive Bayes classifiers, as well as with the performance of state-of-the-art methods reported in previous studies. We observe that the proposed method outperforms the other spam detection techniques considered in this study in terms of classification accuracy. © 2020 Elsevier B.V. All rights reserved.
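
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: the weights of a logistic regression classifier are searched by a basic artificial bee colony loop instead of by gradient descent. Everything here is an assumption made for illustration and not the authors' method: the function names (abc_train, log_loss, predict), the parameter values (n_sources, limit, cycles, bound), the simplified onlooker selection, and the two-feature toy data, which is unrelated to the Enron, CSDMC2010, or TurkishEmail datasets.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_loss(w, X, y):
    # Mean cross-entropy of a logistic regression model; w[0] is the bias.
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1.0 - p + eps)
    return total / len(X)

def predict(w, x):
    # Label 1 (spam) when the predicted probability is at least 0.5.
    return 1 if sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) >= 0.5 else 0

def abc_train(X, y, n_sources=20, limit=30, cycles=200, bound=5.0, seed=0):
    # Hypothetical parameter choices; each food source is one weight vector.
    rng = random.Random(seed)
    dim = len(X[0]) + 1

    def new_source():
        return [rng.uniform(-bound, bound) for _ in range(dim)]

    sources = [new_source() for _ in range(n_sources)]
    costs = [log_loss(s, X, y) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Perturb one coordinate relative to a random other source,
        # then keep the candidate only if it lowers the loss (greedy selection).
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = list(sources[i])
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        cand[d] = max(-bound, min(bound, cand[d]))
        c = log_loss(cand, X, y)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        # Employed bee phase: every source is explored once.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker phase: sources are sampled in proportion to fitness 1 / (1 + cost).
        fitness = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_neighbor(i)
        # Memorize the best weight vector seen so far.
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:
            best_cost, best = costs[i_best], list(sources[i_best])
        # Scout phase: re-initialize the most stagnant source once it exceeds the limit.
        stale = max(range(n_sources), key=trials.__getitem__)
        if trials[stale] > limit:
            sources[stale] = new_source()
            costs[stale] = log_loss(sources[stale], X, y)
            trials[stale] = 0
    return best, best_cost

# Toy data (made up): two normalized features per email, label 1 = spam.
X_toy = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.95, 0.6],
         [0.1, 0.2], [0.2, 0.1], [0.05, 0.3], [0.3, 0.05]]
y_toy = [1, 1, 1, 1, 0, 0, 0, 0]

weights, cost = abc_train(X_toy, y_toy)
```

Because the search only evaluates the loss and never differentiates it, the same loop would work with any objective, for example one that adds a regularization penalty. The greedy neighbor step plays the role of local search, and the scout phase provides the global restarts.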