Spam filtering using a logistic regression model trained by an artificial bee colony algorithm


Dedeturk B. K., AKAY B.

APPLIED SOFT COMPUTING, vol.91, 2020 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume Number: 91
  • Publication Date: 2020
  • DOI Number: 10.1016/j.asoc.2020.106229
  • Journal Name: APPLIED SOFT COMPUTING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Keywords: Spam filtering, Artificial bee colony, Naive Bayes, Support vector machines, Logistic regression, Turkish emails, NEGATIVE SELECTION ALGORITHM
  • Erciyes University Affiliated: Yes

Abstract

Email spam is a serious problem that annoys recipients and wastes their time. Machine-learning methods have been prevalent in spam detection systems owing to their efficiency in classifying mail as solicited or unsolicited. However, existing spam detection techniques usually suffer from low detection rates and cannot efficiently handle high-dimensional data. Therefore, we propose a novel spam detection method that combines the artificial bee colony algorithm with a logistic regression classification model. The empirical results on three publicly available datasets (Enron, CSDMC2010, and TurkishEmail) show that the proposed model can handle high-dimensional data thanks to its highly effective local and global search abilities. We compare the proposed model's spam detection performance to those of support vector machine, logistic regression, and naive Bayes classifiers, in addition to the performance of the state-of-the-art methods reported by previous studies. We observe that the proposed method outperforms other spam detection techniques considered in this study in terms of classification accuracy. (C) 2020 Elsevier B.V. All rights reserved.
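
The general idea of training a logistic regression classifier with an artificial bee colony search can be illustrated with a minimal sketch. The abstract does not give the authors' fitness function, parameter settings, or feature extraction, so everything below is an assumption: the names `abc_train`, `log_loss`, `n_sources`, `limit`, `cycles`, and `bound` are hypothetical, the cross-entropy objective is a common default rather than the paper's stated choice, and the two-feature toy data merely stands in for email feature vectors.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_loss(w, X, y):
    # Mean cross-entropy of a logistic regression model; w[0] is the bias.
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithms finite
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def abc_train(X, y, n_sources=10, limit=20, cycles=100, bound=5.0, seed=0):
    # Assumed settings; each "food source" is one candidate weight vector.
    rng = random.Random(seed)
    dim = len(X[0]) + 1

    def new_source():
        return [rng.uniform(-bound, bound) for _ in range(dim)]

    sources = [new_source() for _ in range(n_sources)]
    costs = [log_loss(s, X, y) for s in sources]
    trials = [0] * n_sources
    b = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[b][:], costs[b]

    def try_neighbor(i):
        # Perturb one coordinate relative to another random source,
        # then keep the candidate only if it lowers the loss (greedy selection).
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (cand[d] - sources[k][d])
        c = log_loss(cand, X, y)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        # Employed bee phase: every source gets one local search step.
        for i in range(n_sources):
            try_neighbor(i)
        # Onlooker phase: better sources are sampled more often.
        fitness = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_neighbor(i)
        # Remember the best weight vector seen so far.
        b = min(range(n_sources), key=costs.__getitem__)
        if costs[b] < best_cost:
            best, best_cost = sources[b][:], costs[b]
        # Scout phase: re-initialise the most stagnant source past the limit.
        s = max(range(n_sources), key=trials.__getitem__)
        if trials[s] > limit:
            sources[s] = new_source()
            costs[s] = log_loss(sources[s], X, y)
            trials[s] = 0
    return best, best_cost

# Toy data: two hypothetical features per message, label 1 = spam.
X = [[0.0, 0.2], [0.1, 0.0], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
w, cost = abc_train(X, y)
preds = [int(sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) >= 0.5) for xi in X]
```

In this sketch the employed and onlooker phases supply the local search, while the scout phase supplies global exploration by discarding sources that have stopped improving. No gradient is computed, so the same loop could in principle optimise any loss over the weight vector.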