TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol. 28, no. 1, pp. 262-274, 2020 (SCI-Expanded)
Lung cancer is one of the deadliest cancer types, and almost 85% of lung cancers are non-small cell lung cancer (NSCLC). In the present study, we investigated classification and feature selection methods for differentiating two subtypes of NSCLC: adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). Major advances in understanding the effects of therapeutic agents suggest that future targeted therapies will be increasingly subtype specific. We obtained positron emission tomography (PET) images of 93 patients with NSCLC, 39 of whom had ADC, while the remaining 54 had SqCC. Random walk segmentation was applied to delineate the three-dimensional tumor volume, and 39 texture features were extracted to discriminate between the tumor subtypes. We examined 11 classifiers with two different feature selection methods, as well as the effect of normalization on accuracy. The classifiers were k-nearest neighbor, logistic regression, support vector machine, Bayesian network, decision tree, radial basis function network, random forest, AdaBoostM1, and three stacking methods. To evaluate prediction accuracy, we performed leave-one-out cross-validation on the dataset. We also considered hyperparameter optimization for these models, performing 10-fold cross-validation separately on each training set. We found that the stacking ensemble classifier that combines decision tree, AdaBoostM1, and logistic regression models through a metalearner was the most accurate method for distinguishing the subtypes of NSCLC, and that normalizing the feature sets improved classification accuracy.
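The evaluation protocol described above (an outer leave-one-out loop, with hyperparameters tuned by 10-fold cross-validation on each training set and normalization fitted on training data only) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: a plain k-nearest-neighbor classifier stands in for the 11 classifiers, the data are synthetic rather than PET texture features, the inner folds are not stratified, and all function names (`zscore_fit`, `select_k`, `nested_loocv`, and so on) are hypothetical. Feature selection and stacking are not reproduced here.

```python
import random
import statistics


def zscore_fit(rows):
    """Per-feature mean and standard deviation, computed from training rows only."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return means, stds


def zscore_apply(row, means, stds):
    """Normalize one sample with previously fitted statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training samples (squared Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    votes = [train_y[i] for i in order[:k]]
    return max(sorted(set(votes)), key=votes.count)


def accuracy(train_idx, test_idx, X, y, k):
    """Normalize with training statistics, then score k-NN on the held-out indices."""
    means, stds = zscore_fit([X[i] for i in train_idx])
    tx = [zscore_apply(X[i], means, stds) for i in train_idx]
    ty = [y[i] for i in train_idx]
    hits = sum(knn_predict(tx, ty, zscore_apply(X[i], means, stds), k) == y[i]
               for i in test_idx)
    return hits / len(test_idx)


def select_k(train_idx, X, y, candidates, n_folds=10):
    """Inner n-fold cross-validation restricted to the training indices."""
    folds = [train_idx[f::n_folds] for f in range(n_folds)]

    def cv_score(k):
        scores = []
        for fold in folds:
            held = set(fold)
            rest = [i for i in train_idx if i not in held]
            scores.append(accuracy(rest, fold, X, y, k))
        return statistics.fmean(scores)

    return max(candidates, key=cv_score)


def nested_loocv(X, y, candidates=(1, 3, 5)):
    """Outer leave-one-out loop; the hyperparameter is chosen afresh for each training set."""
    correct = 0
    for held_out in range(len(X)):
        train_idx = [i for i in range(len(X)) if i != held_out]
        k = select_k(train_idx, X, y, candidates)
        correct += accuracy(train_idx, [held_out], X, y, k) == 1.0
    return correct / len(X)


# Synthetic stand-in data (NOT the study's PET features): two Gaussian classes, 4 features.
rng = random.Random(0)
X = [[rng.gauss(c * 1.5, 1.0) for _ in range(4)] for c in (0, 1) for _ in range(20)]
y = [c for c in (0, 1) for _ in range(20)]
loocv_accuracy = nested_loocv(X, y)
print(f"nested LOOCV accuracy: {loocv_accuracy:.3f}")
```

The held-out patient never influences the normalization statistics or the hyperparameter choice, so the outer accuracy estimate is not biased by tuning on the test sample. Replacing `knn_predict` with another classifier, or with a stacked ensemble, would leave the surrounding loops unchanged.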