A hybrid CNN-LSTM model for high resolution melting curve classification


ÖZKÖK F. Ö., ÇELİK M.

Biomedical Signal Processing and Control, vol.71, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 71
  • Publication Date: 2022
  • DOI Number: 10.1016/j.bspc.2021.103168
  • Journal Name: Biomedical Signal Processing and Control
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, EMBASE, INSPEC
  • Keywords: Classification, Convolutional neural network, Long short-term memory, Deep learning, Real time PCR, High resolution melting curve, CONVOLUTIONAL NEURAL-NETWORK, POLYMERASE-CHAIN-REACTION, IDENTIFICATION, DIAGNOSIS
  • Erciyes University Affiliated: Yes

Abstract

© 2021 Elsevier Ltd. High resolution melting (HRM) curve analysis is an efficient, accurate, and rapid technique for analyzing real-time polymerase chain reaction (PCR) results. HRM curves are formed as the temperature increases and the amount of fluorescent dye decreases during the real-time PCR process. Their shapes are unique to each species because of the sequence, length, and GC content of the species' DNA. In the literature, HRM curves are usually classified by visual inspection, and only a limited number of data mining methods have been used to classify them. However, classification becomes challenging as the number of species, the number of samples per species, and the number of closely related species increase. In this study, a hybrid classification model based on convolutional neural network (CNN) and long short-term memory (LSTM) models is proposed to classify HRM curves efficiently. In the proposed CNN-LSTM model, the CNN is used for feature extraction and the LSTM for classification. The model takes both the HRM curves and their derivative curves as inputs and outputs the predicted species for each HRM curve. Its performance was compared with that of CNN and support vector machine (SVM) approaches. The results show that the proposed CNN-LSTM model outperforms the other models. The accuracy, macro-averaged F1, specificity, precision, and recall of the proposed model were 0.96 ± 0.02, 0.95 ± 0.02, 1 ± 0, 0.96 ± 0.02, and 0.96 ± 0.02, respectively.
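The abstract mentions two things that a short example can make concrete: the derivative curve supplied alongside each HRM curve, and the macro-averaged evaluation metrics. The following is a minimal sketch, not the authors' implementation. It assumes the derivative curve is the conventional negative derivative of fluorescence with respect to temperature (-dF/dT), estimated here by central differences, and that the per-class metrics are computed one-vs-rest and then averaged with equal weight per class. The function names `negative_derivative` and `macro_metrics` are hypothetical.

```python
def negative_derivative(temps, fluorescence):
    """Central-difference estimate of -dF/dT at the interior points.

    Assumption: the paper's "derivative curve" is the usual -dF/dT melt
    curve; the abstract does not state how it was computed.
    """
    return [
        -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def macro_metrics(y_true, y_pred, labels):
    """One-vs-rest precision, recall, specificity and F1, averaged over classes."""
    totals = {"precision": 0.0, "recall": 0.0, "specificity": 0.0, "f1": 0.0}
    pairs = list(zip(y_true, y_pred))
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        tn = len(pairs) - tp - fp - fn
        # Guard against empty denominators by reporting 0.0 for that class.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        totals["precision"] += precision
        totals["recall"] += recall
        totals["specificity"] += specificity
        totals["f1"] += f1
    # Macro average: every class contributes equally, regardless of its size.
    return {name: value / len(labels) for name, value in totals.items()}


if __name__ == "__main__":
    # Toy melt curve: fluorescence drops as temperature rises.
    print(negative_derivative([70, 71, 72, 73], [1.0, 0.9, 0.5, 0.4]))
    # Toy predictions for two hypothetical species "a" and "b".
    print(macro_metrics(["a", "a", "b", "b"], ["a", "b", "b", "b"], ["a", "b"]))
```

With many classes, each one-vs-rest split has far more negatives than positives, so macro-averaged specificity tends to sit closer to 1 than precision or recall. This is one plausible reading of why specificity is reported as 1 ± 0 while the other metrics are about 0.96.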