Deeptaxim: Comprehensive classification analysis for taxonomic datasets using image-based deep-learning models

Ciftcioglu, U.; NALBANTOĞLU, ÖZKAN

doi:10.1016/j.compbiolchem.2026.109063

Ciftcioglu U. G. E., NALBANTOĞLU Ö. U.

Computational Biology and Chemistry, vol. 124, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 124
Publication Date: 2026
DOI: 10.1016/j.compbiolchem.2026.109063
Journal Name: Computational Biology and Chemistry
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Chemical Abstracts Core, Chimica, Compendex, MEDLINE, zbMATH
Keywords: CNNs, Deep learning, Microbiome data, Taxonomy, Transfer learning, Wellness index
Erciyes University Affiliated: No

Abstract

Advances in deep learning have opened new possibilities for classifying microbiome data, which is complex and highly variable. This work explores deep learning techniques for accurate and reliable classification of microbiome data, addressing the challenges of high dimensionality and sparsity. Focusing on diseases known to be closely linked with gut microbiome alterations, we convert microbiome data into image format using the hierarchical structure of the taxonomic tree (cladogram). Our proposed model, Deeptaxim, leverages 2D-CNN-based autoencoder, U-Net, and GAN architectures to enhance classification performance across two distinct dataset groups. The primary goals are to (1) utilize cladogram-based image data to capture complex microbial relationships, (2) develop optimized deep learning models for microbiome-based disease classification, (3) assess Deeptaxim's transfer learning capabilities for low-sample datasets, and (4) evaluate its robustness when applied to a broader range of diseases. Our findings demonstrate that using taxa-ordered images instead of tabular taxonomic data, with a CNN as the classifier, leads to superior classification performance compared with conventional methods typically used for taxonomic data. Furthermore, the results show that a model trained on a comprehensive dataset can, through transfer learning, significantly improve classification performance on data with fewer examples or different disease types. Thanks to its neural-network (NN) based framework, the proposed model not only facilitates working with alternative datasets but can also be integrated into other NN-based methods as a head or neck module. Thus, Deeptaxim can be adapted, extended, and ported to serve as a wellness index.
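
To make the idea of a taxa-ordered image concrete, the following is a minimal Python sketch. It is not the Deeptaxim implementation: the abstract does not specify how the cladogram is rasterized, so the function name `taxa_ordered_image`, the semicolon-separated lineage format, the row-major square layout, the zero padding, and the sample abundances are all assumptions made for illustration. The sketch only shows one simple way to place taxa that share a parent next to each other before handing the grid to a 2D CNN.

```python
import math


def taxa_ordered_image(abundances, fill=0.0):
    """Lay out taxon abundances on a square grid, ordered by lineage.

    abundances: dict mapping a lineage string such as
    "Bacteria;Firmicutes;Clostridia" to a relative abundance.
    (Hypothetical input format, assumed for this sketch.)
    """
    # Sorting by the list of rank names is a pre-order walk of the
    # taxonomic tree with children visited alphabetically, so sibling
    # taxa end up adjacent in the flattened order.
    ordered = sorted(abundances, key=lambda lineage: lineage.split(";"))

    # Smallest square grid that holds every taxon: side = ceil(sqrt(n)).
    n = len(ordered)
    side = math.isqrt(n - 1) + 1 if n else 0

    # Pad the tail with `fill` so the grid is complete, then cut into rows.
    values = [abundances[name] for name in ordered]
    values += [fill] * (side * side - n)
    image = [values[r * side:(r + 1) * side] for r in range(side)]
    return ordered, image


# Illustrative sample only; these abundances are made up.
sample = {
    "Bacteria;Firmicutes;Clostridia": 0.40,
    "Bacteria;Bacteroidetes;Bacteroidia": 0.33,
    "Bacteria;Firmicutes;Bacilli": 0.15,
    "Bacteria;Proteobacteria;Gammaproteobacteria": 0.10,
    "Archaea;Euryarchaeota;Methanobacteria": 0.02,
}

order, image = taxa_ordered_image(sample)
for row in image:
    print(row)
```

In this layout the two Firmicutes classes occupy neighbouring cells, which is the kind of locality a convolutional filter can exploit. A real pipeline would likely use a fixed taxon ordering shared across all samples, so that each pixel position always refers to the same taxon.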