A comprehensive survey on optimizing deep learning models by metaheuristics

AKAY, BAHRİYE; KARABOĞA, DERVİŞ; AKAY, RÜŞTÜ

doi:10.1007/s10462-021-09992-0

AKAY B., KARABOĞA D., AKAY R.

ARTIFICIAL INTELLIGENCE REVIEW, vol.55, no.2, pp.829-894, 2022 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 55 Issue: 2
Publication Date: 2022
DOI Number: 10.1007/s10462-021-09992-0
Journal Name: ARTIFICIAL INTELLIGENCE REVIEW
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, Educational research abstracts (ERA), Index Islamicus, INSPEC, Library and Information Science Abstracts, Metadex, Psycinfo, zbMATH, Civil Engineering Abstracts, Library, Information Science & Technology Abstracts (LISTA)
Pages: pp.829-894
Keywords: Deep neural networks, Metaheuristics, Training, Hyper-parameter optimization, Architecture optimization, Feature extraction
Affiliated with Erciyes University: Yes

Abstract

Deep neural networks (DNNs), which are extensions of artificial neural networks, can learn higher levels of a feature hierarchy built from lower-level features by transforming the raw feature space into another, more complex feature space. Although deep networks are successful in a wide range of problems in different fields, several issues affect their overall performance, such as selecting appropriate values for model parameters, deciding the optimal architecture and feature representation, and determining optimal weight and bias values. Recently, metaheuristic algorithms have been proposed to automate these tasks. This survey gives brief information about common basic DNN architectures, including convolutional neural networks, unsupervised pre-trained models, recurrent neural networks and recursive neural networks. We formulate the optimization problems in DNN design, such as architecture optimization, hyper-parameter optimization, training and feature representation level optimization. The encoding schemes used in metaheuristics to represent network architectures are categorized. The evolutionary and selection operators, as well as speed-up methods, are summarized, and the main approaches for validating the results of networks designed by metaheuristics are provided. Moreover, we group the studies on metaheuristics for deep neural networks by the problem type considered and present the datasets most commonly used in these studies. We discuss the pros and cons of utilizing metaheuristics in the deep learning field and give some future directions for connecting metaheuristics and deep learning. To the best of our knowledge, this is the most comprehensive survey of metaheuristics used in the deep learning field.