Static facial expression recognition using convolutional neural networks based on transfer learning and hyperparameter optimization


Özcan T., Baştürk A.

MULTIMEDIA TOOLS AND APPLICATIONS, vol.79, pp.26587-26604, 2020 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 79
  • Publication Date: 2020
  • Doi Number: 10.1007/s11042-020-09268-9
  • Journal Name: MULTIMEDIA TOOLS AND APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, FRANCIS, ABI/INFORM, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Page Numbers: pp.26587-26604
  • Keywords: ERUFER dataset, Static facial expression recognition, Emotion analysis, Deep learning, Image processing, Hyperparameter optimization, CLASSIFIERS, ALGORITHM, FUSION
  • Erciyes University Affiliated: Yes

Abstract

Expression recognition (ER), which is frequently used in human-computer interaction, recognizes expressions from visual data, such as video and static images, or from sensor-based data. Facial expression recognition (FER) is ER based on visual data. Because a video is a sequence of images, recognizing emotion in a video signal can be easier than in a static image, which consists of a single frame. FER on static images is therefore a relatively difficult task. Recently, deep learning methods have achieved increasing success in classification problems, and they are accordingly also used for FER in the literature. Data preparation and hyperparameter optimization can be used to improve the success of deep learning methods. Data preparation makes the features more pronounced, and increasing the number of training samples generally also has a direct effect on the success rate. Tuning the hyperparameters of a deep learning model is another factor that increases its performance. In this study, a classification method comprising data preparation, hyperparameter optimization, and a transfer-learning-aided convolutional neural network is proposed. As part of the study, a new dataset of static images, named ERUFER, was created. The newly introduced ERUFER dataset and the popular public JAFFE dataset were classified with the proposed method. To the best of our knowledge, the proposed method achieves the best result in the literature for the JAFFE dataset under 10-fold cross-validation. For the ERUFER dataset, a success rate of 92.56% is achieved.
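
The 10-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_k_fold` and `cross_validate`, the `evaluate` callback, and the choice to stratify folds by class are all assumptions made for illustration, and the labels in the usage note are synthetic.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for k class-balanced folds.

    Hypothetical helper: the abstract only states that 10-fold
    cross-validation was used, not how the folds were built.
    """
    rng = random.Random(seed)

    # Group sample indices by expression label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled samples across the folds round-robin,
    # continuing from where the previous class stopped so that fold
    # sizes stay within one sample of each other.
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test


def cross_validate(labels, evaluate, k=10, seed=0):
    """Return the mean score over k folds.

    `evaluate(train, test)` is an assumed callback that trains a model
    on the `train` indices and returns its accuracy on the `test` indices.
    """
    scores = [
        evaluate(train, test)
        for train, test in stratified_k_fold(labels, k, seed)
    ]
    return sum(scores) / len(scores)
```

In use, `labels` would be the per-image expression classes and `evaluate` would wrap model training and testing; for example, `cross_validate([i % 7 for i in range(210)], my_evaluate)` averages the score of a hypothetical `my_evaluate` over ten folds of synthetic seven-class labels. Reporting the mean over all folds uses every image for testing exactly once, which matters for small datasets.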