Advanced hybrid machine learning methods for predicting rainfall time series: the situation at the Kütahya station in Türkiye


İLKENTAPAR M., ÇITAKOĞLU H., Talebi H., Akturk G., Spor P., Caglar Y., ...

MODELING EARTH SYSTEMS AND ENVIRONMENT, vol. 11, no. 5, 2025 (ESCI, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 11 Issue: 5
  • Publication Date: 2025
  • DOI: 10.1007/s40808-025-02539-0
  • Journal Name: MODELING EARTH SYSTEMS AND ENVIRONMENT
  • Indexes Covering the Journal: Emerging Sources Citation Index (ESCI), Scopus, Agricultural & Environmental Science Database, Geobase
  • Keywords: Rainfall, Machine learning methods, Pre-processing technique, Türkiye
  • Affiliated with Erciyes University: Yes

Abstract

Long-term variations in rainfall patterns, known as rainfall variability, have increasingly impacted ecological and socioeconomic systems, particularly in regions with high sensitivity. Consequently, accurate forecasting of rainfall at both short- and long-term time scales is essential, necessitating a comprehensive analysis of historical rainfall time series data collected from meteorological stations. In this study, Kütahya Province was selected as the study area, utilizing monthly rainfall data from its sole meteorological station spanning the period from 1960 to 2023. The dataset was partitioned into a training set (January 1960-March 2008) and a test set (April 2008-December 2023). Lagged rainfall values at t-1, t-2, and t-3 were used as input variables to predict rainfall at time t. The primary objective of this research is to assess the effectiveness of various preprocessing techniques in developing hybrid machine learning models for rainfall prediction. Gaussian Process Regression (GPR), Support Vector Machines, and Adaptive Neuro-Fuzzy Inference System were employed as machine learning methods. Furthermore, multiple signal decomposition techniques, including Complete Ensemble Empirical Mode Decomposition (CEEMD), Tunable Q-Factor Wavelet Transform, Empirical Mode Decomposition, Robust Empirical Mode Decomposition, Variational Mode Decomposition, Empirical Wavelet Transform, and Ensemble Empirical Mode Decomposition (EEMD), were utilized as preprocessing steps to enhance model performance. The predictive performance of the developed hybrid models was evaluated using various statistical measures. Among the evaluated models, the CEEMD-GPR hybrid model exhibited the best prediction performance, with a Coefficient of Determination (R² = 0.998) and Nash-Sutcliffe Efficiency (NSE = 0.998) close to 1, and a Mean Absolute Error (MAE = 1.42) and Root Mean Squared Error (RMSE = 1.79) close to zero.
These findings indicate that CEEMD demonstrated superior decomposition efficiency compared to the other six decomposition techniques. Additionally, the Kruskal-Wallis test conducted during the analysis phase yielded p > 0.05, indicating no statistically significant difference between the distributions of the observed and predicted rainfall data. Consequently, the effectiveness and reliability of the proposed hybrid models for rainfall prediction were validated.
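
The input construction, chronological split, and evaluation measures described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`make_lagged`, `chronological_split`, `evaluate`) are hypothetical, the series is a synthetic placeholder rather than the Kütahya record, and the predictor is a simple persistence baseline standing in for the hybrid models. Two assumptions are made because the abstract does not state them: the first three months are dropped since they lack complete lagged inputs, and R² is taken as the squared Pearson correlation, as is common in hydrology.

```python
import math


def make_lagged(series, n_lags=3):
    """Build samples [x[t-1], x[t-2], ..., x[t-n_lags]] -> x[t]."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


def chronological_split(inputs, targets, n_train):
    """Split in time order (no shuffling): first n_train samples train, rest test."""
    return (inputs[:n_train], targets[:n_train]), (inputs[n_train:], targets[n_train:])


def evaluate(obs, pred):
    """Return MAE, RMSE, NSE and R2 (assumed: squared Pearson correlation)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    errors = [p - o for o, p in zip(obs, pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)      # spread of observations
    ss_pred = sum((p - mean_pred) ** 2 for p in pred)   # spread of predictions
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "NSE": 1.0 - ss_res / ss_obs,
        "R2": cov ** 2 / (ss_obs * ss_pred),
    }


# Synthetic placeholder: 768 monthly values (January 1960 to December 2023).
n_months = 64 * 12
series = [50.0 + 30.0 * math.sin(2.0 * math.pi * m / 12.0) for m in range(n_months)]

inputs, targets = make_lagged(series, n_lags=3)

# January 1960 to March 2008 is 579 months; the first 3 have no full lag set
# (assumption), leaving 576 training targets and 189 test targets.
n_train = 579 - 3
(train_x, train_y), (test_x, test_y) = chronological_split(inputs, targets, n_train)

# Persistence baseline: predict x[t] with x[t-1]. A stand-in, not a hybrid model.
baseline_pred = [row[0] for row in test_x]
scores = evaluate(test_y, baseline_pred)
print(scores)
```

A trained model's test-period predictions could be passed to `evaluate` in place of `baseline_pred`. NSE and R² coincide only when predictions are unbiased and well scaled, which is why both are worth reporting.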