2024 IEEE Global Energy Conference, GEC 2024, Batman, Türkiye, 4-6 December 2024, pp. 158-163 (Full-Text Paper)
This paper analyzes how data pre-processing strategies affect the performance of forecasting models for solar power generation. It investigates the integration of outlier detection, normalization, and missing-data handling techniques with three popular machine learning (ML) models: XGBoost, LightGBM, and Random Forest. The effects of six pre-processing methods on the three models are examined over a total of 81 possible combinations. Two scenarios are considered: one using a dataset without missing values (Scenario 1) and one using a dataset with missing values (Scenario 2). The results show that data pre-processing can significantly affect forecasting accuracy, and they highlight the importance of choosing pre-processing methods suited to the specific dataset and algorithm.
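The abstract does not state how the 81 combinations arise. As a rough sketch only, one decomposition consistent with the stated numbers assumes two methods in each of the three pre-processing categories (six methods in total), plus a "none" option per category, crossed with the three models. The option labels below are hypothetical placeholders, not the methods used in the paper.

```python
from itertools import product

# Hypothetical labels: the abstract names the categories and models,
# but not the individual pre-processing methods.
OUTLIER_OPTIONS = ("none", "outlier_method_a", "outlier_method_b")
NORMALIZATION_OPTIONS = ("none", "normalization_a", "normalization_b")
MISSING_DATA_OPTIONS = ("none", "imputation_a", "imputation_b")
MODELS = ("XGBoost", "LightGBM", "Random Forest")


def build_grid():
    """Enumerate every (outlier, normalization, missing-data, model) combination."""
    return [
        {"outlier": o, "normalization": n, "missing": m, "model": k}
        for o, n, m, k in product(
            OUTLIER_OPTIONS, NORMALIZATION_OPTIONS, MISSING_DATA_OPTIONS, MODELS
        )
    ]


grid = build_grid()
print(len(grid))  # 3 * 3 * 3 * 3 = 81 under the assumption above
```

Under this assumed layout, each entry of `grid` would correspond to one pipeline to train and score; the paper itself may count the combinations differently.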