Pattern Recognition Letters, vol. 181, pp. 77–85, 2024 (SCI-Expanded)
Discretization is an important pre-processing step for supervised learning. Discretizing attributes simplifies the data and makes it easier to understand and analyze by reducing the number of distinct values. It can provide a better representation of knowledge and thus help improve the accuracy of a classifier. However, to minimize information loss, it is important to consider the characteristics of the data. Most approaches assume that the values of a continuous attribute are monotone with respect to the probability of belonging to a particular class; in other words, that increasing or decreasing the value of the attribute leads to a proportional increase or decrease in the classification score. This assumption is not valid for every attribute. In this study, we present entropy-based, flexible discretization strategies capable of capturing the non-monotonicity of attribute values. The algorithm adjusts the number of cut points and their values according to the characteristics of the data, and it does not require setting any hyper-parameters or thresholds. Extensive experiments on diverse datasets show that the proposed discretizers significantly improve the performance of classifiers, especially on complex and high-dimensional datasets.
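The abstract does not spell out the algorithm, so the following is only a minimal sketch of generic entropy-based discretization via recursive binary splitting (in the style of Fayyad and Irani), not the authors' method. All names (`entropy`, `best_cut`, `discretize`) are illustrative, and the `min_gain` threshold is a stand-in for the parameter-free stopping criterion the paper claims. Because each sub-interval is split independently, the resulting bins can follow non-monotone class behavior, as the usage example at the end shows.

```python
# Sketch of entropy-based discretization by recursive binary splitting.
# NOT the paper's algorithm; names and the stopping rule are assumptions.
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_cut(x, y):
    """Find the cut point maximizing information gain; return (cut, gain)."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    base = entropy(y)
    best_gain, best_c = 0.0, None
    # Candidate cuts are midpoints between consecutive distinct values.
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue
        w = i / len(x)
        gain = base - (w * entropy(y[:i]) + (1 - w) * entropy(y[i:]))
        if gain > best_gain:
            best_gain, best_c = gain, (x[i] + x[i - 1]) / 2
    return best_c, best_gain

def discretize(x, y, min_gain=0.01, cuts=None):
    """Recursively split intervals until the gain drops below min_gain.

    min_gain is a placeholder; the paper's method reportedly needs no
    user-set threshold.
    """
    if cuts is None:
        cuts = []
    c, gain = best_cut(x, y)
    if c is None or gain < min_gain:
        return sorted(cuts)
    cuts.append(c)
    left = x < c
    discretize(x[left], y[left], min_gain, cuts)
    discretize(x[~left], y[~left], min_gain, cuts)
    return sorted(cuts)

# Usage: a non-monotone attribute where class 1 occupies the middle range,
# so no single cut (and no monotone assumption) separates the classes.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = ((x > 3) & (x < 7)).astype(int)
print(discretize(x, y))  # expect cut points near 3 and 7
```

Recursing on each sub-interval independently is what lets the cut points adapt to local class structure: on the example above, a monotone single-threshold discretizer would find no informative cut, while the recursive splitter recovers both boundaries of the middle region.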