A Comparative Study of Cost-Sensitive Learning Algorithm Based on Imbalanced Data Sets
-
Abstract
Most studies of imbalanced data set classification have focused on re-sampling or cost-sensitive learning in isolation, neglecting the fact that imbalanced class distributions and unequal misclassification costs always occur simultaneously. Based on an analysis of the theory and algorithms of cost-sensitive learning, we propose a novel hybrid re-sampling technique based on Automated Adaptive Selection of the Number of Nearest Neighbors to address the misclassification problem of imbalanced data sets. We compare the hybrid re-sampling algorithm with the MetaCost algorithm. The experimental results show that the proposed method effectively improves classification accuracy and reduces misclassification cost, and confirm that it outperforms traditional algorithms on the imbalanced data problem.
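To illustrate why classification accuracy and misclassification cost need to be considered together on imbalanced data, the following is a minimal sketch and not part of the proposed method. The cost matrix `COST`, the toy labels, and the helper names `accuracy` and `total_cost` are all assumptions made for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def total_cost(y_true, y_pred, cost):
    """Sum of cost[true][predicted] over all examples."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


# Hypothetical cost matrix indexed as cost[true][predicted],
# with 0 = majority class and 1 = minority class.
# Assumption: missing a minority example costs 10 times a false alarm.
COST = [[0, 1],
        [10, 0]]

# Toy data (assumed): 8 majority examples and 2 minority examples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_majority = [0] * 10                   # never predicts the minority class
cost_aware = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 3 false alarms, no misses

for name, y_pred in [("always_majority", always_majority),
                     ("cost_aware", cost_aware)]:
    print(name, accuracy(y_true, y_pred), total_cost(y_true, y_pred, COST))
```

In this toy setting the classifier that ignores the minority class has the higher accuracy but the higher total cost, which is why a comparison of methods on imbalanced data should report both measures.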