YAO Li, ZHANG Xi-huang. An Improved Random Forest Algorithm for Multi-Class Imbalanced Data Problem Under Map Reduce[J]. Microelectronics & Computer, 2018, 35(11): 139-144.

An Improved Random Forest Algorithm for Multi-Class Imbalanced Data Problem Under Map Reduce

  • When handling multi-class imbalanced data, the traditional random forest algorithm under MapReduce still takes the globally optimal point as the split point, ignoring the influence of minority classes on classification accuracy. To address this, this paper presents an improved random forest algorithm (MR-RF-SHDSE) for multi-class imbalanced data under MapReduce. The algorithm uses stratified sampling to draw samples from each class and uses the HDDT decision tree as the base learner, weakening the impact of data skew on classification accuracy. Finally, the G-mean value and the disagreement measure of each decision tree are calculated, and the harmonic mean of the two is used as the metric for selecting decision trees. Experiments show that, compared with other algorithms, MR-RF-SHDSE effectively improves classification accuracy on multi-class imbalanced data sets.
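The tree-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: G-mean is taken as the geometric mean of per-class recall, the disagreement measure as the fraction of samples on which exactly one of two trees is correct, and a tree's score as the harmonic mean of its G-mean and its average disagreement with the other trees. The function names (`g_mean`, `disagreement`, `harmonic_mean`, `tree_score`) are hypothetical and are not taken from the paper.

```python
from math import prod


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed multi-class G-mean)."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return prod(recalls) ** (1.0 / len(recalls))


def disagreement(y_true, pred_a, pred_b):
    """Fraction of samples on which exactly one of the two trees is correct."""
    diff = sum((a == y) != (b == y) for y, a, b in zip(y_true, pred_a, pred_b))
    return diff / len(y_true)


def harmonic_mean(a, b):
    """Harmonic mean of two non-negative values; 0 when both are 0."""
    return 0.0 if a + b == 0 else 2 * a * b / (a + b)


def tree_score(y_true, preds, k):
    """Assumed selection score for tree k: harmonic mean of its G-mean and
    its average pairwise disagreement with every other tree in preds."""
    g = g_mean(y_true, preds[k])
    others = [p for j, p in enumerate(preds) if j != k]
    d = sum(disagreement(y_true, preds[k], p) for p in others) / len(others)
    return harmonic_mean(g, d)
```

Under these assumptions, a tree scores highly only if it is both accurate on every class (high G-mean) and different from its peers (high disagreement), since the harmonic mean is pulled toward the smaller of the two values. Trees could then be ranked by `tree_score` and the top-scoring ones kept.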
