JIANG Hua, HAN Fei, WANG Xin, WANG Hui-jiao. Big Data Classification Algorithm Based on MapReduce to Improve K-NN[J]. Microelectronics & Computer, 2018, 35(10): 36-40, 45.

Big Data Classification Algorithm Based on MapReduce to Improve K-NN

  • To address the shortcomings of the traditional k-nearest neighbor (K-NN) classification algorithm, such as its heavy computational cost and its low efficiency on massive high-dimensional data sets, this paper redesigns the Map and Reduce functions on the Hadoop platform using the MapReduce distributed programming model, combining principal component analysis with distance weighting of the data in the critical region. First, principal component analysis is applied to the high-dimensional data to reduce its dimensionality and thereby improve operational efficiency. Second, in the prediction stage of classification, the concepts of a complete region and a critical region are introduced, and in the critical region the k nearest values, drawn from n classes, are weighted by distance. Finally, the algorithm is run in a Hadoop cluster environment to further improve its efficiency on massive data. The experimental results show that the algorithm greatly improves both computational efficiency and accuracy when processing massive data.
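
The abstract gives no formulas, so the following is only a minimal single-process sketch of how a MapReduce-style K-NN with distance weighting might look, not the authors' implementation. Several points are assumptions: the function names `map_local_knn`, `reduce_classify`, and `classify` are hypothetical; the "complete region" is read here as the case where all k neighbors share one class, and the "critical region" as the case where they disagree; inverse-distance weights are one common choice and may differ from the paper's scheme. The feature vectors are assumed to have already been reduced by principal component analysis, and plain Python functions stand in for the Hadoop Map and Reduce functions.

```python
import heapq
import math
from collections import defaultdict


def map_local_knn(split, query, k):
    """Map step (sketch): return the k nearest (distance, label) pairs in one data split."""
    scored = ((math.dist(features, query), label) for features, label in split)
    return heapq.nsmallest(k, scored)


def reduce_classify(candidate_lists, k, eps=1e-9):
    """Reduce step (sketch): merge per-split candidates into a global top-k and vote."""
    merged = heapq.nsmallest(k, (c for cands in candidate_lists for c in cands))
    labels = {label for _, label in merged}
    if len(labels) == 1:
        # Assumed "complete region": every neighbor agrees, so no weighting is needed.
        return merged[0][1]
    # Assumed "critical region": neighbors disagree, so weight each vote by 1/distance.
    weights = defaultdict(float)
    for dist, label in merged:
        weights[label] += 1.0 / (dist + eps)  # eps guards against a zero distance
    return max(weights, key=weights.get)


def classify(splits, query, k):
    """Run the map step on every split, then a single reduce step."""
    return reduce_classify([map_local_knn(s, query, k) for s in splits], k)
```

In this sketch each split sends at most k candidates to the reduce step, so the amount of data shuffled grows with the number of splits rather than with the size of the training set. The weighted vote matters only when the neighbors disagree: a single very close neighbor can then outweigh two more distant neighbors of another class.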