XU Ming-jie, YU Cheng-jian, SHEN Hang. Research on K-means Algorithm of Spark Parallelization[J]. Microelectronics & Computer, 2018, 35(5): 95-99.

Research on K-means Algorithm of Spark Parallelization

  • To address the memory shortage caused by the growing number of iterative computations when the K-means algorithm processes massive data, this paper proposes a Spark-parallelized K-means algorithm. The algorithm uses particle swarm optimization (PSO) to improve the global search ability of K-means and obtain the initial cluster centers. Exploiting Spark's iterative computing capability, the K-means algorithm is combined with the Spark parallel framework to increase the processing speed of the model and reduce the overall running time of the algorithm. Experiments on disease detection data show that the Spark-parallelized PSOK-means algorithm greatly improves both the efficiency and the accuracy of the algorithm, making it well suited to clustering massive data.
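The abstract's core idea, seeding K-means with centers found by PSO, can be illustrated with a minimal single-machine sketch in plain Python. This is not the authors' implementation and does not use Spark. The function names, the PSO parameters (`n_particles`, `w`, `c1`, `c2`), the sum-of-squared-errors fitness, and the synthetic two-cluster data are all assumptions made for illustration. The comments mark the assignment and update steps, which are the parts that would plausibly be expressed as distributed map and reduce operations in a parallel version.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def pso_init(points, k, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k initial centers with a basic PSO (assumed parameter values)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is one candidate set of k centers, started from sampled points.
    particles = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    best_pos = [[c[:] for c in p] for p in particles]
    best_cost = [sse(points, p) for p in particles]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = [c[:] for c in best_pos[g]], best_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + personal pull + global pull.
                    velocities[i][j][d] = (
                        w * velocities[i][j][d]
                        + c1 * r1 * (best_pos[i][j][d] - particles[i][j][d])
                        + c2 * r2 * (g_pos[j][d] - particles[i][j][d])
                    )
                    particles[i][j][d] += velocities[i][j][d]
            cost = sse(points, particles[i])
            if cost < best_cost[i]:
                best_cost[i] = cost
                best_pos[i] = [c[:] for c in particles[i]]
                if cost < g_cost:
                    g_cost, g_pos = cost, [c[:] for c in particles[i]]
    return g_pos

def kmeans(points, centers, n_iter=20):
    """Lloyd's K-means iterations starting from the given centers."""
    centers = [c[:] for c in centers]
    for _ in range(n_iter):
        # Assignment step (a per-point "map"): find the nearest center.
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            groups[idx].append(p)
        # Update step (a per-cluster "reduce"): average the assigned points.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Synthetic demo data (an assumption): two blobs around (0, 0) and (10, 10).
data_rng = random.Random(1)
points = [[data_rng.gauss(m, 0.5), data_rng.gauss(m, 0.5)]
          for m in (0.0, 10.0) for _ in range(50)]
init = pso_init(points, k=2)
final = kmeans(points, init)
```

In this sketch PSO only supplies the starting centers; the refinement is ordinary K-means, so the final clustering error is never worse than that of the PSO seed.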
