ZHAO Tong, LIU Bin, LI Tao. Research on Parallel Text Classification System Based on Non-Balanced LSH[J]. Microelectronics & Computer, 2017, 34(12): 67-73.

Research on Parallel Text Classification System Based on Non-Balanced LSH

  • To address the low efficiency of the K-Nearest Neighbors (KNN) classification algorithm on massive text collections, a hyperplane-based non-balanced locality-sensitive hashing (LSH) classification algorithm is proposed, which improves accuracy and real-time performance more markedly than the traditional locality-sensitive hashing algorithm. To further reduce the execution time of the classification algorithm and improve classification efficiency, an efficient parallel text classification system based on Hadoop is designed, combining the classification algorithm with the Spark parallel computing model. The experimental results show that this text classification system achieves both high classification speed and high classification accuracy.
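
The abstract does not spell out the non-balanced variant, so the following is only a minimal sketch of the generic random-hyperplane LSH idea it builds on, used to shrink the candidate set for a KNN vote. All names (`LshKnn`, `make_hyperplanes`, `signature`), the cosine similarity measure, and the fallback to exact KNN for empty buckets are illustrative assumptions, not the authors' algorithm or system.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (assumption: Gaussian components)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        int(sum(p * x for p, x in zip(plane, vec)) >= 0.0) for plane in planes
    )


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LshKnn:
    """Hypothetical KNN classifier that searches only the query's hash bucket."""

    def __init__(self, dim, num_planes=4, k=3, seed=0):
        self.planes = make_hyperplanes(num_planes, dim, seed)
        self.k = k
        self.buckets = defaultdict(list)
        self.items = []

    def fit(self, vectors, labels):
        for vec, label in zip(vectors, labels):
            self.items.append((vec, label))
            self.buckets[signature(vec, self.planes)].append((vec, label))
        return self

    def predict(self, vec):
        candidates = self.buckets.get(signature(vec, self.planes))
        if not candidates:
            # Assumption: fall back to exact KNN when the bucket is empty.
            candidates = self.items
        nearest = sorted(
            candidates, key=lambda item: cosine(vec, item[0]), reverse=True
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

In a system like the one described, each document would first be turned into a feature vector (for example, term weights), and the bucket lookups could then be distributed across workers. That part is not shown here because the abstract gives no details of the Spark job layout.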