WANG B Y,YANG Z J,XIE C,et al. Optimal design of computing resources for CNN convolution layer hardware[J]. Microelectronics & Computer,2024,41(7):89-95. doi: 10.19304/J.ISSN1000-7180.2023.0436

Optimal design of computing resources for CNN convolution layer hardware

  • Traditional dedicated convolutional neural network (CNN) accelerators suffer from low hardware resource utilization when implementing convolution-layer operator reconfiguration, data multiplexing, and computing-resource reuse. This work designs a hardware architecture that combines a dynamic register file with a reconfigurable processing element (PE) array. By optimizing the data flow, it balances the load across the PE units and thereby improves the utilization of computing resources in the convolution layer. The architecture can flexibly deploy odd-sized convolution kernels up to size 11 with strides from 1 to 10, and it supports multi-channel parallel convolution and input data multiplexing. The design is implemented in the Verilog hardware description language, and its function is verified in a UVM environment. Experiments show that, when accelerating the convolutional layers of the AlexNet model, peak computing throughput is 9.5% to 64.3% higher than in related work. When mapping convolution kernels of different sizes and strides from five classical neural networks, the average utilization of the PE units is 4% to 11% higher than in related work.
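To make the kernel-size and stride ranges in the abstract concrete, the following is a minimal Python sketch, not the authors' hardware model. It checks whether a (kernel, stride) pair falls inside the stated range and applies the standard convolution output-size formula. The function names, the `padding` parameter, and the commonly used 227 x 227 AlexNet input size are assumptions introduced here for illustration; none of them come from the paper.

```python
def is_supported(kernel: int, stride: int) -> bool:
    """Check a (kernel, stride) pair against the range stated in the abstract:
    odd kernel sizes up to 11 and strides from 1 to 10.
    (Hypothetical helper; the paper does not define such a function.)"""
    return kernel % 2 == 1 and 1 <= kernel <= 11 and 1 <= stride <= 10


def conv_output_size(in_size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Standard output-size formula for a square convolution:
    floor((in + 2 * padding - kernel) / stride) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv_macs(in_size: int, kernel: int, stride: int, padding: int,
              in_channels: int, out_channels: int) -> int:
    """Multiply-accumulate count for one convolution layer (no grouping)."""
    out = conv_output_size(in_size, kernel, stride, padding)
    return out * out * kernel * kernel * in_channels * out_channels


if __name__ == "__main__":
    # First AlexNet convolution layer, assuming a 227 x 227 x 3 input,
    # 96 filters of size 11 x 11, stride 4, no padding.
    print(is_supported(11, 4))
    print(conv_output_size(227, 11, 4))
    print(conv_macs(227, 11, 4, 0, 3, 96))
```

A large kernel with a large stride (such as 11 with stride 4) and a small kernel with unit stride (such as 3 with stride 1) place very different demands on a fixed PE array, which is the kind of mismatch the reconfigurable array described above is meant to absorb.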