FPGA-Based Design of a Parallel Accelerator for the Convolution Layer of Convolutional Neural Networks

Abstract: With the rapid development of hardware in recent years, deep learning has once again become an active research area, and Convolutional Neural Networks (CNNs) have shown outstanding performance in many tasks. The convolution layer is the most important component of a CNN and involves a large number of multiply-add operations. To exploit this characteristic, a pipelined, parallel FPGA-based accelerator module is proposed for the convolution layer. The designed circuit can produce one computation result per clock cycle. With the same network structure and dataset, the computational efficiency of the FPGA is nearly 7 times that of the CPU and nearly 5 times that of the GPU, while its power consumption is only 28.87% of the GPU's.
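
To make the multiply-add workload of a convolution layer concrete, the following is a minimal software sketch in Python. It is only a reference model, not the FPGA circuit described above; the function name `conv2d_valid`, the stride of 1, the absence of padding, and the sample data are all illustrative assumptions that do not come from the paper.

```python
def conv2d_valid(image, kernel):
    """Direct 2D convolution as used in CNNs (no kernel flip),
    stride 1, no padding. Hypothetical helper for illustration only.

    Returns the output feature map and the number of
    multiply-add operations performed.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1

    output = [[0] * ow for _ in range(oh)]
    macs = 0
    for r in range(oh):
        for c in range(ow):
            acc = 0
            # Each output value needs kh * kw multiply-adds over one window.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
                    macs += 1
            output[r][c] = acc
    return output, macs


# Assumed sample data: a 4x4 input and a 2x2 kernel.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0],
    [0, 1],
]
out, macs = conv2d_valid(image, kernel)
```

In this sketch every output value costs `kh * kw` sequential multiply-adds. A hardware design that evaluates the multiplications of one window in parallel and pipelines the additions can, in principle, emit one such output value per clock cycle once the pipeline is full.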