Abstract:
The single-bit data width of Binary Neural Networks (BNNs) mitigates the large data volumes and heavy computation of Convolutional Neural Networks (CNNs). To further accelerate BNN forward inference and reduce power consumption, an FPGA-based fully binarized neural network accelerator is proposed, in which both the input image and the edge padding are binarized. The accelerator skips redundant calculations by reusing the Row Convolution LUT (RC-LUT) in a time-shared manner. Implemented on Xilinx's ZCU102 FPGA, the accelerator achieves a throughput of more than 3.1 TOP/s, an area efficiency of 144.2 GOPS/KLUT, and a power efficiency of 3507.8 GOPS/W.
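The single-bit arithmetic that makes such accelerators efficient can be illustrated with a small sketch: in a BNN, weights and activations are constrained to {-1, +1} and encoded as bits, so a dot product reduces to an XNOR followed by a popcount. This is a generic illustration of BNN arithmetic, not the paper's RC-LUT design; all names and values below are hypothetical.

```python
# Minimal sketch of XNOR-popcount arithmetic underlying BNN inference
# (illustrative only; names, shapes, and values are not from the paper).
# Values in {-1, +1} are encoded as bits {0, 1}; a dot product then
# becomes XNOR followed by popcount.

def binarize(values):
    """Map real values to bits: >= 0 -> 1 (represents +1), < 0 -> 0 (represents -1)."""
    return [1 if v >= 0 else 0 for v in values]

def xnor_popcount_dot(a_bits, w_bits):
    """Dot product of two {-1, +1} vectors via XNOR + popcount.

    Each XNOR match contributes +1 and each mismatch -1, so the dot
    product equals 2 * (number of matches) - n.
    """
    n = len(a_bits)
    matches = sum(1 for a, w in zip(a_bits, w_bits) if a == w)  # popcount of XNOR
    return 2 * matches - n

# Example: equals the {-1,+1} dot product [+1,-1,+1,-1] . [+1,-1,-1,-1] = 2
acts = binarize([0.7, -1.2, 0.3, -0.5])
wts = binarize([1.0, -0.4, -0.9, -0.1])
print(xnor_popcount_dot(acts, wts))  # -> 2
```

On FPGA hardware this XNOR-popcount pattern maps naturally onto LUT resources, which is the kind of primitive a row-convolution LUT scheme can reuse across overlapping convolution windows.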