Research on Neural Network Pruning and Approximate Computing Based on Neuron Fault-Tolerance Analysis

Abstract: This paper combines neuron pruning with approximate computing. First, we propose a method based on statistical ranking to quantify the fault tolerance of neurons. Then, to identify which neurons can be pruned, we propose a neuron importance ranking algorithm based on that fault tolerance. Next, we introduce lightweight retraining and propose a cyclic pruning method to search for the optimal pruning rate. Finally, guided by each neuron's fault tolerance, approximate computing is applied while the neural network runs to further reduce power consumption. Two experiments demonstrate the effectiveness of the technique. On the MNIST dataset, for example, the compression rate reaches 50% and the energy saving is 1.35× with an accuracy loss below 5%.
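The ranking and cyclic pruning steps summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the fault-tolerance metric, so the sketch assumes a stuck-at-zero fault as a stand-in (the mean change in network output when one neuron is forced to zero), and it omits the lightweight retraining and the approximate computing stage. All names here (`forward`, `sensitivity`, `cyclic_prune`, `max_loss`) are hypothetical.

```python
def forward(x, w_hidden, w_out, mask):
    """One hidden ReLU layer with a scalar output; mask[j] == 0 prunes neuron j."""
    total = 0.0
    for w, wo, m in zip(w_hidden, w_out, mask):
        if m:
            h = max(0.0, sum(wi * xi for wi, xi in zip(w, x)))
            total += wo * h
    return total


def mean_abs_error(inputs, targets, w_hidden, w_out, mask):
    """Mean absolute deviation of the masked network from the reference targets."""
    errs = [abs(forward(x, w_hidden, w_out, mask) - t)
            for x, t in zip(inputs, targets)]
    return sum(errs) / len(errs)


def sensitivity(j, inputs, w_hidden, w_out, mask):
    """Assumed fault-tolerance proxy: mean output change when neuron j is stuck
    at zero. A low value means the neuron tolerates faults well."""
    faulty = list(mask)
    faulty[j] = 0
    diffs = [abs(forward(x, w_hidden, w_out, mask)
                 - forward(x, w_hidden, w_out, faulty)) for x in inputs]
    return sum(diffs) / len(diffs)


def cyclic_prune(inputs, targets, w_hidden, w_out, max_loss):
    """Repeatedly prune the most fault-tolerant neuron until the error budget
    max_loss would be exceeded. Retraining between rounds is omitted here."""
    mask = [1] * len(w_hidden)
    base = mean_abs_error(inputs, targets, w_hidden, w_out, mask)
    while True:
        active = [j for j, m in enumerate(mask) if m]
        if len(active) <= 1:
            break
        # Re-rank the remaining neurons each round, least sensitive first.
        j = min(active, key=lambda k: sensitivity(k, inputs, w_hidden, w_out, mask))
        trial = list(mask)
        trial[j] = 0
        loss = mean_abs_error(inputs, targets, w_hidden, w_out, trial) - base
        if loss > max_loss:
            break
        mask = trial
    return mask


# Toy data: neuron 1 has a near-zero output weight, so it should be pruned.
w_hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_out = [1.0, 0.001, 0.5]
inputs = [[1.0, 2.0], [2.0, 1.0], [0.5, 0.5], [3.0, 0.0]]
targets = [forward(x, w_hidden, w_out, [1, 1, 1]) for x in inputs]
pruned_mask = cyclic_prune(inputs, targets, w_hidden, w_out, max_loss=0.01)
# pruned_mask -> [1, 0, 1]
```

In this toy case only the neuron with the negligible output weight is removed; pruning either of the other two would exceed the error budget, so the loop stops. In a fuller version, a retraining step after each accepted pruning round could recover accuracy and allow further rounds.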