Citation: PENG Jingtong, ZHU Yongxin, WANG Hui, KONG Xiangcong, ZHANG Qinrun, GUO Zhentang. GRU network for flight-data anomaly detection based on FPGA[J]. Microelectronics & Computer, 2021, 38(11): 67-73. DOI: 10.19304/J.ISSN1000-7180.2021.0103

GRU network for flight-data anomaly detection based on FPGA

  • Abstract: Anomaly detection on flight data from large commercial aircraft is difficult because it must run in real time and cover a large number of monitoring points, and traditional time-series processing software takes too long. This paper proposes a gated recurrent unit (GRU) anomaly-detection neural network implemented on a field-programmable gate array (FPGA), which performs time-series anomaly detection on flight vibration data. To meet the real-time processing requirements of high-frequency sampled data, the GRU implementation is accelerated through three parallel optimizations. First, a structured parallel optimization method is proposed: the weight parameters are stored in FPGA on-chip memory and the weight arrays are partitioned along one dimension so that the weights can be read in parallel across columns; combined with a parallelized matrix-vector multiplication, this lets the GRU network be computed efficiently. Second, the computation of the GRU activation functions is optimized: on-chip block RAM (BRAM) serves as a lookup table accessed in a pipelined manner, which greatly reduces both the latency of the activation functions and their consumption of computing resources. Third, the GRU data path is adjusted: reordering the computation removes the dependency between two groups of matrix-vector multiplications and reduces the critical delay by 40%. Experimental results show that the proposed FPGA hardware accelerator for GRU flight anomaly detection achieves a throughput of 10.33 GFLOPS at a power consumption of 2.532 W, giving a high energy-efficiency ratio.
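The three optimizations named in the abstract can be illustrated with a small software model. The sketch below is not the authors' implementation: the table depth, the input range, the cyclic column split, the bank count, and all function and parameter names are hypothetical. The abstract does not say which reordering removes the dependency between the two groups of matrix-vector multiplications, so the sketch assumes the common "reset after" GRU form, in which the reset gate is applied after the hidden-state product rather than before it. Biases are omitted for brevity.

```python
import math

LUT_SIZE = 1024   # hypothetical table depth (one BRAM-sized table per function)
LUT_RANGE = 8.0   # hypothetical input range; inputs outside [-8, 8) saturate


def build_lut(fn, size=LUT_SIZE, rng=LUT_RANGE):
    """Precompute fn at the midpoints of evenly spaced bins."""
    step = 2 * rng / size
    return [fn(-rng + (i + 0.5) * step) for i in range(size)]


SIGMOID_LUT = build_lut(lambda v: 1.0 / (1.0 + math.exp(-v)))
TANH_LUT = build_lut(math.tanh)


def lut_lookup(table, v, rng=LUT_RANGE):
    """Replace an activation evaluation by one table read."""
    idx = int((v + rng) / (2 * rng) * len(table))
    return table[min(max(idx, 0), len(table) - 1)]


def matvec_column_banks(W, x, banks=4):
    """Compute y = W x with the columns of W split cyclically into banks.

    Each bank accumulates its own partial sums and touches only its own
    columns, so in hardware the banks could be read and multiplied
    concurrently. Here they simply run one after another.
    """
    rows, cols = len(W), len(x)
    partial = [[0.0] * rows for _ in range(banks)]
    for b in range(banks):
        for j in range(b, cols, banks):
            for i in range(rows):
                partial[b][i] += W[i][j] * x[j]
    return [sum(partial[b][i] for b in range(banks)) for i in range(rows)]


def gru_step(x, h, p, banks=4):
    """One GRU step in the assumed 'reset after' form (no biases).

    All six matrix-vector products depend only on x and h, so none of
    them has to wait for the reset gate r.
    """
    zx = matvec_column_banks(p["Wz"], x, banks)
    zh = matvec_column_banks(p["Uz"], h, banks)
    rx = matvec_column_banks(p["Wr"], x, banks)
    rh = matvec_column_banks(p["Ur"], h, banks)
    nx = matvec_column_banks(p["Wn"], x, banks)
    nh = matvec_column_banks(p["Un"], h, banks)

    z = [lut_lookup(SIGMOID_LUT, a + b) for a, b in zip(zx, zh)]
    r = [lut_lookup(SIGMOID_LUT, a + b) for a, b in zip(rx, rh)]
    # The reset gate scales the already computed product Un h.
    n = [lut_lookup(TANH_LUT, a + ri * b) for a, ri, b in zip(nx, r, nh)]
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]
```

In the textbook GRU the candidate state uses a product with the gated vector r ⊙ h, which cannot start until r is known; in the form assumed here that product is taken on h directly, which is one way such a dependency can be removed. The lookup tables trade a small quantization error for the cost of evaluating an exponential, and the error shrinks as the table depth grows.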

     
