WANG Yang, TAO Hua-min, XIAO Shan-zhu, DAI Hua-dong. Hardware Acceleration Technology of Matrix Multiplier Based on Systolic Array[J]. Microelectronics & Computer, 2015, 32(11): 120-124.

Hardware Acceleration Technology of Matrix Multiplier Based on Systolic Array

  • Abstract: Kalman filtering involves a large number of matrix multiplications. After comparing several hardware acceleration schemes, this paper designs a parallel floating-point matrix multiplier based on a systolic array and built from nine custom-designed processing elements (PEs). The multiplier reaches a peak performance of 761.96 MFLOPS and improves the real-time performance of the algorithm under a fixed resource budget. Combined with block matrix multiplication, the multiplier can also handle matrices of higher dimension, so the design scales well.
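The abstract does not describe the PE design or the dataflow, so the following is only a minimal software sketch of the general idea, not the authors' hardware. It assumes an output-stationary 3×3 array (nine PEs, each holding one accumulator), with rows of A entering from the left and columns of B entering from the top, each skewed by one cycle per row or column. The function names `systolic_matmul` and `blocked_matmul` are hypothetical, and the tile size of 3 is an assumption chosen to match the nine PEs.

```python
def systolic_matmul(A, B):
    """Multiply two n x n matrices on a simulated n x n output-stationary
    systolic array (an assumed dataflow, not taken from the paper)."""
    n = len(A)
    acc = [[0.0] * n for _ in range(n)]    # one accumulator per PE
    a_reg = [[0.0] * n for _ in range(n)]  # operand travelling left to right
    b_reg = [[0.0] * n for _ in range(n)]  # operand travelling top to bottom

    # PE(i, j) sees A[i][k] and B[k][j] at cycle t = i + j + k,
    # so the last useful cycle is 3n - 3.
    for t in range(3 * n - 2):
        # Visit PEs from bottom-right to top-left so that each PE reads the
        # value its left/upper neighbour latched in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(n)):
                if j == 0:
                    k = t - i  # row i of A is delayed by i cycles
                    a_in = A[i][k] if 0 <= k < n else 0.0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    k = t - j  # column j of B is delayed by j cycles
                    b_in = B[k][j] if 0 <= k < n else 0.0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in  # multiply-accumulate inside the PE
                a_reg[i][j] = a_in        # pass A operand to the right
                b_reg[i][j] = b_in        # pass B operand downwards
    return acc


def blocked_matmul(A, B, tile=3):
    """Multiply N x N matrices (N a multiple of `tile`) by splitting them
    into tile x tile blocks and reusing the small systolic multiplier."""
    N = len(A)
    assert N % tile == 0, "this sketch does not pad partial tiles"
    C = [[0.0] * N for _ in range(N)]
    for bi in range(0, N, tile):
        for bj in range(0, N, tile):
            for bk in range(0, N, tile):
                a_blk = [row[bk:bk + tile] for row in A[bi:bi + tile]]
                b_blk = [row[bj:bj + tile] for row in B[bk:bk + tile]]
                partial = systolic_matmul(a_blk, b_blk)
                # Accumulate the block product into the output tile.
                for i in range(tile):
                    for j in range(tile):
                        C[bi + i][bj + j] += partial[i][j]
    return C
```

In this sketch a 6×6 product is computed as eight 3×3 block products, which illustrates how a fixed nine-PE array could be reused for larger matrices. A real design would also need to handle matrix sizes that are not multiples of the tile size, for example by zero padding.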

     
