Citation: WANG H N, SUN H, DENG C C, et al. High performance FPGA implementation for SM3 algorithm[J]. Microelectronics & Computer, 2023, 40(7): 105-110. doi: 10.19304/J.ISSN1000-7180.2022.0664

High performance FPGA implementation for SM3 algorithm

    Abstract: Existing high-performance implementations of the SM3 algorithm mainly rely on multi-stage pipeline structures and various critical-path optimization strategies to raise throughput. However, multi-stage pipelined designs consume a large amount of hardware resources. This paper first fully exploits the parallelism available to the SM3 algorithm on an FPGA platform. Adding a small number of registers reduces the logic depth of the algorithm's critical path, and executing message expansion in parallel with the compression function achieves a single-core throughput of 2.55 Gbit/s using only 1211 LUTs of logic resources. The resulting throughput per unit of logic resource is 5.40 times higher than that of existing designs, with smaller area, lower power consumption, and higher performance. Finally, a 32-core SM3 hardware design built on this structure achieves higher throughput than an existing 64-stage pipeline structure at lower hardware overhead, and its throughput per unit of logic resource is increased by 2.27 times.
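As a purely illustrative software sketch (not the authors' hardware design, whose details the abstract does not give), the Python below interleaves one message-expansion step with each compression round. It keeps only a 16-word sliding window of W instead of precomputing all 68 words, which is one plausible way to picture message expansion running alongside the compression function. The names `compress_block` and `sm3_hex` are hypothetical; the constants and round functions follow the published SM3 specification.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
      0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E)


def rotl(x, n):
    """Rotate a 32-bit word left by n bits (n is taken modulo 32)."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK


def p0(x):
    return x ^ rotl(x, 9) ^ rotl(x, 17)


def p1(x):
    return x ^ rotl(x, 15) ^ rotl(x, 23)


def compress_block(v, block):
    """Process one 64-byte block; hypothetical helper for illustration."""
    # Sliding window: at round j, w[k] holds W_{j+k} for k = 0..15.
    w = list(struct.unpack(">16I", block))
    a, b, c, d, e, f, g, h = v
    for j in range(64):
        # Compression round j: needs only W_j and W'_j = W_j ^ W_{j+4}.
        t = 0x79CC4519 if j < 16 else 0x7A879D8A
        a12 = rotl(a, 12)
        ss1 = rotl((a12 + e + rotl(t, j)) & MASK, 7)
        ss2 = ss1 ^ a12
        if j < 16:
            ff = a ^ b ^ c
            gg = e ^ f ^ g
        else:
            ff = (a & b) | (a & c) | (b & c)
            gg = (e & f) | (~e & g)
        tt1 = (ff + d + ss2 + (w[0] ^ w[4])) & MASK
        tt2 = (gg + h + ss1 + w[0]) & MASK
        # Expansion step for W_{j+16}; it depends only on the window,
        # not on a..h, so hardware could evaluate it in the same cycle.
        w_next = p1(w[0] ^ w[7] ^ rotl(w[13], 15)) ^ rotl(w[3], 7) ^ w[10]
        w = w[1:] + [w_next]
        a, b, c, d = tt1, a, rotl(b, 9), c
        e, f, g, h = p0(tt2), e, rotl(f, 19), g
    return tuple(x ^ y for x, y in zip(v, (a, b, c, d, e, f, g, h)))


def sm3_hex(data: bytes) -> str:
    """Pad the message, run every block, and return the hex digest."""
    padded = data + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(data) * 8)
    v = IV
    for i in range(0, len(padded), 64):
        v = compress_block(v, padded[i:i + 64])
    return "".join(f"{x:08x}" for x in v)
```

Calling `sm3_hex(b"abc")` should reproduce the digest of the sample message "abc" from the SM3 specification. In this sketch the expansion step and the round logic merely share a loop iteration; the point is that neither depends on the other's result within a round, which is what makes parallel execution possible.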

     
