JIA Rui, LI Tao, FENG Zhen-fu, ZHANG Hong-wei. Design and implementation of scheduling mechanism for high performance machine learning SIMT processor[J]. Microelectronics & Computer, 2019, 36(9): 67-72.

Design and implementation of scheduling mechanism for high performance machine learning SIMT processor

    Abstract: This paper proposes a structurally simple and efficient scheduling mechanism for a high-performance Single Instruction Multiple Threads (SIMT) processor targeting machine learning. The mechanism supports parallel operation of 4 blocks, 8 warps, and 64 threads, and uses a dynamic scheduling method that combines two configurable scheduling modes. The hardware circuit is implemented in synthesizable Verilog HDL, and an FPGA-based verification platform is built to verify the function of the whole circuit. The results show that the scheduling mechanism meets the requirements of the SIMT processor and improves the overall performance of the processor by 82.17%. When synthesized for the Xilinx FPGA chip xcvu440-flga-2892-2-e, the design reaches a maximum clock frequency of 181 MHz.

     
