Citation: ZHANG Hong-wei, LI Tao, FENG Zhen-fu, JIA Rui. Design and implementation of high performance SIMT processor for machine learning[J]. Microelectronics & Computer, 2019, 36(9): 79-83.

Design and implementation of high performance SIMT processor for machine learning

Abstract: To address the large-volume computation required by machine learning, a high-performance SIMT (Single Instruction, Multiple Threads) processor was developed in-house. The design uses a special four-stage pipeline and is described in synthesizable Verilog HDL to perform multi-threaded parallel computation on data. A simulation and verification platform was built on a Xilinx Virtex UltraScale xcvu440-flga2892-2-e FPGA to verify the function of the complete circuit, and the results show that the design meets the requirements of the multi-threaded parallel processing mechanism. Synthesis with Synopsys Design Compiler using the SMIC 65 nm CMOS standard cell library gives a maximum system clock frequency of 370 MHz and a maximum system power consumption of 4.251 mW.

     
