Citation: XU Qiang, CHEN Jie, LIU Jian, WANG Yun, HU Zhe-kun. The Design of Multiported Texture Cache for Embedded Graphics Processing Unit[J]. Microelectronics & Computer, 2013, 30(11): 27-30,34.

The Design of Multiported Texture Cache for Embedded Graphics Processing Unit

  • Abstract: To improve the efficiency of the texture unit in an embedded graphics processing unit (GPU), a multiported texture cache architecture is proposed. The architecture combines tile-based rasterization with a block-interleaved texture memory organization, which exploits the locality of texel accesses and raises the cache hit rate. It also uses cache prefetching to hide memory access latency. To further increase data throughput, the cache provides four read ports, so four texels can be read in parallel. Simulation results show that the proposed cache achieves a hit rate of about 92%, reaches 90% of the memory access performance of a zero-latency memory system, and delivers 3 to 4 times the data throughput of a single-ported cache.
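The abstract does not give the address mapping itself, so the following is only a minimal sketch of how a blocked texture layout and a four-bank split might work. The 4x4 block size, the function names, and the parity-based bank selection are all assumptions made for illustration, not details taken from the paper. The sketch shows that storing texels block by block keeps spatial neighbours close in memory, and that choosing a bank from the low bits of the texel coordinates places the four texels of any 2x2 neighbourhood in four different banks, which is one common way to serve four reads in parallel.

```python
# Illustrative sketch only: block size and bank mapping are assumptions,
# not values reported in the paper.
BLOCK_W = 4  # assumed block width in texels
BLOCK_H = 4  # assumed block height in texels


def blocked_address(x, y, tex_width):
    """Map texel (x, y) to a linear index when texels are stored block by block.

    tex_width is assumed to be a multiple of BLOCK_W.
    """
    blocks_per_row = tex_width // BLOCK_W
    block_index = (y // BLOCK_H) * blocks_per_row + (x // BLOCK_W)
    offset_in_block = (y % BLOCK_H) * BLOCK_W + (x % BLOCK_W)
    return block_index * (BLOCK_W * BLOCK_H) + offset_in_block


def bank_of(x, y):
    """Assumed bank selection: the low bit of x and of y pick one of 4 banks."""
    return (y & 1) * 2 + (x & 1)


def footprint_2x2(x, y):
    """Return the four texel coordinates of a 2x2 neighbourhood anchored at (x, y)."""
    return [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]


if __name__ == "__main__":
    # Texels of one 4x4 block occupy 16 consecutive indices.
    print([blocked_address(x, 0, 16) for x in range(6)])
    # Each texel of a 2x2 neighbourhood falls in a different bank.
    print(sorted(bank_of(x, y) for x, y in footprint_2x2(5, 7)))
```

With this assumed mapping, any 2x2 neighbourhood touches each of the four banks exactly once, so the four reads never collide on a bank regardless of where the neighbourhood starts.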

     
