Citation: DU Yanpei, HE Zhanzhuang, LIU Bin, WU Weijun. An evolutionary hardware approach based on reinforcement learning and block parallelism[J]. Microelectronics & Computer, 2022, 39(7): 79-85. DOI: 10.19304/J.ISSN1000-7180.2021.1114

An evolutionary hardware approach based on reinforcement learning and block parallelism

Abstract: Evolvable hardware (EHW) combines programmable logic devices with evolutionary algorithms and can autonomously and dynamically adjust its own circuit structure to suit different evolution targets. Because EHW is self-evolving and the genetic algorithm that drives it is parameter-sensitive, existing EHW methods adapt poorly to different evolution targets. The genetic algorithm is also prone to premature convergence, so for large targets the late stage of evolution often fails to reach the target truth table and the success rate is low. This paper extends the traditional EHW method into one based on reinforcement learning and block parallelism, with evolution carried out in three stages. The first stage uses the reinforcement-learning-based RLGA algorithm to learn its own parameters, which improves adaptability. The second stage evolves for a set number of generations using the parameters learned in the first stage. The third stage applies block-parallel evolution to strengthen late-stage evolution and thereby raise the overall success rate. The traditional method and the three-stage method were simulated and compared in C. The results show that, for large truth-table targets, the three-stage method reduces the number of generations required and raises the evolution success rate to above 95%.
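The abstract does not say how the truth table is partitioned in the third stage, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that fitness is the number of truth-table rows a candidate circuit reproduces, and that a block is a contiguous range of rows selected by the high-order input bits, so each block is a smaller truth table that could be evolved independently. The names `truth_table_fitness`, `split_into_blocks`, and `combine_blocks` are hypothetical, and the paper's own simulation was written in C rather than Python.

```python
from typing import Callable, List, Sequence

Circuit = Callable[[Sequence[int]], int]


def truth_table_fitness(candidate: Circuit, target: Sequence[int], n_inputs: int) -> int:
    """Count the input rows on which the candidate matches the target table."""
    matches = 0
    for row in range(2 ** n_inputs):
        # bits[0] is the least significant input bit of the row index.
        bits = [(row >> i) & 1 for i in range(n_inputs)]
        if candidate(bits) == target[row]:
            matches += 1
    return matches


def split_into_blocks(target: Sequence[int], n_blocks: int) -> List[List[int]]:
    """Split the table into equal contiguous blocks (assumed partitioning)."""
    if n_blocks <= 0 or len(target) % n_blocks != 0:
        raise ValueError("table length must be a multiple of n_blocks")
    size = len(target) // n_blocks
    return [list(target[i * size:(i + 1) * size]) for i in range(n_blocks)]


def combine_blocks(sub_circuits: Sequence[Circuit], n_low: int) -> Circuit:
    """Recombine per-block circuits; the high-order bits pick the block."""
    def circuit(bits: Sequence[int]) -> int:
        select = 0
        for i, bit in enumerate(bits[n_low:]):
            select |= bit << i
        return sub_circuits[select](bits[:n_low])
    return circuit


if __name__ == "__main__":
    # 3-input parity as a small stand-in for a "large" target.
    target = [bin(row).count("1") % 2 for row in range(8)]
    blocks = split_into_blocks(target, 2)  # two 2-input sub-targets

    # Hand-written stand-ins for circuits that each block would evolve.
    def xor2(b: Sequence[int]) -> int:
        return b[0] ^ b[1]

    def xnor2(b: Sequence[int]) -> int:
        return 1 - (b[0] ^ b[1])

    full = combine_blocks([xor2, xnor2], n_low=2)
    print(truth_table_fitness(full, target, 3))  # prints 8 (all rows match)
```

Under this assumption each block has a search space far smaller than the full table, which is one plausible reason a block-wise final stage could help when a single genetic algorithm stalls near the end of evolution.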

     
