Citation: WANG L, TIAN C G. A phased behavior recognition and detection algorithm based on multi-network fusion[J]. Microelectronics & Computer, 2023, 40(9): 45-54. doi: 10.19304/J.ISSN1000-7180.2022.0691

A phased behavior recognition and detection algorithm based on multi-network fusion

Abstract: Existing methods for recognizing examinee behavior in examination rooms cover a small detection range and achieve limited recognition accuracy. To address this, a multi-network fusion algorithm for examinee behavior recognition and detection is proposed. Examinees are located with the lightweight detection network Yolov4-Tiny, which is improved in two ways. First, the channel-spatial dual attention mechanism CBAM is embedded in the backbone to address the difficulty of recognizing small and occluded examinee targets. Second, a pyramid pooling module (PPM) is introduced after feature extraction to improve the network's ability to capture global information. The improved network is then integrated into the Alphapose human pose estimation model to extract the coordinates of each examinee's skeleton keypoints. Finally, the spatial-temporal graph convolutional network ST-GCN classifies the behavior. Experiments show that, starting from a model pretrained on the NTU-RGB+D dataset and applying transfer learning, the method reaches an average accuracy of 94.6% over four behavior classes on the examinee behavior dataset, and can effectively perform examinee behavior recognition and detection in the examination room.
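The pyramid pooling step named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the bin sizes, the function names `adaptive_avg_pool` and `pyramid_pool`, and the single-channel list-of-lists input are assumptions made for illustration, and the convolution, upsampling, and concatenation steps that a full PPM usually applies are omitted.

```python
from math import ceil, floor
from typing import List, Sequence

Grid = List[List[float]]


def adaptive_avg_pool(feature: Grid, bins: int) -> Grid:
    """Average-pool a 2D feature map into a bins x bins grid.

    Each output cell averages the input region
    [floor(i*h/bins), ceil((i+1)*h/bins)) along each axis.
    """
    h, w = len(feature), len(feature[0])
    pooled: Grid = []
    for i in range(bins):
        r0, r1 = floor(i * h / bins), ceil((i + 1) * h / bins)
        row: List[float] = []
        for j in range(bins):
            c0, c1 = floor(j * w / bins), ceil((j + 1) * w / bins)
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature: Grid, bin_sizes: Sequence[int] = (1, 2)) -> List[Grid]:
    """Pool the same map at several scales (bin sizes are an assumption).

    The coarsest level (bins=1) summarizes the whole map, which is how
    the pyramid exposes global context to later layers.
    """
    return [adaptive_avg_pool(feature, b) for b in bin_sizes]


if __name__ == "__main__":
    fmap = [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0],
    ]
    for level in pyramid_pool(fmap):
        print(level)
```

In a real network the pooled maps would typically be computed per channel on tensors, resized back to the input resolution, and concatenated with the original features.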

     
