Citation: WANG T, XUAN S B, FU M D, et al. Anomaly detection based on the memory unit and multi-scale structural similarity[J]. Microelectronics & Computer, 2023, 40(8): 28-36. doi: 10.19304/J.ISSN1000-7180.2022.0539

Anomaly detection based on the memory unit and multi-scale structural similarity

Abstract: The memory-unit-based autoencoder known as the Dynamic Prototype Unit (DPU) model does not make full use of multi-level features when detecting video anomalies, and it does not account for the structural differences between abnormal and normal events. To address these problems, an anomaly detection model that combines a multi-scale memory module with multi-scale structural similarity is proposed. The new model builds a Multi-Scale Memory Module, which uses memory units at different scales to encode the encoder-layer features and concatenates the encoded results with the decoder-layer features. This both preserves the shallow detail information of the network and promotes the diversity of normal patterns. To constrain the learning of structural information in normal events, the Multi-Scale Structural Similarity Index (MS-SSIM) error is combined with the L_1 error as the objective function, so that the event structure in the predicted video stays closer to that of normal events and the prediction error for abnormal events increases. Experiments on the standard datasets UCSD Ped1, UCSD Ped2 and Avenue show that the frame-level AUC of the proposed model is 0.8%, 3.4% and 1.0% higher, respectively, than that of the original model, and the frame rate reaches 142.9 fps.

     
