Abstract:
Traditional convolutional neural networks tend to lose much useful information during expression feature extraction and fail to extract highly discriminative expression features, which results in low expression recognition rates. To address this problem, a facial expression recognition method based on multi-scale feature fusion and an attention mechanism is proposed. First, VGGNet16 is used to extract convolutional features, and multi-scale feature fusion is performed on the output feature maps of different convolutional layers in the network, which introduces context information while extracting richer expression features. Second, to focus on key expression features, an attention mechanism is introduced into the network: the channel attention module is improved with grouped convolution operations, which learn the weights of different channels to produce attention feature maps and enhance the expressive power of the features. Third, to further improve the discriminability of the extracted expression features, an island loss function is introduced and combined with the Softmax classification loss to form a new loss function. Finally, because the fully connected layer is removed, the DropBlock strategy is introduced in the convolutional layers to prevent the network from overfitting. Experimental results show that the model achieves average accuracies of 73.32% and 97.40% on the FER2013 and CK+ datasets, respectively.
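As an illustration of how an island loss can be combined with the Softmax classification loss, the following is a minimal pure-Python sketch. It assumes the commonly cited form of island loss: a center-loss term plus a penalty on the cosine similarity between every pair of class centers. The abstract does not specify the weighting, so the names `lam` and `lambda1`, their default values, and the use of sums rather than batch means are assumptions for illustration only.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax classification loss for one sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def center_loss(features, labels, centers):
    """Half the summed squared distance from each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def island_loss(features, labels, centers, lambda1=10.0):
    """Center loss plus a penalty that pushes class centers apart.

    Each ordered pair of distinct centers contributes cos(c_j, c_k) + 1,
    which is 0 when the centers point in opposite directions and 2 when
    they coincide in direction. lambda1 is an assumed weight.
    """
    pairwise = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j != k:
                pairwise += cosine(cj, ck) + 1.0
    return center_loss(features, labels, centers) + lambda1 * pairwise

def combined_loss(logits_batch, features, labels, centers,
                  lam=0.01, lambda1=10.0):
    """Softmax loss plus lam times island loss (lam is an assumed weight)."""
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits_batch, labels))
    return ce + lam * island_loss(features, labels, centers, lambda1)
```

In this sketch the center-loss term pulls features of the same expression toward a shared center, while the pairwise cosine term separates the centers of different expressions; in a trained network the centers would be learned jointly with the features rather than supplied by hand.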