ZENG S M,WU L J. Tongue image segmentation and multi-label classification based on multi-task learning[J]. Microelectronics & Computer,2023,40(10):20-28. doi: 10.19304/J.ISSN1000-7180.2022.0841

Tongue image segmentation and multi-label classification based on multi-task learning

Abstract: Implementing tongue segmentation and single-label classification as independent tasks makes it difficult to provide the pathological feature information required for tongue diagnosis. To address this, a multi-task network framework for joint tongue segmentation and multi-label classification is proposed, in which features are extracted by shared layers. First, the shared layers adopt a lightweight encoding module combined with pyramid split attention decoding blocks to fuse the deep and shallow features of the tongue image, improving the feature extraction ability of the shared layers. Second, because there is no obvious correlation between the different labels of a tongue image, it is difficult to model label correlations, so a two-stream branch network is designed for multi-label classification: one branch uses a background masking module based on an adaptive segmentation mask to improve tongue crack recognition, and the other applies spatial pyramid pooling on top of the encoding blocks to classify the tongue coating. Finally, in early training the segmentation loss is much smaller than the classification loss, so an equal loss weighting strategy prevents the segmentation task from learning optimal parameters; an optimized uncertainty weighting strategy is therefore used to improve the performance of all tasks simultaneously. Experiments show that the multi-task learning framework can effectively optimize the tasks jointly and improve performance while extracting shared features and reducing network parameters. Compared with multi-task learning networks such as Y-Net and MT-UNet, it improves both tongue segmentation and multi-label classification performance.
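
To make the loss-weighting argument concrete, the following is a minimal sketch of the widely used uncertainty weighting form, sum_i exp(-s_i) * L_i + s_i, where s_i is a learnable log-variance per task. It is not the optimized variant proposed in the paper, whose details the abstract does not give. The three-task split (segmentation, crack recognition, coating classification), the function names, and the loss values are all illustrative assumptions.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    s_i is a log-variance per task; in a real network it would be a
    learnable parameter. This is the common simplified form, assumed here,
    not the paper's optimized strategy.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def optimal_log_vars(task_losses):
    """For fixed positive losses, d/ds [exp(-s) * L + s] = 0 gives s = ln L."""
    return [math.log(loss) for loss in task_losses]


def effective_weights(log_vars):
    """The multiplier applied to each task loss is exp(-s_i)."""
    return [math.exp(-s) for s in log_vars]


# Hypothetical early-training losses: segmentation is much smaller than
# the two classification losses (values invented for illustration).
losses = [0.05, 1.2, 0.9]  # segmentation, crack, coating

# With all s_i = 0 the formula reduces to the plain equal-weight sum.
equal = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0])

# At the per-task optimum the weight is 1 / L_i, so the small
# segmentation loss receives the largest multiplier.
s_opt = optimal_log_vars(losses)
weights = effective_weights(s_opt)
balanced = uncertainty_weighted_loss(losses, s_opt)
```

Under these assumptions the segmentation loss is scaled up relative to the classification losses rather than being swamped by them, which is the imbalance that equal weighting leaves uncorrected.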
