ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2021, Vol. 58 ›› Issue (8): 1705-1717. doi: 10.7544/issn1000-1239.2021.20210195

Special topic: 2021 Special Issue on Frontier Progress in Artificial Intelligence

• Artificial Intelligence •

基于融合多尺度标记信息的深度交互式图像分割

丁宗元1,孙权森1,王涛1,王洪元2   

  1. 1(南京理工大学计算机科学与技术学院 南京 210094);2(常州大学计算机与人工智能学院 江苏常州 213164) (dzyha2011@163.com)
  • 出版日期: 2021-08-01
  • 基金资助: 
    国家自然科学基金项目(61802188,61673220,61976028);江苏省自然科学基金项目(BK20180458);中国博士后科学基金项目(2020M681530)

Deep Interactive Image Segmentation Based on Fusion Multi-Scale Annotation Information

Ding Zongyuan1, Sun Quansen1, Wang Tao1, Wang Hongyuan2   

  1. 1(School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094);2(School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu 213164) (dzyha2011@163.com)
  • Online: 2021-08-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61802188, 61673220, 61976028), the Natural Science Foundation of Jiangsu Province (BK20180458), and the China Postdoctoral Science Foundation (2020M681530).

摘要: 现有深度交互式图像分割算法通过对单击点计算距离映射或者高斯映射,然后将其与图像进行拼接作为网络的输入.每个单击点的影响范围是相同的,而每个交互的目的并不相同,早期交互的主要目的为选择,后期则更侧重微调.基于此,提出了融合多尺度标记信息的深度交互图像分割算法.首先,通过设置不同高斯半径,对每个单击点计算2组不同尺度的高斯映射.然后,融合小尺度高斯映射,并移除基础分割网络中的部分下采样模块,使网络提取更丰富的细节特征.同时,为了保持目标分割结果的完整性,提出了非局部特征注意力模块,该模块融合了大尺度高斯映射.最后,根据高斯映射提供的概率信息,提出了概率单击损失,提升目标在单击附近的分割表现.实验结果表明:提出的算法既能保持分割的完整性,又能得到目标细节的分割结果,大大降低了用户的交互负担.

关键词: 交互式图像分割, 深度学习, 多尺度标记, 高斯映射, 概率单击损失

Abstract: Existing deep interactive image segmentation algorithms compute distance maps or Gaussian maps from the user's clicks and concatenate them with the image as the network input. Each click has the same range of influence, yet the purpose of each interaction differs: early interactions mainly select the target, while later ones focus on fine-tuning. To address this, a deep interactive image segmentation algorithm that fuses multi-scale annotation information is proposed. First, two groups of Gaussian maps at different scales are computed for each click by setting different Gaussian radii. Second, the small-scale Gaussian maps are fused and some down-sampling modules are removed from the base segmentation network, so that the network extracts richer detail features. At the same time, to preserve the integrity of the segmented target, a non-local feature attention module that fuses the large-scale Gaussian maps is proposed. Finally, based on the probability information provided by the Gaussian maps, a probability click loss is proposed to improve segmentation performance near the click points. Experimental results show that the proposed algorithm both maintains the integrity of the segmentation and recovers target details, greatly reducing the user's interaction burden.

Key words: interactive image segmentation, deep learning, multi-scale annotation, Gaussian maps, probability click loss
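
The multi-scale click encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a per-pixel maximum to combine several clicks, and the two sigma values are assumptions chosen only for illustration. In practice, positive and negative clicks would presumably each be encoded this way.

```python
import numpy as np

def gaussian_click_map(shape, clicks, sigma):
    """Return an HxW map in [0, 1] built from Gaussians centred on the clicks.

    Combining clicks with a per-pixel maximum is an assumption of this sketch.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]          # pixel coordinate grids
    out = np.zeros((h, w), dtype=np.float64)
    for cy, cx in clicks:                # clicks are (row, col) pairs
        d2 = (ys - cy) ** 2 + (xs - cx) ** 2
        out = np.maximum(out, np.exp(-d2 / (2.0 * sigma ** 2)))
    return out

def multi_scale_click_maps(shape, clicks, sigma_small=5.0, sigma_large=20.0):
    """Encode the same clicks at two scales (sigma values are hypothetical)."""
    small = gaussian_click_map(shape, clicks, sigma_small)   # local detail cue
    large = gaussian_click_map(shape, clicks, sigma_large)   # wide selection cue
    return small, large

small, large = multi_scale_click_maps((64, 64), [(32, 32)])
```

Both maps peak at the click, but the large-scale map decays more slowly, so it carries a broad "select this object" signal while the small-scale map localizes fine corrections.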

CLC number: