A Survey on Algorithm Research of Scene Parsing Based on Deep Learning

Zhang Rui; Li Jintao

doi:10.7544/issn1000-1239.2020.20190513

Abstract

Scene parsing aims to predict the category of each pixel in a scene image. It is one of the fundamental problems in computer vision, is important for analyzing and understanding scene images, and has wide application in fields such as autonomous driving, video surveillance, and augmented reality. In recent years, scene parsing based on deep learning has made breakthrough progress, achieving substantially higher parsing accuracy than traditional scene parsing algorithms. This survey first analyzes and describes the three main difficulties of scene parsing: fine parsing granularity, diverse scale variation, and strong spatial correlation. It then focuses on the "convolution-deconvolution" structure adopted by most current deep learning based scene parsing algorithms. On this basis, it reviews the deep learning based scene parsing algorithms proposed in recent years, grouped by the difficulty each one addresses: algorithms based on high-resolution semantic feature maps, algorithms based on multi-scale information, and algorithms based on spatial context. It then briefly introduces the commonly used public scene parsing datasets. Finally, it summarizes the research on deep learning based scene parsing and discusses its future prospects.
