Abstract:
With the growing popularity of three-dimensional (3D) scanning devices such as depth cameras and LiDARs, point clouds have become a ubiquitous representation of 3D data. Compared with two-dimensional (2D) images, point clouds provide richer information and capture more 3D structure. Point cloud learning has therefore attracted a surge of research interest in the computer vision community and has promoted various emerging applications, such as robotic manipulation, autonomous driving and augmented reality. In general, learned representations of point clouds should be invariant to permutation and to transformations (e.g., rotation and translation) while remaining able to distinguish shapes. In recent years, a growing number of researchers have used deep learning (DL) to process point clouds. Among DL techniques, the convolution operations in convolutional neural networks (CNNs) offer weight sharing, local aggregation and transformation invariance, which effectively reduce network complexity and the number of trainable parameters. CNNs have also been used successfully, and with strong robustness, to solve various 2D vision problems on images and videos. As a result, CNNs have attracted great attention and have been introduced into a number of point cloud tasks. However, standard convolution operations cannot be applied directly to irregular data such as point clouds. Researchers have therefore explored convolution operations in depth and proposed a variety of convolutional strategies and networks to improve computational efficiency and algorithm performance. To stimulate future research, we first summarize the convolutional methods used in existing point cloud research, including projection-based, voxel-based, lattice-based, graph-based and point-based methods.
We then focus on recent progress in convolution operators and networks for point clouds, which mainly comprise discrete convolutions and continuous convolutions. In addition, we comprehensively analyze the performance of networks that use various point-based convolution operators on related tasks such as classification and segmentation. We quantitatively compare these methods on synthetic and real-scanned datasets and identify the relative state-of-the-art (SOTA) methods for each point cloud task. Extensive experiments verify the performance and effectiveness of these methods. Finally, in view of existing problems and challenges, we present observations and suggest directions for future research.
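
As a rough illustration of the invariance requirements and of what a continuous point convolution can look like, the following minimal sketch may help. It is not taken from any surveyed method: the function name `continuous_point_conv`, the tiny two-layer kernel function and its random weights `W1` and `W2` are all assumptions made for this example. Kernel weights are computed from the relative offsets of neighbors, and the weighted features are summed, so the result should not depend on neighbor ordering or on a global translation.

```python
import numpy as np

def continuous_point_conv(center, neighbors, feats, W1, W2):
    """Toy continuous convolution at one point (hypothetical, for illustration).

    center:    (3,)   coordinates of the query point
    neighbors: (K, 3) coordinates of its K neighbors
    feats:     (K, C) features of the neighbors
    W1, W2:    weights of a small kernel function mapping offsets to (C,) weights
    """
    offsets = neighbors - center           # relative positions, translation-free
    hidden = np.tanh(offsets @ W1)         # (K, H) hidden layer of the kernel function
    kernel = hidden @ W2                   # (K, C) per-neighbor, per-channel weights
    return (kernel * feats).sum(axis=0)    # symmetric sum over neighbors -> (C,)

# Small demonstration with random data (all values are illustrative).
rng = np.random.default_rng(0)
center = rng.normal(size=3)
neighbors = rng.normal(size=(8, 3))
feats = rng.normal(size=(8, 4))
W1 = rng.normal(size=(3, 16))
W2 = rng.normal(size=(16, 4))

out = continuous_point_conv(center, neighbors, feats, W1, W2)

# Reordering the neighbors (and their features) leaves the output unchanged.
perm = rng.permutation(8)
out_perm = continuous_point_conv(center, neighbors[perm], feats[perm], W1, W2)

# Translating the whole neighborhood leaves the output unchanged.
shift = np.array([1.0, -2.0, 0.5])
out_shift = continuous_point_conv(center + shift, neighbors + shift, feats, W1, W2)

print(np.allclose(out, out_perm), np.allclose(out, out_shift))
```

The sketch does not address rotation invariance, and a practical operator would also need a neighbor search and learned kernel weights.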