Abstract:
3D shape reconstruction aims to recover the 3D structure of a scene from an image sequence captured at different focus levels. Most existing 3D shape reconstruction methods evaluate the focus level of the image sequence at a single scale and guide the reconstruction process through regularization or post-processing. Because the space from which depth information can be selected is limited, the reconstruction results often fail to converge effectively. To address this issue, this paper proposes MSCAS, a multi-scale cost aggregation framework for shape from focus (SFF). First, a non-downsampling multi-scale transformation is introduced to enlarge the depth-information selection space of the input image sequence; cost aggregation is then performed by combining the intra-scale sequence correlation with the inter-scale information constraint. This expansion-aggregation mode achieves both a doubling of the scene depth representation information and an effective fusion of cross-scale and cross-sequence representation information. As a general framework, MSCAS can embed existing model-design-based methods and deep learning methods to improve their performance. Experimental results on four datasets show that, when embedded in model-design-based SFF methods, MSCAS reduces the root mean square error (RMSE) by an average of 14.91% and improves the structural similarity (SSIM) by 56.69%. When embedded in deep learning SFF methods, it reduces the RMSE by an average of 1.55% and increases the SSIM by an average of 1.61%. These results verify the effectiveness of the MSCAS framework.
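The expansion-aggregation idea can be illustrated with a minimal, dependency-free sketch. It is not the MSCAS implementation: the function names (`laplacian_focus`, `box_smooth`, `multiscale_depth`) and the parameters `radii` and `weights` are hypothetical, a full-resolution box filter stands in for the non-downsampling multi-scale transformation, and a weighted sum across scales stands in for the cost aggregation. The intra-scale sequence correlation is omitted entirely.

```python
"""Toy shape-from-focus with multi-scale cost aggregation (illustrative only)."""


def laplacian_focus(img):
    """Per-pixel focus measure: absolute 4-neighbour Laplacian, borders clamped."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest valid neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[abs(4 * px(y, x) - px(y - 1, x) - px(y + 1, x)
                 - px(y, x - 1) - px(y, x + 1))
             for x in range(w)] for y in range(h)]


def box_smooth(img, radius):
    """Box filter at full resolution (no downsampling); radius 0 copies the input."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(y - radius, 0), min(y + radius, h - 1) + 1)
            xs = range(max(x - radius, 0), min(x + radius, w - 1) + 1)
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def multiscale_depth(stack, radii=(0, 1, 2), weights=None):
    """Return, per pixel, the index of the best-focused frame in `stack`.

    stack: list of 2D images (lists of lists), one per focus setting.
    radii: smoothing radii, one per scale (assumed values, not from the paper).
    weights: per-scale weights; defaults to a uniform average.
    """
    weights = weights or [1.0 / len(radii)] * len(radii)
    h, w = len(stack[0]), len(stack[0][0])
    # Expansion: one focus volume per scale, every scale kept at full resolution.
    volumes = [[box_smooth(laplacian_focus(frame), r) for frame in stack]
               for r in radii]
    # Aggregation: weighted sum of the per-scale focus values, then argmax over frames.
    depth = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            cost = [sum(wt * vol[k][y][x] for wt, vol in zip(weights, volumes))
                    for k in range(len(stack))]
            depth[y][x] = max(range(len(stack)), key=cost.__getitem__)
    return depth
```

For example, calling `multiscale_depth([flat, sharp, flat])` with a constant image `flat` and a checkerboard image `sharp` of the same size selects frame index 1 at every pixel, since only the checkerboard frame has a non-zero focus response.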