    Ai Haojun, Zhang Feng, Lü Pengfei, Tang Xuehua, Wang Zhongyuan. Improving Self-Supervised Monocular Indoor Depth Estimation with Local Feature Guidance[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440951

    Improving Self-Supervised Monocular Indoor Depth Estimation with Local Feature Guidance

    Self-supervised monocular depth estimation has improved markedly in recent years, but its performance degrades significantly in complex indoor scenes, where it struggles to produce well-structured depth maps. To bridge this gap, we focus on the training process and propose LoFtDepth, a method that combines self-supervised monocular depth estimation with local-feature-guided knowledge distillation. First, an off-the-shelf depth estimation network generates structured relative depth maps that serve as depth priors. Local features are then extracted from these priors as boundary points to guide local depth refinement, which reduces interference from depth-irrelevant features and transfers the boundary knowledge of the depth priors to the self-supervised depth estimation network. Second, we introduce an inverse auto-mask weighted surface normal loss, which encourages the normal directions of the depth maps predicted by the self-supervised network to align with those of the depth priors in untextured regions, thereby improving depth estimation accuracy. Finally, exploiting the coherence of camera motion, we impose a pose consistency constraint on residual pose estimation. This constraint helps the model adapt to indoor scenes where camera poses change frequently, mitigating training errors and improving performance. Extensive experiments on major indoor datasets show that LoFtDepth outperforms previous methods, reducing the absolute relative error to 0.121 and producing accurate, well-structured depth maps.
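
The headline result above is reported as an absolute relative error. The abstract does not define the metric, so the following is only a minimal sketch that assumes the definition commonly used in depth estimation: the mean of |predicted - ground truth| / ground truth over pixels with valid ground truth. The function name `abs_rel` and the flat-list input format are illustrative assumptions, not taken from the paper.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error between predicted and ground-truth depths.

    Assumed definition (not taken from the paper): mean(|pred - gt| / gt),
    computed only over pixels whose ground-truth depth is positive.
    Both arguments are flat sequences of depth values of equal length.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Keep only pixels with a valid (positive) ground-truth depth.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the paper.
    predicted = [1.1, 2.0, 3.3]
    ground_truth = [1.0, 2.0, 3.0]
    print(f"AbsRel = {abs_rel(predicted, ground_truth):.4f}")
```

Under this definition, a value of 0.121 would mean that predicted depths deviate from the ground truth by about 12% on average.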