Paper arXiv Video Code Data

 

teaser

Consistent Depth without any post-processing. KITTI dataset1

Abstract

Self-supervised monocular depth estimation approaches suffer not only from scale ambiguity but also infer temporally inconsistent depth maps w.r.t. scale.

Enforcing_Temporal_Consistency_in_Video_Depth_Estimation

Two depth estimation results on four frames showing the inconsistency problem in existing MDE methods. Image taken from Li et. al.2

While disambiguating scale during training is not possible without some kind of ground truth supervision, having scale consistent depth predictions would make it possible to calculate scale once during inference as a post-processing step and use it over-time. With this as a goal, a set of temporal consistency losses that minimize pose inconsistencies over time are introduced. Evaluations show that introducing these constraints not only reduces depth inconsistencies but also improves the baseline performance of depth and ego-motion prediction.

teaser

Scale factors (GTpred\frac{\text{GT}}{\text{pred}}) within each KITTI sequences are highly varying

Introduction

For an introduction to self-supervised monocular depth estimation, checkout my previous blog post on Self-supervised Monocular Depth Estimation


BibTeX

@InProceedings{10.1007/978-3-031-31438-4_23,
author="Suri, Zeeshan Khan",
editor="Gade, Rikke and Felsberg, Michael and K{\"a}m{\"a}r{\"a}inen, Joni-Kristian",
title="Pose Constraints for Consistent Self-supervised Monocular Depth and Ego-Motion",
booktitle="Image Analysis",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="340--353",
isbn="978-3-031-31438-4",
doi={10.1007/978-3-031-31438-4_23}
}
Paper arXiv Video Code Data

References:

  1. A Geiger, P Lenz, C Stiller, and R Urtasun. 2013. Vision meets robotics: The KITTI dataset. Int. J. Rob. Res. 32, 11 (September 2013), 1231–1237. https://doi.org/10.1177/0278364913491297 

  2. Li, S., Luo, Y., Zhu, Y., Zhao, X., Li, Y., Shan, Y.: Enforcing temporal consistency in video depth estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops. pp. 1145–1154 (October 2021)