
doi: 10.1109/iccv51701.2025.01829 , 10.48550/arxiv.2509.15224 , 10.5281/zenodo.17672407 , 10.5281/zenodo.17672408
arXiv: 2509.15224
handle: 11585/1049131
Event cameras capture sparse, high-temporal-resolution visual information, making them particularly suitable for challenging environments with high-speed motion and strongly varying lighting conditions. However, the lack of large datasets with dense ground-truth depth annotations hinders learning-based monocular depth estimation from event data. To address this limitation, we propose a cross-modal distillation paradigm that generates dense proxy labels by leveraging a Vision Foundation Model (VFM). Our strategy requires only an event stream spatially aligned with RGB frames, a simple setup that is even available off-the-shelf, and exploits the robustness of large-scale VFMs. Additionally, we propose to adapt VFMs to infer depth from monocular event cameras, either by using a vanilla model such as Depth Anything v2 (DAv2) or by deriving from it a novel recurrent architecture. We evaluate our approach on synthetic and real-world datasets, demonstrating that i) our cross-modal paradigm achieves competitive performance compared to fully supervised methods without requiring expensive depth annotations, and ii) our VFM-based models achieve state-of-the-art performance.
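The data flow of the cross-modal distillation paradigm can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the event representation (per-pixel polarity accumulation), the L1 loss, and every name below (`events_to_frame`, `distillation_loss`, `toy_teacher`, `toy_student`) are assumptions chosen for illustration. In practice the teacher would be a frozen VFM such as DAv2 applied to the RGB frame, and the student would be the event-based depth network being trained.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]               # H x W grid of floats
Event = Tuple[int, int, float, int]    # (x, y, timestamp, polarity in {-1, +1})
RGBImage = List[List[Tuple[float, float, float]]]


def events_to_frame(events: Sequence[Event], height: int, width: int) -> Grid:
    """Accumulate signed polarities per pixel (assumed, very simple event encoding)."""
    frame = [[0.0] * width for _ in range(height)]
    for x, y, _t, polarity in events:
        frame[y][x] += float(polarity)
    return frame


def l1_loss(pred: Grid, target: Grid) -> float:
    """Mean absolute error between two grids (assumed loss, for illustration only)."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    return total / count


def distillation_loss(
    teacher: Callable[[RGBImage], Grid],
    student: Callable[[Grid], Grid],
    rgb: RGBImage,
    events: Sequence[Event],
) -> float:
    """Supervise the event-based student with the teacher's dense proxy label."""
    height, width = len(rgb), len(rgb[0])
    proxy_depth = teacher(rgb)  # dense proxy label from the spatially aligned RGB frame
    pred_depth = student(events_to_frame(events, height, width))
    return l1_loss(pred_depth, proxy_depth)


# Hypothetical stand-ins so the sketch runs end to end.
def toy_teacher(rgb: RGBImage) -> Grid:
    return [[sum(pixel) / 3.0 for pixel in row] for row in rgb]


def toy_student(frame: Grid) -> Grid:
    return [[abs(value) for value in row] for row in frame]


rgb = [[(0.0, 0.0, 0.0), (3.0, 3.0, 3.0)], [(1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]]
events = [(1, 0, 0.0, 1), (1, 0, 0.1, 1), (0, 1, 0.2, -1)]
loss = distillation_loss(toy_teacher, toy_student, rgb, events)
```

The point of the sketch is that no ground-truth depth enters the loss: the only supervision signal is the teacher's prediction on the aligned RGB frame.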
ICCV 2025. Code: https://github.com/bartn8/depthanyevent/ Project Page: https://bartn8.github.io/depthanyevent/
event-cameras; depth estimation; Monocular Depth Estimation, FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), CNMS MOST
