Detail View
Abstract
Despite advancements in deep learning-based Monocular Depth Estimation (MDE), applying these models to video sequences remains challenging due to geometric ambiguities in texture-less regions and temporal instability caused by independent per-frame inference. To address these limitations, we propose STF-Depth, a novel post-processing framework that enhances depth quality by logically fusing heterogeneous information (geometric, semantic, and panoptic) without requiring additional retraining. Our approach introduces a robust RANSAC-based Vanishing Point Estimation to guide Dynamic Depth Gradient Correction for background separation, alongside Adaptive Instance Re-ordering to clarify occlusion relationships. Experimental results on the KITTI, NYU Depth V2, and TartanAir datasets demonstrate that STF-Depth functions as a universal plug-and-play module. Notably, it achieved a 25.7% reduction in Absolute Relative error (AbsRel) and significantly enhanced temporal consistency compared to state-of-the-art backbone models. These findings confirm the framework's practicality for real-world applications requiring geometric precision and video stability, such as autonomous driving, robotics, and augmented reality (AR).
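For reference, AbsRel is the standard mean absolute relative depth error reported on KITTI- and NYU-Depth-V2-style benchmarks. The snippet below is a minimal sketch of how it is conventionally computed; the function name, the NumPy implementation, and the validity masking are illustrative assumptions, not code from the paper.

```python
import numpy as np

def abs_rel(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Absolute Relative error (AbsRel); lower is better.

    Illustrative helper, not code from the paper. Only pixels with a
    valid (positive) ground-truth depth are scored, as is conventional
    for sparse ground truth such as KITTI LiDAR depth maps.
    """
    valid = gt > 0  # skip pixels with no ground-truth reading
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# The abstract's 25.7% reduction means the score drops to 0.743x its
# previous value, e.g. a backbone at 0.105 would fall to about 0.078.
```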
- Title: Semantic-Guided Spatial and Temporal Fusion Framework for Enhancing Monocular Video Depth Estimation
- Authors: Kim, Hyunsu; Lee, Yeongseop; Ko, Hyunseong; Jeong, Junho; Son, Yunsik
- Issue Date: 2025-12
- Type: Article
- Journal: Applied Sciences
- Volume: 16
- Issue: 1
- Pages: 1-26