What Is YOLO-Depth?
Blog post from Roboflow
YOLO-Depth, part of the YOLO27 generation, is an upcoming monocular depth estimation model that predicts per-pixel distance from a single camera. It is set to release in September 2026. Previous YOLO models focus on object detection and classification; YOLO-Depth adds a third dimension by estimating how far away each object is, which supports spatial decision-making without additional sensors. This is particularly useful for forklift proximity alerts, social distancing monitoring, and robotic grasping, where the distance to an object drives the decision. A main advantage is cost: existing single RGB cameras can provide 3D understanding. It is not yet known, however, whether the model will output metric depth (absolute distances in real-world units) or only relative depth (which pixels are nearer or farther). YOLO-Depth's benchmarks, model sizes, and licensing terms are also still unknown. In the meantime, alternatives such as the Depth Anything 3 model and RF-DETR with a Depth Estimation block are available for similar tasks. If it delivers, YOLO-Depth would extend per-frame understanding from what is in the image to where it sits in space, which matters for applications across many industries.
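To make the proximity-alert idea concrete, here is a minimal sketch of how a per-pixel depth map could be combined with detection boxes to get one distance per object. It does not use any YOLO-Depth API, since none has been published. The function names, the toy depth map, and the box format `(x1, y1, x2, y2)` are all hypothetical, and the sketch assumes the depth values are metric (meters), which is exactly the point that remains unconfirmed for YOLO-Depth.

```python
from statistics import median

def object_distance(depth_map, box):
    """Return the median depth inside a box (x1, y1, x2, y2); x2 and y2 are exclusive."""
    x1, y1, x2, y2 = box
    values = [depth_map[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not values:
        raise ValueError("box covers no pixels")
    return median(values)

def proximity_alerts(depth_map, detections, threshold_m):
    """Return (label, distance) for every detection closer than threshold_m."""
    alerts = []
    for label, box in detections:
        distance = object_distance(depth_map, box)
        if distance < threshold_m:
            alerts.append((label, distance))
    return alerts

# Hypothetical 4x6 depth map, values assumed to be meters.
depth_map = [
    [9.0, 9.0, 9.0, 9.0, 9.0, 9.0],
    [9.0, 1.5, 1.6, 9.0, 6.0, 6.2],
    [9.0, 1.4, 1.5, 9.0, 6.1, 6.0],
    [9.0, 9.0, 9.0, 9.0, 9.0, 9.0],
]
# Hypothetical detections: (label, box) pairs from any object detector.
detections = [("person", (1, 1, 3, 3)), ("pallet", (4, 1, 6, 3))]

alerts = proximity_alerts(depth_map, detections, threshold_m=2.0)
print(alerts)
```

The median is one reasonable choice here because a bounding box usually includes some background pixels, and a median is less affected by them than a mean. If a model outputs only relative depth, a threshold in meters is not meaningful; the same per-box values could still be used to rank objects from nearest to farthest.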