How to Use Segment Anything Model 3 (SAM 3) with FiftyOne
Blog post from Voxel51
Meta's Segment Anything Model 3 (SAM 3), released on November 19, 2025, detects, segments, and tracks objects in images and videos from concept prompts. Unlike its predecessors, which relied on visual prompts such as points and boxes, SAM 3 adds open-vocabulary understanding: it can segment any concept described in natural language, turning manual segmentation workflows into text-prompt-driven ones.

The model uses a unified architecture with three main components: a high-capacity Meta Perception Encoder that embeds both text and images, a DETR-based promptable detector, and a memory-based video tracker that propagates object identities across video frames. Together these let SAM 3 detect, segment, and track objects end to end.

These capabilities carry over to real-world applications such as medical imaging, retail inventory management, and autonomous vehicles. SAM 3 can also be fine-tuned on domain-specific tasks, and it integrates with tools like FiftyOne for dataset management and visualization, which is essential for getting the most out of SAM 3 in production environments.
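To make the FiftyOne integration concrete, here is a minimal sketch of prompting an open-vocabulary segmentation model from FiftyOne. The zoo model name `"segment-anything-3"` and the `classes=` keyword are assumptions for illustration, modeled on how FiftyOne loads other zero-shot zoo models; check the FiftyOne model zoo documentation for the actual SAM 3 identifier and parameters.

```python
# Sketch: concept-prompted segmentation in FiftyOne (names are assumptions).
try:
    import fiftyone as fo
    import fiftyone.zoo as foz
    HAVE_FIFTYONE = True
except ImportError:
    # Allow the sketch to run without FiftyOne installed
    HAVE_FIFTYONE = False

# Natural-language concept prompts for open-vocabulary segmentation
CONCEPT_PROMPTS = ["striped cat", "traffic cone", "person wearing a helmet"]

def label_field_for(prompt: str) -> str:
    """Derive a valid FiftyOne field name from a concept prompt."""
    return "sam3_" + prompt.replace(" ", "_")

if HAVE_FIFTYONE:
    # Small demo dataset shipped with the FiftyOne zoo
    dataset = foz.load_zoo_dataset("quickstart")

    # Hypothetical zoo name; `classes=` mirrors other zero-shot zoo models
    model = foz.load_zoo_model("segment-anything-3", classes=CONCEPT_PROMPTS)

    # Run the model and store predictions on each sample
    dataset.apply_model(model, label_field="sam3_predictions")

    # Visualize the segmentations in the FiftyOne App
    session = fo.launch_app(dataset)
```

Storing predictions in a named label field keeps each prompt run inspectable and comparable inside the FiftyOne App, which is where the dataset-management benefits mentioned above come in.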