How to Label Data with Grounded SAM 2
Blog post from Roboflow
Meta AI recently introduced Segment Anything 2 (SAM 2), a model for image and video segmentation that can generate masks either for specified points or for every object in an image. SAM 2 does not label what it segments, but it can be paired with Florence-2, a multimodal model, to produce segmentation masks from text prompts. The blog post is a guide to labeling computer vision data with Grounded SAM 2 (a combination of SAM 2 and Florence-2), using the Autodistill framework to auto-label data for training smaller models such as YOLOv8. It walks readers through preparing a dataset, testing prompts, auto-labeling the data, and training a model with Roboflow. It stresses that label accuracy depends on finding prompts that reliably identify the target objects, and it shows how to use Roboflow's tools for annotation and model training. The process ends with deploying the trained model, either through the Roboflow API or on your own hardware with Roboflow Inference.
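
To make the auto-labeling step more concrete, here is a minimal sketch of the idea behind it: each text prompt is mapped to a class label, and each detection is written as a YOLO-format annotation line (`class_id x_center y_center width height`, normalized to the image size). This is not code from the post and does not call Autodistill, SAM 2, or Florence-2. The `Detection` class, the `to_yolo_lines` helper, the example prompts, and the box coordinates are all hypothetical stand-ins for what a grounded model would return.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in for one model prediction."""
    prompt: str                              # text prompt that matched
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels


def to_yolo_lines(
    detections: List[Detection],
    ontology: Dict[str, str],
    image_width: int,
    image_height: int,
) -> List[str]:
    """Convert detections to YOLO detection label lines.

    `ontology` maps a text prompt to the class label to save
    (an assumption about how prompts and labels are paired).
    """
    # Class ids follow the order in which labels first appear in the ontology.
    class_names = list(dict.fromkeys(ontology.values()))
    lines = []
    for det in detections:
        class_id = class_names.index(ontology[det.prompt])
        x_min, y_min, x_max, y_max = det.box
        # YOLO stores the box centre and size as fractions of the image size.
        x_center = (x_min + x_max) / 2 / image_width
        y_center = (y_min + y_max) / 2 / image_height
        width = (x_max - x_min) / image_width
        height = (y_max - y_min) / image_height
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


if __name__ == "__main__":
    # Illustrative prompts and a hand-written box, not real model output.
    ontology = {"shipping container": "container", "truck": "vehicle"}
    detections = [Detection(prompt="truck", box=(100, 50, 300, 250))]
    for line in to_yolo_lines(detections, ontology, image_width=400, image_height=500):
        print(line)
```

Keeping the prompt separate from the saved label is what makes prompt testing cheap: the prompt can be reworded until detections look right without changing the class names in the resulting dataset.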