Using Public Datasets to Improve your Computer Vision Models

Post Details

Company

Roboflow

Date Published

Jan. 6, 2021

Author

Brad Dwyer

Word Count

749

Language

English

Hacker News Points

-

Source URL

blog.roboflow.com/using-public-datasets

Summary

Adding training data is one of the most reliable ways to improve a computer vision model, and public datasets are a convenient source of it. In the post's example, a car detection model initially trained on the COCO dataset was supplemented with car images from OpenImages, which increased its mean average precision (mAP) by 7.5% and shows the value of expanding a dataset with varied examples. Larger datasets may see diminishing returns, but strategically selecting relevant images can still yield improvements. Adding images of objects that the model often misidentifies can further reduce confusion and refine performance. Public datasets like COCO and OpenImages provide labeled data across numerous classes, which makes them a practical resource for improving model robustness and precision.
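
The workflow summarized above (pull one class out of a public dataset, then fold it into an existing training set) can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the post: it assumes both datasets have already been exported to COCO-format JSON (top-level "images", "annotations", and "categories" lists) and that class names match exactly between them. The helper names filter_coco_by_class and merge_coco are hypothetical.

```python
from typing import Any

Coco = dict[str, Any]  # a parsed COCO-format annotation file


def filter_coco_by_class(coco: Coco, class_name: str) -> Coco:
    """Keep only the images and annotations that belong to one class."""
    cat_ids = {c["id"] for c in coco["categories"] if c["name"] == class_name}
    annotations = [a for a in coco["annotations"] if a["category_id"] in cat_ids]
    image_ids = {a["image_id"] for a in annotations}
    return {
        "images": [img for img in coco["images"] if img["id"] in image_ids],
        "annotations": annotations,
        "categories": [c for c in coco["categories"] if c["id"] in cat_ids],
    }


def merge_coco(base: Coco, extra: Coco) -> Coco:
    """Append `extra` to `base`, renumbering ids so they do not collide.

    Assumption: categories are matched by exact name; unseen names are added.
    """
    merged: Coco = {
        "images": list(base["images"]),
        "annotations": list(base["annotations"]),
        "categories": list(base["categories"]),
    }

    # Map category ids in `extra` onto ids in `base`, matching by name.
    name_to_id = {c["name"]: c["id"] for c in merged["categories"]}
    next_cat_id = max(name_to_id.values(), default=0) + 1
    cat_map = {}
    for cat in extra["categories"]:
        if cat["name"] not in name_to_id:
            name_to_id[cat["name"]] = next_cat_id
            merged["categories"].append({**cat, "id": next_cat_id})
            next_cat_id += 1
        cat_map[cat["id"]] = name_to_id[cat["name"]]

    # Give the incoming images fresh ids that follow the existing ones.
    next_img_id = max((i["id"] for i in merged["images"]), default=0) + 1
    img_map = {}
    for offset, img in enumerate(extra["images"]):
        img_map[img["id"]] = next_img_id + offset
        merged["images"].append({**img, "id": next_img_id + offset})

    # Re-point each incoming annotation at the new image and category ids.
    next_ann_id = max((a["id"] for a in merged["annotations"]), default=0) + 1
    for offset, ann in enumerate(extra["annotations"]):
        merged["annotations"].append({
            **ann,
            "id": next_ann_id + offset,
            "image_id": img_map[ann["image_id"]],
            "category_id": cat_map[ann["category_id"]],
        })
    return merged
```

With these helpers, merge_coco(base, filter_coco_by_class(extra, "car")) would add only the car images from the second dataset. The same filter could be pointed at a class the model tends to confuse with the target, so that those images can be added as well.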