Create a Dataset Version

Blog post from Roboflow

Post Details
Company: Roboflow
Author: -
Word Count: 260
Language: English
Hacker News Points: -
Summary

Dataset versioning is crucial for reproducibility and reliability in machine learning projects. A version captures a snapshot of the dataset at a specific point in time, including the images and the preprocessing and augmentation steps applied to them, so that models and frameworks can be compared against exactly the same data. Once created, a version is immutable: later changes to the project do not affect the data as it was at the time of versioning. To create a dataset version in Roboflow, navigate to the "Versions" section of the project and generate a new version, adjusting the train/validation/test splits if needed and specifying any preprocessing and augmentations. The resulting version can be used directly for model training in Roboflow or exported for manual training.
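
To make the idea of an immutable snapshot concrete, here is a minimal Python sketch. It is not Roboflow's implementation or API: the `DatasetVersion` class, the `create_version` helper, the percentage-based split, and the preprocessing and augmentation labels are all hypothetical names chosen for illustration. It only shows how a version can freeze the image list, the split, and the chosen steps so that later project changes leave it untouched.

```python
from dataclasses import dataclass, FrozenInstanceError
import random


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical immutable snapshot of a dataset (illustration only)."""
    train: tuple
    valid: tuple
    test: tuple
    preprocessing: tuple
    augmentation: tuple


def create_version(images, split=(70, 20, 10), preprocessing=(),
                   augmentation=(), seed=0):
    """Copy the current image list into a frozen train/valid/test snapshot.

    `split` holds integer percentages (an assumption of this sketch) so the
    split sizes are computed with exact integer arithmetic.
    """
    if sum(split) != 100:
        raise ValueError("split percentages must sum to 100")
    shuffled = sorted(images)               # copy, independent of the caller's list
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is reproducible
    n_train = len(shuffled) * split[0] // 100
    n_valid = len(shuffled) * split[1] // 100
    return DatasetVersion(
        train=tuple(shuffled[:n_train]),
        valid=tuple(shuffled[n_train:n_train + n_valid]),
        test=tuple(shuffled[n_train + n_valid:]),
        preprocessing=tuple(preprocessing),
        augmentation=tuple(augmentation),
    )


# Hypothetical project with ten images.
project_images = [f"img_{i:03d}.jpg" for i in range(10)]
v1 = create_version(
    project_images,
    preprocessing=("resize:640x640",),
    augmentation=("flip:horizontal",),
)

# A later change to the project does not alter the existing version.
project_images.append("img_010.jpg")

# Attempting to modify the version itself raises an error.
try:
    v1.train = ()
except FrozenInstanceError:
    print("versions are immutable")
```

In this sketch, picking up the newly added image means calling `create_version` again to produce a second version, which mirrors the workflow described above: versions are never edited, only superseded by new ones.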