Company
Date Published
Author
Dr. Andreas Heindl
Word count
1456
Language
English
Hacker News points
None

Summary

The quality of a dataset directly affects how well a model performs in training and in production, especially in medical imaging. Open-source medical imaging datasets can give artificial intelligence start-ups the data they need to develop their first diagnostic model and take it into production. These datasets are useful because they are often ready to be labeled, rarely contain identifiable patient data, and come with metadata that is valuable for researchers and healthcare providers. Examples of high-quality open-source medical imaging datasets include MedPix, The Cancer Imaging Archive (TCIA), the National COVID-19 Chest Imaging Database (NCCID), the COVID-19 Image Dataset on Kaggle, CT Medical Images on Kaggle, the OASIS datasets, MURA, re3data, the NIH Deep Lesion Dataset, and the NIH Chest X-Ray Dataset. These datasets are available for free or through collaborative platforms like Encord Annotate, which streamlines collaboration between medical professionals, machine learning teams, and annotators to accelerate the labeling of medical imaging data.