SynthVision: Building a 110K Synthetic Medical VQA Dataset with Cross-Model Validation

Post Details

Company

Hugging Face

Date Published

March 23, 2026

Author

Maziyar Panahi, merve, Jamie@Doubleword, Josh, Seb Ringrose, and Fergus Finn

Word Count

3,730

Company Posts That Month

63

Language

-

Hacker News Points

-

Post removed?

No

Source URL

huggingface.co/blog/OpenMed/synthvision

Summary

SynthVision is a collaboration among OpenMed, Hugging Face, and Doubleword that produced a synthetic medical Visual Question Answering (VQA) dataset of 110,000 records from 119,000 annotated medical images. The dataset was built with two vision-language models (Qwen 3.5 and Kimi K2.5), reached 93% cross-validation agreement, and cost under $500 to develop. The initiative aims to address the limited size and scope of existing medical VQA datasets, such as VQA-RAD, by using knowledge distillation to transfer knowledge from large models to smaller ones. The project used Doubleword's API for batch annotation and cross-validation, then fine-tuned three small models (2-3 billion parameters), which improved performance across benchmarks; the best model showed a 15% average exact match improvement. All data, code, and models have been open-sourced to encourage reproducibility and further research in the medical AI community.
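The summary does not say how the cross-validation agreement figure was computed. As a rough illustration only, the sketch below assumes that agreement means the share of questions on which two models give the same answer after light text normalization. The function names `normalize` and `agreement_rate` and the sample answers are hypothetical and are not taken from the SynthVision code.

```python
import re
import string


def normalize(answer: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed rule)."""
    no_punct = answer.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", no_punct).strip()


def agreement_rate(answers_a: list[str], answers_b: list[str]) -> float:
    """Fraction of paired answers that match exactly after normalization."""
    if len(answers_a) != len(answers_b):
        raise ValueError("both answer lists must have the same length")
    if not answers_a:
        return 0.0
    matches = sum(normalize(a) == normalize(b) for a, b in zip(answers_a, answers_b))
    return matches / len(answers_a)


# Made-up answers from two hypothetical annotator models for the same four questions.
model_a = ["Pneumonia.", "left lung", "Yes", "fracture"]
model_b = ["pneumonia", "Left  lung", "no", "Fracture"]
print(agreement_rate(model_a, model_b))  # 0.75
```

The same comparison, applied to a model's predictions and reference answers, gives a simple exact match score. Real pipelines often use looser matching or a judge model, so this should be read as one possible definition rather than the project's method.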

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
AI Model Fine-tuning	12	906	165	54	-16%
LLM	2	6,078	960	218	+18%
Real-time	2	6,457	1,307	242	+28%
