
BiomedBERT Small: Medical models at 22.7M parameters

Blog post from HuggingFace

Post Details

Company: HuggingFace
Date Published: -
Author: David Mezzetti
Word Count: 912
Language: -
Hacker News Points: -
Summary

The BiomedBERT Small series introduces compact medical models with 22.7 million parameters, positioned between the larger 110M BiomedBERT Base and the tiny BiomedBERT Hash models. Despite their smaller size, these models deliver strong speed and accuracy, notably outperforming the original PubMedBERT Embeddings model while using only 20% of its parameters.

The series includes specialized variants such as biomedbert-small-embeddings and biomedbert-small-colbert, fine-tuned with techniques such as distillation from larger models. Evaluations on datasets including PubMed QA and PubMed Summary show these models are competitive with larger counterparts and with commonly used small models like all-MiniLM-L6-v2. Fine-tuning has also significantly improved the original PubMedBERT Embeddings model's performance, making the series a valuable asset for medical literature tasks.

Developed by NeuML and released under an Apache 2.0 license, these models demonstrate how efficient, accurate models can be built for a range of medical data analysis applications.
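At their core, embeddings models like biomedbert-small-embeddings map text to dense vectors that are compared by cosine similarity, so a relevant document scores higher against a query than an unrelated one. A minimal sketch of that comparison step, using made-up vectors as stand-ins for real model output (the actual model produces higher-dimensional embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors: dot product over norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional embeddings; a real model would encode the
# query and documents into vectors with hundreds of dimensions.
query = [0.9, 0.1, 0.0, 0.4]          # e.g. "treatment for hypertension"
doc_relevant = [0.8, 0.2, 0.1, 0.5]   # a medically related abstract
doc_unrelated = [0.0, 0.9, 0.8, 0.0]  # an off-topic abstract

# The relevant document should score higher against the query
print(cosine_similarity(query, doc_relevant))    # high (close to 1)
print(cosine_similarity(query, doc_unrelated))   # low (close to 0)
```

Retrieval with such a model amounts to encoding every document once, then ranking documents by this score against an encoded query.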