BERTs that chat: turn any BERT into a chatbot with dLLM
Blog post from Hugging Face
Researchers Zhanhui Zhou and Lingjie Chen show that a standard BERT model can be turned into a conversational chatbot using a small amount of open-source instruction-following data and the diffusion framework. The adaptation needs only supervised finetuning, not extensive generative pretraining. They introduce dLLM, an open-source framework that standardizes the training, inference, and evaluation of diffusion language models (DLMs), addressing current barriers such as the lack of a unified framework and high computational costs. The framework supports easy reproduction of experiments and includes open implementations of previously unavailable algorithms. Their ModernBERT-Chat model, finetuned on instruction-response pairs, performs close to Qwen1.5-0.5B across several benchmarks, suggesting that BERT's original masked language modeling pretraining already imparts sufficient knowledge for diffusion-based generation. The researchers encourage community contributions to extend dLLM, aiming to make it a comprehensive platform for DLM research.
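The summary does not spell out the training objective, so the following is only a rough sketch of how masked-diffusion finetuning on an instruction-response pair is commonly set up: the prompt stays visible, each response token is masked with some probability `t`, and the model would be trained to recover the masked tokens. The function name `mask_response`, the `MASK_ID` value, and the token ids are hypothetical and are not taken from dLLM.

```python
import random

MASK_ID = -1  # hypothetical mask token id, for illustration only


def mask_response(prompt_ids, response_ids, t, rng):
    """Mask each response token independently with probability t.

    The prompt is left untouched, which mirrors the usual supervised
    finetuning setup where only the response is noised (an assumption here).
    Returns the noised sequence and the (position, original token) pairs
    that a model would be trained to predict.
    """
    noisy = list(prompt_ids)
    targets = []
    for offset, tok in enumerate(response_ids):
        if rng.random() < t:
            targets.append((len(prompt_ids) + offset, tok))
            noisy.append(MASK_ID)
        else:
            noisy.append(tok)
    return noisy, targets


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy ids: two prompt tokens followed by three response tokens.
    noisy, targets = mask_response([5, 6], [7, 8, 9], t=0.5, rng=rng)
    print(noisy, targets)
```

At `t = 1.0` every response token is masked, which corresponds to the starting point of generation; at inference a diffusion model would fill in those masks over several steps. Because masked-token recovery is the same task BERT was pretrained on, a sketch like this suggests why supervised finetuning alone could be enough, as the post argues.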
| Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM |
|---|---|---|---|---|---|
| AI Model Fine-tuning | 1 | 558 | 140 | 61 | -27% |
| LLM | 1 | 5,556 | 752 | 184 | +14% |