Announcing Refuel LLM-2
Blog post from Refuel
RefuelLLM-2 and RefuelLLM-2-small are new large language models purpose-built for data labeling, enrichment, and cleaning, and they outperform state-of-the-art models such as GPT-4-Turbo and Claude-3-Opus across a range of these tasks. Trained on over 2,750 datasets, the models deliver significant quality improvements even on long input contexts and produce better-calibrated confidence scores. RefuelLLM-2 is built on a Mixtral-8x7B base and RefuelLLM-2-small on a Llama3-8B base; both went through a two-phase training process to strengthen performance on short- and long-context tasks. The training data spans tasks across many domains and includes non-public datasets to help the models generalize to real-world settings. Both models are accessible via Refuel Cloud and open-sourced on Hugging Face, with RefuelLLM-2-small released under a CC BY-NC 4.0 license. Development drew on several open-source projects and on substantial computational resources provided by partners such as Databricks and GCP.
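Since the open-sourced weights are hosted on Hugging Face, a minimal sketch of trying the smaller model on a labeling-style prompt with the standard transformers API is shown below. The repo id, prompt wording, and category names are illustrative assumptions, not taken from the announcement; check Refuel's Hugging Face page for the exact model name and recommended prompt format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id for RefuelLLM-2-small; verify the exact name on Hugging Face.
model_id = "refuelai/Llama-3-Refueled"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduce memory footprint on GPU
    device_map="auto",
)

# A simple data-labeling prompt: classify a support ticket into one of a few categories.
prompt = (
    "You are a data labeling assistant.\n"
    "Classify the following customer message into one of: Billing, Bug, Feature Request.\n\n"
    "Message: The app crashes every time I open the settings page.\n"
    "Label:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)

# Decode only the newly generated tokens (the predicted label).
label = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(label.strip())
```

For production-scale labeling or to obtain the calibrated confidence scores mentioned above, the hosted Refuel Cloud offering is the path the post points to; the local snippet is only meant to show that the open weights load like any other causal LM.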