What You'll Need to Run Falcon 180B in a Pod
On September 6th, the Technology Innovation Institute released Falcon-180B, the largest open-source large language model to date. It surpasses the previous record held by BLOOM-176B and outperforms Llama-2 70B on the Hugging Face Open LLM Leaderboard. In tasks such as creative writing, Falcon-180B maintains coherence even under challenging conditions, avoiding common failure modes like losing the thread of a scene or drifting into dull, repetitive output.

The model is distributed through a gated Hugging Face repository, so you must accept the license agreement before downloading the weights. Its hardware requirements are substantial: running it at full precision calls for roughly 400 GB of VRAM, which suggests at least five 80 GB A100 GPUs for effective use.

While those computational demands can be prohibitive, lighter configurations are available. Loading the model in 4-bit mode or using quantizations in the GGUF format reduces the memory requirement considerably, albeit at the cost of slower text generation; sketches of both approaches follow below.

Falcon-180B's release marks a significant milestone for the open-source community, offering a powerful model that approaches the scale of proprietary offerings from companies like OpenAI.
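If you want to try the reduced-memory route, here is a minimal sketch of loading the model in 4-bit with Hugging Face transformers and bitsandbytes. It assumes you have already accepted the license on the tiiuae/falcon-180B repo, authenticated with `huggingface-cli login`, and have enough combined GPU memory (on the order of 100 GB) for the 4-bit weights; exact figures will vary with your setup and library versions.

```python
# Minimal sketch: 4-bit loading of Falcon-180B with transformers + bitsandbytes.
# Assumes the gated license has been accepted and `huggingface-cli login` was run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/falcon-180B"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed and quality
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # shard the 4-bit weights across all visible GPUs
)

prompt = "Write the opening line of a story set on a generation ship."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Expect generation to be noticeably slower than full-precision inference on a larger cluster; 4-bit trades throughput for a much smaller memory footprint.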
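For the GGUF route, community quantizations can be run with llama.cpp. Below is a hypothetical sketch using the llama-cpp-python bindings; the file path is illustrative (real Falcon-180B GGUF files are large and typically split into parts), so substitute your own download and adjust the offload settings to your hardware.

```python
# Hypothetical sketch: running a GGUF quantization of Falcon-180B via llama-cpp-python.
# The model_path below is illustrative; point it at your downloaded GGUF file.
from llama_cpp import Llama

llm = Llama(
    model_path="./falcon-180b-q4_k_m.gguf",  # assumed local quantized file
    n_gpu_layers=-1,  # offload as many layers as VRAM allows (-1 = all)
    n_ctx=2048,       # context window; raise it if you have headroom
)

result = llm(
    "Write the opening line of a story set on a generation ship.",
    max_tokens=128,
)
print(result["choices"][0]["text"])
```

Layers that don't fit on the GPU run on the CPU instead, so generation slows down, but the model becomes runnable on far less VRAM than the full-precision checkpoint demands.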