Company:
Date Published:
Author: Raza Habib
Word count: 747
Language: English
Hacker News points: None

Summary

Humanloop is collaborating with Carper AI, a Stability AI company, to develop a 70-billion-parameter open-source large language model (LLM) trained with Reinforcement Learning from Human Feedback (RLHF) to improve safety and usability in AI systems. The initiative aims to democratize the instruction-tuning of LLMs, adapting them to specific tasks through direct human feedback so that interacting with a model feels as natural as instructing a colleague. The project also involves partnerships with Scale and Hugging Face, with the latter hosting the final model to make it widely accessible. Although conventional LLMs excel at tasks like code generation and writing assistance, their reliance on next-word prediction often yields inaccurate outputs and leaves room for misuse. Training with RLHF addresses these issues by aligning model behavior more closely with human feedback, reducing risks such as misinformation and social bias while improving the models' practical utility. As one of the first open-source releases of an instruction-tuned model at this scale, the project is expected to drive extensive research and innovation, paving the way for new applications and companies to build on state-of-the-art AI systems.
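The RLHF approach mentioned above hinges on a reward model trained from pairwise human preferences, which is then used to steer the language model. A minimal, purely illustrative sketch of that reward-modeling step follows; it uses a toy linear model and made-up feature vectors (real systems such as Carper AI's use a neural reward model over transformer representations, and the feature names here are hypothetical):

```python
import math

# Toy sketch of the reward-modeling step in RLHF (illustrative only).
# Each "response" is a small feature vector; each preference pair records
# that humans preferred response `a` over response `b`.

def reward(w, x):
    # Linear reward model: r(x) = w . x
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(prefs, dim, lr=0.1, epochs=200):
    # Bradley-Terry pairwise loss: P(a preferred over b) = sigmoid(r(a) - r(b)).
    # We minimise -log sigmoid(margin) by gradient descent.
    w = [0.0] * dim
    for _ in range(epochs):
        for a, b in prefs:  # a is the human-preferred response
            margin = reward(w, a) - reward(w, b)
            p = 1.0 / (1.0 + math.exp(-margin))
            grad_scale = 1.0 - p  # derivative of -log sigmoid(margin)
            for i in range(dim):
                w[i] += lr * grad_scale * (a[i] - b[i])
    return w

# Hypothetical 2-d features: [helpfulness signal, verbosity]
prefs = [
    ([1.0, 0.2], [0.1, 0.9]),  # raters preferred the helpful, terse answer
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.8, 0.3], [0.3, 1.0]),
]
w = train_reward_model(prefs, dim=2)

# The learned reward now ranks a new helpful response above a verbose one,
# mimicking how an RLHF reward model scores candidate generations.
candidates = {"helpful": [0.95, 0.2], "verbose": [0.15, 0.95]}
best = max(candidates, key=lambda k: reward(w, candidates[k]))
```

In full RLHF, this learned reward then drives a reinforcement-learning step (commonly PPO) that fine-tunes the language model toward higher-scoring generations, which is what aligns outputs with human intent rather than raw next-word likelihood.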