
Concrete ML v1.8: Towards Decentralized Private LLAMA Fine-Tuning

Blog post from Zama

Post Details
Company: Zama
Author: Andrei Stoian
Word Count: 449
Language: English
Summary

Concrete ML v1.8 makes privacy-preserving fine-tuning of Large Language Models (LLMs) faster and easier to use, pairing an optimized Fully Homomorphic Encryption (FHE) backend with a new low-rank adaptation (LoRA) fine-tuning API for hybrid fine-tuning. The release adds Python 3.12 support and a streamlined fine-tuning API, the LoraTrainer, inspired by Hugging Face's PEFT, which can leverage GPUs to accelerate model evaluation and development. The optimized FHE backend speeds up computation by running on GPUs, implementing multiplication of encrypted matrices by clear (unencrypted) matrices, and compressing ciphertexts efficiently, so that encrypted data is only about four times larger than its unencrypted equivalent. With these improvements, fine-tuning a LLAMA 8B model on 100,000 tokens takes around 70 hours at an estimated cost of $500 on a decentralized network of 100 consumer-grade GPUs, and further optimizations are expected to reduce both cost and latency. Concrete ML v1.8 thus moves toward secure, scalable, and decentralized AI solutions, with future updates promising continued improvements.
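To make the LoRA workflow concrete, below is a minimal sketch using Hugging Face's PEFT library, whose API the new LoraTrainer is described as being inspired by. LoRA freezes the pretrained weight matrix W and trains only a low-rank update, W' = W + (alpha/r) * B * A, where B is d x r, A is r x k, and the rank r is much smaller than the weight dimensions, so only a small fraction of parameters is ever updated. The model name, rank, and target modules here are illustrative assumptions, not values from the post.

```python
# A minimal LoRA setup with Hugging Face PEFT, the library whose API the
# post says inspired Concrete ML's LoraTrainer. The model id, rank, and
# target modules below are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# LoRA freezes the base weights W and trains a low-rank update (alpha/r) * B @ A.
lora_config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor alpha
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of weights

# In Concrete ML's hybrid fine-tuning, a PEFT model like this would be handed
# to its LoraTrainer so that the frozen base-model computations can run under
# FHE on remote GPUs; the exact Concrete ML calls are not shown here.
```

This split is also consistent with why encrypted-matrix-by-clear-matrix multiplication is the key FHE primitive the backend optimizes: in hybrid fine-tuning, the frozen base-model layers multiply encrypted activations by clear server-side weights, while the small trainable LoRA matrices can remain in the clear on the client.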