
Concrete ML v1.8: Towards Decentralized Private LLAMA Fine-Tuning

Blog post from Zama

Post Details
Company: Zama
Author: Andrei Stoian
Word Count: 449
Language: English
Summary

Concrete ML v1.8 makes privacy-preserving fine-tuning of Large Language Models (LLMs) faster and easier to use, pairing an optimized Fully Homomorphic Encryption (FHE) backend with a new low-rank adaptation (LoRA) fine-tuning API for hybrid fine-tuning. The release adds Python 3.12 support and a streamlined fine-tuning API, the LoraTrainer, inspired by Hugging Face's PEFT, which can leverage GPUs to accelerate model evaluation and development. The optimized FHE backend speeds up computation by running on GPUs, implementing multiplication of encrypted matrices by clear (unencrypted) matrices, and compressing ciphertexts efficiently, so that encrypted data is only about four times larger than its unencrypted equivalent. With these improvements, fine-tuning a LLAMA 8B model on 100,000 tokens takes around 70 hours at an estimated cost of $500 on a decentralized network of 100 consumer-grade GPUs, and further optimizations are expected to reduce both cost and latency. Concrete ML v1.8 thus moves toward secure, scalable, and decentralized AI solutions, with future updates promising continued improvements.
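To make the LoRA workflow concrete, below is a minimal sketch using Hugging Face's PEFT library, whose API the new LoraTrainer is described as being inspired by. LoRA freezes the pretrained weight matrix W and trains only a low-rank update, W' = W + (alpha/r) * B * A, where B is d x r, A is r x k, and the rank r is much smaller than the weight dimensions, so only a small fraction of parameters is ever updated. The model name, rank, and target modules here are illustrative assumptions, not values from the post.

```python
# A minimal LoRA setup with Hugging Face PEFT, the library whose API the
# post says inspired Concrete ML's LoraTrainer. The model id, rank, and
# target modules below are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# LoRA freezes the base weights W and trains a low-rank update (alpha/r) * B @ A.
lora_config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor alpha
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of weights

# In Concrete ML's hybrid fine-tuning, a PEFT model like this would be handed
# to its LoraTrainer so that the frozen base-model computations can run under
# FHE on remote GPUs; the exact Concrete ML calls are not shown here.
```

This split is also consistent with why encrypted-matrix-by-clear-matrix multiplication is the key FHE primitive the backend optimizes: in hybrid fine-tuning, the frozen base-model layers multiply encrypted activations by clear server-side weights, while the small trainable LoRA matrices can remain in the clear on the client.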