
🧠 I trained my own French LLM from scratch — alone, with a 1080 Ti, and the power went out ⚡🇫🇷

Blog post from HuggingFace

Post Details
Company: HuggingFace
Date Published: -
Author: vloplok
Word Count: 2,017
Language: -
Hacker News Points: -
Summary

A 20-year-old developer built a French language model from scratch, alone, on a single GTX 1080 Ti. The goal was to understand every step of the process rather than fine-tune an existing model. The custom pipeline covered data collection, cleaning, tokenization, and training, with a model architecture inspired by LLaMA rather than GPT-2. The dataset was an AI-rewritten version of French Wikipedia, chosen to keep the writing style uniform, and the resulting model has 15 million parameters and is optimized for French. Training was structured in three phases (denoising, curriculum learning, and contrastive learning), but a power outage interrupted the run at the 10th epoch. Despite the technical challenges and the limited hardware, the project showed that one person can develop a language model independently. The author plans to expand the dataset and continue training on more robust cloud infrastructure to improve the model's understanding and its usefulness across different domains.
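
The power outage mentioned in the summary is the kind of failure that per-epoch checkpointing is meant to absorb. The following is a minimal sketch of that idea, not the author's code: the function names (`save_checkpoint`, `load_checkpoint`, `train`), the JSON state format, and the single-number "weights" are all hypothetical stand-ins chosen so the example runs with the standard library alone. It writes each checkpoint to a temporary file and renames it into place, so an interruption in the middle of a write cannot corrupt the last good checkpoint.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: a crash mid-write leaves the old file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())      # push bytes to disk before the rename
        os.replace(tmp_path, path)    # rename over the previous checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"epoch": 0, "weights": [0.0]}   # hypothetical initial state
    with open(path) as f:
        return json.load(f)


def train(path, total_epochs, fail_at=None):
    """Toy loop: 'training' only adds 1.0 to a single weight per epoch."""
    state = load_checkpoint(path)
    for epoch in range(state["epoch"], total_epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError("simulated power outage")
        state["weights"][0] += 1.0    # stand-in for one real training epoch
        state["epoch"] = epoch + 1    # first epoch to run after a restart
        save_checkpoint(path, state)
    return state
```

Calling `train(path, total_epochs=12, fail_at=10)` raises after ten completed epochs, and a second call to `train(path, total_epochs=12)` picks up from the saved state and runs only the remaining two. In a real PyTorch run, the same pattern would typically wrap `torch.save` of the model and optimizer `state_dict()` values instead of a JSON file; whether the original project used PyTorch is an assumption here.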