
Training low-bit ternary models with Axolotl

Blog post from HuggingFace

Post Details
Company: HuggingFace
Author: Wing Lian
Word Count: 1,151
Summary

The collaboration between the Axolotl team and Younes Belkada of the FalconLLM team aims to make low-bit ternary (1.58-bit) models more accessible to the community by integrating training support for TII's Falcon BitNet series into Axolotl. The approach trains models that are resilient to ternary quantization, in which each weight is represented as -1, 0, or 1, by incorporating the quantization error during training; this saves memory and improves efficiency on edge devices. The ternary weights achieve substantial memory reductions over their bfloat16 counterparts, and the trained models can be fine-tuned and adapted using Axolotl's tooling. Despite recent advances, full GPU support for BitNet models in serving frameworks remains limited, though progress has been made on CPU and Apple MLX backends. The post also highlights potential future exploration of on-policy reinforcement learning methods for BitNet models and the need for broader GPU support in popular serving frameworks.
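The ternary quantization step described above can be sketched in a few lines. This is a minimal NumPy illustration of BitNet b1.58-style "absmean" rounding; the function name, the epsilon guard, and the exact scaling rule are assumptions for illustration, not Axolotl's actual implementation:

```python
import math
import numpy as np

def ternary_quantize(w, eps=1e-8):
    """Round weights to {-1, 0, 1} using a per-tensor 'absmean' scale,
    in the spirit of the BitNet b1.58 recipe (a sketch, not Axolotl's
    exact code)."""
    scale = max(np.abs(w).mean(), eps)   # mean absolute weight as the scale
    q = np.clip(np.round(w / scale), -1, 1)
    return q, scale

w = np.array([0.9, -0.05, 0.4, -1.2])
q, scale = ternary_quantize(w)
# q contains only values from {-1, 0, 1}; w is approximated by q * scale

# Why "1.58-bit": a ternary weight carries log2(3) ≈ 1.58 bits of
# information, versus 16 bits for bfloat16, roughly a 10x theoretical
# reduction (packed formats typically store 2 bits per weight, i.e. 8x).
reduction = 16 / math.log2(3)
```

During quantization-aware training, the forward pass uses the quantized weights while gradients flow to the full-precision copy (a straight-through estimator), which is what makes the model resilient to the ternary format at inference time.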