DeepFabric: Generate, Train and Evaluate with Datasets curated for Model Behavior Training.

Blog post from HuggingFace

Post Details
Author: Luke Hinds
Word Count: 3,284
Summary

DeepFabric is an open-source framework for generating the datasets used to train and evaluate language models on complex tool-calling tasks. It targets common pitfalls such as incorrect tool usage and the difficulty of producing good training data. The framework generates diverse, domain-specific tool-call samples that are structurally valid and paired with contextually relevant reasoning traces, so models learn both the mechanics of calling a tool and the decision-making behind it. DeepFabric constructs topic trees or graphs to guide generation, which keeps diversity high and duplication low without straying from the intended domain. It supports single-turn and multi-turn conversation structures, with reasoning styles such as freetext and agent reasoning that capture the thought process behind tool selection. Tools can be customized through YAML definitions so that generated samples match exact API specifications. DeepFabric integrates with the HuggingFace ecosystem, covers the pipeline from dataset generation to model evaluation, and supports a variety of training frameworks.
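
To make "structurally valid" concrete, here is a minimal Python sketch of checking a generated tool call against a tool specification. It does not use DeepFabric's API. The `WEATHER_TOOL` dictionary, its field names, and the `validate_tool_call` helper are all hypothetical, chosen only to resemble what a YAML tool definition might look like once loaded.

```python
from typing import Any

# Hypothetical tool spec (an assumption for illustration, not DeepFabric's schema).
# It mirrors the kind of structure a YAML tool definition could load into.
WEATHER_TOOL = {
    "name": "get_weather",
    "parameters": {
        "city": {"type": "string", "required": True},
        "units": {"type": "string", "required": False},
    },
}

# Map the spec's type names to Python types for a simple isinstance check.
PY_TYPES = {"string": str, "number": (int, float)}


def validate_tool_call(call: dict[str, Any], spec: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call matches the spec."""
    if call.get("name") != spec["name"]:
        return [f"unknown tool: {call.get('name')!r}"]

    errors = []
    args = call.get("arguments", {})
    params = spec["parameters"]

    # Every required parameter must be present.
    for key, rule in params.items():
        if rule.get("required") and key not in args:
            errors.append(f"missing required argument: {key}")

    # Every supplied argument must be declared and have the declared type.
    for key, value in args.items():
        if key not in params:
            errors.append(f"unexpected argument: {key}")
        elif not isinstance(value, PY_TYPES[params[key]["type"]]):
            errors.append(f"wrong type for {key}: expected {params[key]['type']}")

    return errors


sample = {"name": "get_weather", "arguments": {"city": "Paris"}}
print(validate_tool_call(sample, WEATHER_TOOL))
```

A filter like this could be run over generated samples before training, discarding any whose tool calls do not match the declared parameters. How DeepFabric itself enforces validity is not described in this summary.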