Turn Your LLM into a Classifier for $2
Blog post from Fireworks AI
Large language models (LLMs), traditionally used for free-form text generation, can be effectively adapted for classification tasks by leveraging their inherent ability to model token probabilities. This adaptation does not require altering the model's architecture; instead, it maps each class to specific tokens and uses the model's next-token probabilities as class probabilities, which is useful in applications like safety moderation, routing, and intent classification.

This method is cost-effective and maintains compatibility with standard fine-tuning and inference APIs, making it suitable for small to medium label sets. Fine-tuning naturally calibrates these probabilities to reflect real-world likelihoods, eliminating the need for explicit renormalization.

Empirical validation on the AG News dataset demonstrated that fine-tuning on a platform like Fireworks can achieve accurate and well-calibrated class probabilities at a low cost, confirming that LLMs can be adapted for classification without significant modifications.
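The core mechanism can be sketched in a few lines: read off the model's log-probabilities for the first generated token, keep only the tokens that correspond to class labels, and normalize. This is a minimal illustration, not the Fireworks API itself; the logprob values and the helper function below are hypothetical, and the label set is the four AG News classes mentioned above. As noted, a fine-tuned model already puts nearly all its mass on the label tokens, so the softmax renormalization here mainly matters for an un-tuned model.

```python
import math

# AG News label set; each class is mapped to a single token the model emits.
LABELS = ("World", "Sports", "Business", "Tech")

def class_probabilities(token_logprobs, labels=LABELS):
    """Map next-token log-probs onto a distribution over classes.

    `token_logprobs` is a dict of token -> log-probability for the first
    generated token (as returned by a typical inference API's logprobs
    field). Tokens outside the label set are discarded; the remaining
    mass is renormalized with a numerically stable softmax.
    """
    scores = {lab: token_logprobs.get(lab, float("-inf")) for lab in labels}
    z = max(scores.values())                      # stabilize the exponentials
    exp = {k: math.exp(v - z) for k, v in scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}

# Illustrative logprobs for the first token of the model's reply.
logprobs = {"Sports": -0.05, "World": -3.2, "Business": -4.1, "Tech": -5.0}
probs = class_probabilities(logprobs)
predicted = max(probs, key=probs.get)             # "Sports"
```

Because the output is a proper distribution, the same values can feed a confidence threshold (e.g. route to a human reviewer when the top probability is low) rather than only an argmax decision.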