Large Language Models (LLMs) are increasingly applied to specialized domains such as medicine or law by injecting domain-specific knowledge through techniques like Retrieval-Augmented Generation (RAG) or fine-tuning. This blog post introduces and evaluates Retrieval Augmented Fine-Tuning (RAFT), a fine-tuning method that leverages generated Chain of Thought (CoT) responses to improve reasoning and answer generation in specialized domains. RAFT uses a large model to generate high-quality CoT answers and then fine-tunes smaller, specialized models on those answers, bridging the gap between general-purpose LLMs and the domain knowledge required for specific fields. Experiments with models such as Llama2-7B and Llama3-8B show significant performance gains, with RAFT consistently outperforming RAG methods. The approach is also efficient, requiring less data and compute, which makes it feasible in compute-constrained environments. RAFT's cost-effectiveness and scalability suggest broader applicability, and ongoing evaluations are exploring its performance on newer models and possible deployment on platforms like Clarifai.
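To make the recipe concrete, here is a minimal sketch of the data-generation step described above: a large "teacher" model produces CoT answers grounded in domain documents, and the resulting prompt/completion pairs are saved for supervised fine-tuning of a smaller model. It assumes an OpenAI-compatible API for the teacher; the prompt wording, file names, and toy examples are illustrative placeholders, not the exact setup used in the experiments.

```python
# Sketch of RAFT-style training-data generation (illustrative only).
# Assumes the `openai` Python package and an OPENAI_API_KEY in the environment.
import json
from openai import OpenAI

client = OpenAI()

COT_PROMPT = (
    "Answer the question using only the context below. "
    "Think step by step, quote the relevant passage, and end with "
    "'Final answer: <answer>'.\n\nContext:\n{context}\n\nQuestion: {question}"
)

def generate_cot_answer(question: str, context: str, teacher: str = "gpt-4o") -> str:
    """Ask a large 'teacher' model for a chain-of-thought answer grounded in the context."""
    response = client.chat.completions.create(
        model=teacher,
        messages=[{"role": "user", "content": COT_PROMPT.format(context=context, question=question)}],
        temperature=0.2,
    )
    return response.choices[0].message.content

def build_raft_example(question: str, oracle_doc: str, distractor_docs: list[str]) -> dict:
    """Mix the relevant (oracle) document with distractors so the student model
    learns to pick out the right evidence from noisy retrieved context."""
    context = "\n\n".join([oracle_doc] + distractor_docs)
    return {
        # The student (e.g. Llama2-7B or Llama3-8B) is later fine-tuned to map
        # (question + mixed context) -> CoT answer.
        "prompt": COT_PROMPT.format(context=context, question=question),
        "completion": generate_cot_answer(question, context),
    }

if __name__ == "__main__":
    # Hypothetical toy sample; in practice these come from the domain corpus.
    samples = [
        {
            "question": "What adult dose does the label for drug X recommend?",
            "oracle": "The label for drug X recommends 50 mg twice daily for adults.",
            "distractors": ["Drug Y is taken once weekly.", "Drug Z is contraindicated in children."],
        }
    ]
    with open("raft_train.jsonl", "w") as f:
        for s in samples:
            f.write(json.dumps(build_raft_example(s["question"], s["oracle"], s["distractors"])) + "\n")
```

The resulting JSONL file can then be fed to any standard supervised fine-tuning pipeline (for example, Hugging Face TRL's SFTTrainer) to train the smaller, domain-specialized model.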