Robert Nishihara and Philipp Moritz discuss Tinker, a new API from Thinking Machines for training large language models (LLMs), and show how to combine it with Ray to build a text-to-SQL model. The workflow has two main stages: data generation and model fine-tuning. In the data-generation stage, Ray Serve and vLLM deploy Qwen-8B to generate SQL queries at scale, while Ray Core tasks execute the candidate queries in parallel against a SQL environment and filter out those that fail. The queries that succeed become the training set for the fine-tuning stage, where Tinker provides granular control over the training loop. The authors then evaluate the fine-tuned model, extracting the LoRA weights and merging them into the base model to work around a compatibility issue with vLLM. The article closes with a detailed code appendix covering the full implementation.
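The evaluate-and-filter step can be sketched in miniature with Python's built-in `sqlite3` module. In the article this runs as parallel Ray Core tasks (e.g. one remote task per candidate query); the schema, queries, and helper name below are illustrative assumptions, not the authors' exact code:

```python
# Hedged sketch of the query-filtering step: execute each candidate SQL
# query against a sample database and keep only the ones that run
# successfully and return the expected rows. In the article this check is
# fanned out as parallel Ray Core tasks (e.g. a @ray.remote function).
import sqlite3


def evaluate_query(sql: str, setup_sql: str, expected: list) -> bool:
    """Return True if `sql` executes and its result matches `expected`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)          # build the toy schema/data
        rows = conn.execute(sql).fetchall()
        return rows == expected
    except sqlite3.Error:
        return False                           # syntax errors, bad tables, etc.
    finally:
        conn.close()


setup = (
    "CREATE TABLE users (id INTEGER, name TEXT);"
    "INSERT INTO users VALUES (1, 'ada');"
)
candidates = [
    "SELECT name FROM users WHERE id = 1",     # valid: kept
    "SELECT nmae FROM users",                  # typo: filtered out
]
kept = [q for q in candidates if evaluate_query(q, setup, [("ada",)])]
```

Filtering on execution success (and, where a reference answer exists, on result equality) is what turns raw model samples into a clean training set for the next stage.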
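Bridging the two stages, the filtered (question, SQL) pairs have to be serialized into training records. A minimal sketch follows; the prompt template, field names, and output path are illustrative assumptions, and Tinker's actual ingestion format may differ:

```python
# Hedged sketch: convert the filtered (question, SQL) pairs from stage one
# into prompt/completion records for supervised fine-tuning. The JSONL
# layout and the "train.jsonl" path are hypothetical choices for
# illustration, not Tinker's documented format.
import json


def to_example(question: str, sql: str) -> dict:
    """Build one supervised training record from a verified pair."""
    return {
        "prompt": f"Question: {question}\nSQL:",
        "completion": f" {sql}",
    }


pairs = [("How many users are there?", "SELECT COUNT(*) FROM users")]
with open("train.jsonl", "w") as f:   # hypothetical output file
    for q, s in pairs:
        f.write(json.dumps(to_example(q, s)) + "\n")
```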
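The LoRA-merging step the authors use for vLLM compatibility boils down to folding a low-rank update into each base weight matrix: W' = W + (alpha/r) * B A. The toy sketch below shows only that arithmetic with plain-list matrices; real checkpoints store per-layer tensors (e.g. in safetensors files), so this is a stand-in for that machinery, not the authors' merge script:

```python
# Illustrative arithmetic of merging LoRA adapter weights into a base
# weight matrix before serving: W' = W + (alpha/rank) * (B @ A).
# Plain lists of rows stand in for the real per-layer tensors.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Fold the scaled low-rank update (alpha/rank) * B @ A into `base`."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(wrow, drow)]
            for wrow, drow in zip(base, delta)]


rank = 2
base = [[0.0] * 3 for _ in range(2)]       # 2x3 base weights (zeros for clarity)
A = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]     # LoRA A: rank x d_in  (2x3)
B = [[1.0, 1.0], [1.0, 1.0]]               # LoRA B: d_out x rank (2x2)
merged = merge_lora(base, A, B, alpha=4.0, rank=rank)
# Every entry of B @ A equals rank (= 2); scaled by alpha/rank (= 2) -> 4.0
```

After merging, the combined weights load like an ordinary dense checkpoint, which is what sidesteps the adapter-compatibility issue with vLLM described in the article.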