The emergence of lightweight yet powerful open-source AI models such as Microsoft's Phi-4 and Meta's Llama 3.2 has transformed the AI landscape, making it far more accessible to developers. Open-source AI models are models whose code, weights, and architecture are publicly available for anyone to view, use, modify, and distribute. They are cost-effective, highly customizable, and give you complete control over the flow of your data. Even so, identifying the best open-source embedding or generative model for your use case remains challenging: evaluating candidates demands computational resources, technical expertise, and time.

Ollama and pgai simplify this process. Together, they let you experiment with different models and quickly implement a retrieval-augmented generation (RAG) system using Microsoft's Phi-4 directly in PostgreSQL. Phi-4 excels at reasoning tasks, especially mathematics, outperforming even larger models such as Gemini Pro 1.5. It is designed for research on large language models and for use in general AI systems, with a primary focus on English.

Ollama provides a unified interface for running embedding models and LLMs locally, abstracting away API differences and simplifying comparison and experimentation. Pgai integrates embedding generation and response workflows directly into the PostgreSQL database, eliminating the need for external pipelines and enabling seamless interaction with your data. The pgai Vectorizer automates embedding generation and keeps embeddings synchronized with the source data using a single SQL command, saving both time and compute.

At query time, a vector representation of the user's question is used to retrieve the most relevant chunks, which are then passed to the generative model to produce a response. Together, Ollama, pgai, Phi-4, and PostgreSQL form a stack that lets developers innovate with ease and speed.
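The retrieve-then-generate step can be sketched in plain Python. This is a toy illustration only: it uses a bag-of-words "embedding" and an in-memory cosine search so the example runs anywhere, whereas in the actual stack the embeddings come from an Ollama-served model and the similarity search runs inside PostgreSQL via pgai. All function names below are illustrative, not part of any library's API.

```python
import math
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def embed(text: str, vocab: dict[str, int]) -> list[float]:
    # Toy bag-of-words embedding over a shared vocabulary; a stand-in
    # for the vectors an Ollama embedding model would produce.
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by cosine similarity to the query vector; in the real
    # stack, this nearest-neighbor search happens inside PostgreSQL.
    words = sorted({w for text in [query, *chunks] for w in tokenize(text)})
    vocab = {w: i for i, w in enumerate(words)}
    q = embed(query, vocab)
    score = lambda c: sum(a * b for a, b in zip(q, embed(c, vocab)))
    return sorted(chunks, key=score, reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # The retrieved chunks become grounding context for the generative
    # model (Phi-4, served by Ollama in the stack described above).
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

chunks = [
    "pgai integrates embedding workflows into PostgreSQL.",
    "Phi-4 excels at mathematical reasoning tasks.",
    "Llama 3.2 is a lightweight open-source model from Meta.",
]
query = "Which model is good at mathematical reasoning?"
print(build_prompt(query, retrieve(query, chunks)))
```

The key design point survives the simplification: retrieval and generation are decoupled, so the same prompt-building step works whether the context comes from this toy search or from a pgvector similarity query inside PostgreSQL.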