Large language models (LLMs) can be conceptualized as the kernel process of a new kind of operating system, with the context window acting as RAM: a limited working memory into which information is loaded from external sources before an output is generated. This pattern, known as retrieval augmented generation (RAG), is central to LLM application development because it offers a simpler alternative to fine-tuning for tasks that require factual recall.

The landscape of RAG methods is evolving rapidly, which has caused some confusion among users and prompted efforts to categorize the approaches and guide their use. Key themes include query transformations that rewrite or expand a question to make retrieval more robust, dynamic routing of queries across diverse data stores, query construction such as text-to-SQL or text-to-Cypher for structured data, and indexing strategies such as tuning chunk size and deciding how documents are embedded. Because the context window is limited, post-processing of retrieved documents is equally important; methods such as re-ranking and classification improve the diversity and relevance of what is ultimately passed to the model. Illustrative sketches of several of these steps follow.

Future plans involve applying open-source models to specific RAG tasks and developing benchmarks on public datasets to evaluate these approaches.
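To make query transformation concrete, here is a minimal multi-query expansion sketch. It assumes only a generic `llm` callable mapping a prompt string to a completion string; the function name and prompt wording are illustrative, not any specific library's API.

```python
from typing import Callable

def expand_query(question: str, llm: Callable[[str], str], n: int = 3) -> list[str]:
    """Generate n paraphrases of a question so retrieval is less sensitive
    to the user's exact wording; each variant is retrieved independently."""
    prompt = (
        f"Rewrite the following question in {n} different ways, one per line, "
        f"preserving its meaning:\n{question}"
    )
    # Keep the original question first, then the cleaned-up rewrites.
    rewrites = [
        line.strip("-• ").strip()
        for line in llm(prompt).splitlines()
        if line.strip("-• ").strip()
    ]
    return [question] + rewrites[:n]
```

The results retrieved for each variant can then be merged, for example with the rank-fusion sketch further below.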
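Dynamic routing can be implemented as a small classification step before retrieval. The sketch below assumes the same generic `llm` callable; the route names and descriptions are hypothetical placeholders for whatever stores an application actually has.

```python
from typing import Callable

# Hypothetical data sources an application might route between.
ROUTES = {
    "vectorstore": "unstructured documents such as notes, articles, and FAQs",
    "sql": "structured tabular data such as sales records",
    "graph": "entity-relationship questions over a knowledge graph",
}

def route_query(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model which datastore best fits the question, falling back
    to the vector store when the answer is unrecognized."""
    options = "\n".join(f"- {name}: {desc}" for name, desc in ROUTES.items())
    prompt = (
        "Choose the single best data source for the question below. "
        f"Answer with one word from: {', '.join(ROUTES)}.\n\n"
        f"Sources:\n{options}\n\nQuestion: {question}"
    )
    choice = llm(prompt).strip().lower()
    return choice if choice in ROUTES else "vectorstore"
```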
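For structured data, query construction means asking the model to write the query language itself. The following text-to-SQL sketch assumes a SQLite table whose schema (`sales`) is invented for illustration; a production system would validate the generated SQL (read-only, allow-listed tables) before executing it.

```python
import sqlite3
from typing import Callable

# Invented schema used purely as prompt context for this example.
SCHEMA = "CREATE TABLE sales (region TEXT, month TEXT, revenue REAL);"

def text_to_sql(
    question: str, llm: Callable[[str], str], conn: sqlite3.Connection
) -> list[tuple]:
    """Translate a natural-language question into SQL using the schema as
    context, then execute it against the given connection."""
    prompt = (
        f"Given this schema:\n{SCHEMA}\n"
        f"Write a single SQLite SELECT statement answering: {question}\n"
        "Return only the SQL."
    )
    sql = llm(prompt).strip().strip("`")
    return conn.execute(sql).fetchall()
```

With an in-memory database (`sqlite3.connect(":memory:")`) and the schema applied, the function round-trips a question into result rows; text-to-Cypher against a graph store follows the same prompt-then-execute shape.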
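On the indexing side, the most basic knob is how documents are chunked before embedding. This pure-Python sketch shows fixed-size character chunking with overlap; the default sizes are arbitrary starting points, not recommendations.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a document into fixed-size, overlapping chunks before embedding.
    Overlap keeps content that straddles a boundary visible in both chunks;
    size trades retrieval precision against the context carried per chunk."""
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```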
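Finally, for post-processing, one simple dependency-free re-ranking approach is reciprocal rank fusion: merge the ranked lists returned for each query variant, rewarding documents that appear near the top of several lists. Cross-encoder re-rankers are a heavier alternative; the constant `k = 60` below is a conventional default from the RRF literature, and the function name is illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs (e.g., one list per rewritten
    query) into one ordering. Documents ranked highly in multiple lists rise;
    k dampens the weight of any single top position."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.__getitem__, reverse=True)
```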