Llama-agents is an open-source framework for developing, iterating on, and deploying multi-agent AI systems as production microservices. It uses a distributed, service-oriented architecture in which each agent runs as an independent microservice, coordinated by a customizable LLM-powered control plane. Communication and task distribution flow through standardized API interfaces and message queues, so developers can either define agent interactions explicitly or let an orchestrator decide which agent is relevant for each task.

The framework emphasizes ease of deployment, scalability, and resource management, and ships with observability tools for monitoring system performance. Users can get started by installing llama-agents via pip and setting up a basic multi-agent system; from there, capabilities extend to real-time monitoring and more complex applications such as a Query Rewriting RAG system. As an alpha release, llama-agents invites community feedback to refine its features, offering a toolset for both prototyping and scaling AI systems to production.
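To make the architecture concrete, here is a toy sketch of the pattern described above: independent agent services pull tasks from a shared message queue while an orchestrator routes each task to the relevant agent. This uses only the Python standard library for illustration; the class and method names (`AgentService`, `Orchestrator`, `route`) are invented for this sketch and are not the llama-agents API, where the control plane would typically use an LLM to decide routing.

```python
import queue
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work placed on the message queue."""
    description: str
    agent: str  # which agent service should handle it


class AgentService:
    """Stand-in for an independent agent microservice."""
    def __init__(self, name, handler):
        self.name = name
        self.handler = handler

    def process(self, task):
        return self.handler(task.description)


class Orchestrator:
    """Routes tasks to agents. In llama-agents, an LLM-powered
    control plane can play this role instead of a static lookup."""
    def __init__(self, agents):
        self.agents = {a.name: a for a in agents}

    def route(self, task):
        return self.agents[task.agent].process(task)


# A shared message queue, as in the service-oriented design above.
mq = queue.Queue()
mq.put(Task("what is RAG?", agent="rewriter"))
mq.put(Task("what is RAG?", agent="answerer"))

orchestrator = Orchestrator([
    AgentService("rewriter", lambda d: f"rewritten: {d}"),
    AgentService("answerer", lambda d: f"answered: {d}"),
])

# Drain the queue, dispatching each task to the matching agent.
results = []
while not mq.empty():
    results.append(orchestrator.route(mq.get()))
print(results)
```

In the real framework, each service would run as a separate process behind an HTTP API and the queue would be a networked message broker, but the flow of tasks is the same.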