AI-as-a-service (AIaaS) providers such as OpenAI, AWS, and Google Cloud have made AI models widely accessible, letting a broad range of applications add AI functionality with little effort. These services can still fail, however, and a failure disrupts every application that depends on them. To stay resilient, developers can implement strategies such as circuit breakers, alternative routing, and fallback mechanisms. Gremlin, a tool for testing service reliability, lets users simulate AIaaS failures to observe and strengthen how their systems respond. It offers tests for network outages, latency, and certificate expiration, which help developers find and fix weaknesses before they affect users. Regular testing and monitoring with tools like Gremlin can improve the reliability of AI-powered services and help them keep working during provider outages.
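As a rough illustration of the circuit breaker and fallback strategies mentioned above, here is a minimal Python sketch. It is not taken from Gremlin or from any provider's SDK: the `CircuitBreaker` class, its parameters (`failure_threshold`, `reset_timeout`, `clock`), and the `flaky_provider` and `canned_fallback` functions are all hypothetical names chosen for this example. The sketch assumes a single-threaded caller and treats any exception from the provider call as a failure.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical, not a library API).

    Opens after `failure_threshold` consecutive failures and stays open for
    `reset_timeout` seconds, during which only the fallback is used.
    """

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests can control time
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once the cooldown has elapsed, calls are let through again;
        # a further failure re-opens the circuit immediately.
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, primary: Callable[..., Any], fallback: Callable[..., Any],
             *args: Any, **kwargs: Any) -> Any:
        if self.is_open:
            # Skip the failing provider entirely while the circuit is open.
            return fallback(*args, **kwargs)
        try:
            result = primary(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback(*args, **kwargs)
        # A success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def flaky_provider(prompt: str) -> str:
    # Hypothetical stand-in for an AIaaS client call; always fails here.
    raise ConnectionError("simulated provider outage")


def canned_fallback(prompt: str) -> str:
    # Hypothetical fallback: a static reply instead of a model response.
    return "The AI service is temporarily unavailable."


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
    for _ in range(3):
        print(breaker.call(flaky_provider, canned_fallback, "Summarize this."))
    print("circuit open:", breaker.is_open)
```

In a real application the fallback could instead route the request to a second provider or a smaller local model, which covers the alternative routing case. A failure-injection test such as a simulated network outage is then a way to check that the breaker actually opens and the fallback path is exercised.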