Testing Changi Airport's Chatbot
Blog post from Guardrails AI
Singapore's Changi Airport tested its virtual concierge chatbot, AskMax, as part of the AI Verify Pilot, working with Snowglobe to evaluate the chatbot's performance in realistic simulated scenarios. AskMax is powered by a large language model and is meant to give reliable, context-aware answers on topics such as check-in, transit, retail, and transport across several platforms, including the airport's website and mobile app. Large-scale simulation testing surfaced critical failure modes such as hallucinations and off-topic responses, which allowed a more thorough assessment of the chatbot's capabilities. By generating hundreds of diverse, realistic conversations, Snowglobe exposed previously overlooked issues, which led to adjusted testing priorities aimed at improving the user experience. The exercise underscored the importance of adaptive, data-driven methods for evaluating how AI systems behave in live environments, and the value of synthetic test data and automated judges for scalable evaluation.
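To make the idea of automated judges over simulated conversations concrete, here is a minimal Python sketch. It is not Snowglobe's API or the method used in the pilot: the `Turn` class, the `judge_off_topic` and `evaluate` functions, and the keyword lists are all hypothetical, and the judge is a toy keyword check standing in for whatever judge a real evaluation would use. Only the four topic names come from the post. It covers the off-topic failure mode only; detecting hallucinations would need a reference source to check answers against.

```python
from collections import Counter
from dataclasses import dataclass

# Topic names are taken from the post; the keywords are invented for illustration.
TOPIC_KEYWORDS = {
    "check-in": {"check-in", "baggage", "counter"},
    "transit": {"transit", "transfer", "layover"},
    "retail": {"shop", "duty-free"},
    "transport": {"taxi", "train", "shuttle"},
}


@dataclass
class Turn:
    """One exchange in a simulated conversation (hypothetical structure)."""
    user: str
    bot: str


def judge_off_topic(turn: Turn) -> bool:
    """Toy judge: flag a reply that mentions none of the supported topic keywords."""
    reply = turn.bot.lower()
    return not any(
        keyword in reply
        for keywords in TOPIC_KEYWORDS.values()
        for keyword in keywords
    )


def evaluate(conversations: list[list[Turn]]) -> Counter:
    """Label each conversation 'off_topic' if any turn is flagged, else 'ok'."""
    counts: Counter = Counter()
    for conversation in conversations:
        flagged = any(judge_off_topic(turn) for turn in conversation)
        counts["off_topic" if flagged else "ok"] += 1
    return counts


# Two hand-written conversations standing in for generated synthetic data.
sample = [
    [Turn("When can I drop my bags?",
          "The check-in counter opens three hours before departure.")],
    [Turn("What should I eat tonight?",
          "Here is a recipe for chicken rice.")],
]
report = evaluate(sample)
```

With hundreds of generated conversations in place of `sample`, the same tally would show how often each failure mode appears, which is the kind of signal the post describes using to reprioritize testing.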