How to validate LLM responses continuously in real time

Post Details

Company

Guardrails AI

Date Published

Jan. 17, 2024

Author

Safeer Mohiuddin

Word Count

2,030

Language

English

Hacker News Points

-

Source URL

www.guardrailsai.com/blog/validate-llm-responses-real-time

Summary

Guardrails AI builds on ChatGPT's streaming capability to validate LLM responses in real time, so that output reaches the user quickly without giving up correctness checks. By default, a ChatGPT API call returns nothing until the full response has been generated; enabling streaming returns partial responses as they are produced. This improves the user experience but complicates output validation. Guardrails AI addresses the problem by validating each fragment of the streamed output against a predefined specification, written with Pydantic or RAIL for structured formats such as JSON. Streaming validation supports various output types and multiple on-failure behaviors, and it works with any LLM provider that supports streaming, although it does not yet support reask or async callbacks. A practical example shows Guardrails turning unstructured text into structured data that is easier to store and analyze, with streaming delivering validated data to the user chunk by chunk.
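
The idea of validating a stream fragment by fragment can be illustrated with a minimal, standard-library-only sketch. It does not use the Guardrails API: `fake_llm_stream`, `no_banned_words`, and `validate_stream` are hypothetical names invented for this illustration, the stream is simulated with a fixed list of fragments, and the `on_fail` values only loosely mirror the notion of configurable on-failure behaviors mentioned in the summary.

```python
from typing import Callable, Iterable, Iterator


def fake_llm_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM call; a real provider yields fragments over time."""
    yield from ["The patient ", "reports a DAMN ", "bad headache ", "since Monday."]


def no_banned_words(fragment: str) -> bool:
    """Hypothetical validator: True when the fragment contains no banned word."""
    banned = {"damn"}
    words = (w.strip(".,!?").lower() for w in fragment.split())
    return not any(w in banned for w in words)


def validate_stream(
    chunks: Iterable[str],
    validator: Callable[[str], bool],
    on_fail: str = "filter",
) -> Iterator[str]:
    """Validate each fragment as it arrives and yield only what is allowed through."""
    for chunk in chunks:
        if validator(chunk):
            yield chunk  # valid fragment: pass it on immediately
        elif on_fail == "filter":
            continue  # drop the failing fragment and keep streaming
        elif on_fail == "noop":
            yield chunk  # take no action: pass the fragment through unchanged
        elif on_fail == "exception":
            raise ValueError(f"validation failed for fragment: {chunk!r}")
        else:
            raise ValueError(f"unknown on_fail behavior: {on_fail!r}")


# Usage: the consumer sees validated fragments as soon as each one is checked.
for piece in validate_stream(fake_llm_stream(), no_banned_words):
    print(piece, end="", flush=True)
print()
```

Because `validate_stream` is a generator, each fragment is checked and released before the next one arrives, which is what keeps the response feeling immediate. A check that needs more context than a single fragment (for example, whether accumulated JSON is complete) would have to buffer fragments first, which this sketch does not attempt.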