Company:
Date Published:
Author: Tran Le, Till Pieper, Gillian McGarvey
Word count: 2833
Language: English
Hacker News points: 2

Summary

The article describes a Bits AI feature that uses large language models (LLMs) to help engineers write postmortems after incidents. The goal is to keep engineers in control and preserve the learning value of documenting an incident. The feature combines structured metadata from Datadog’s Incident Management app with unstructured discussions from Slack to generate a draft postmortem, which human authors then refine. The authors highlight challenges such as non-determinism, hallucinations, and the need for nuanced evaluation, and note that the work requires a new skill set combining software engineering, product management, data science, and technical writing. The team compared several models for trade-offs among cost, speed, and quality, and addressed trust and privacy by scrubbing sensitive data and being transparent about how drafts are produced. Experimentation and feedback loops were crucial for refining the LLM outputs, which were generally effective for low- to mid-severity incidents. The authors propose future enhancements, including more customization and additional data integrations for richer incident context, and point to other potential uses for LLM-generated content, such as custom postmortems for clients.
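The flow summarized above (scrub sensitive data, then combine structured incident metadata with chat discussion into a draft request) could be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the `Incident` fields, the `scrub` and `build_prompt` helpers, the placeholder tokens, and the two regex patterns are all hypothetical, and a real scrubber would need to cover many more kinds of sensitive data.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration only; real scrubbing would be broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub(text: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return IPV4_RE.sub("[IP]", text)


@dataclass
class Incident:
    """Assumed shape of the structured incident metadata."""
    title: str
    severity: str
    services: list[str]


def build_prompt(incident: Incident, messages: list[str]) -> str:
    """Combine structured metadata and scrubbed chat messages into one prompt."""
    lines = [
        "Draft a postmortem for a human author to review and edit.",
        f"Title: {scrub(incident.title)}",
        f"Severity: {incident.severity}",
        f"Services: {', '.join(incident.services)}",
        "Discussion:",
    ]
    lines += [f"- {scrub(m)}" for m in messages]
    return "\n".join(lines)


if __name__ == "__main__":
    incident = Incident("Checkout errors", "SEV-3", ["checkout", "payments"])
    chat = ["alice@example.com saw 500s from 10.0.0.12", "rolled back deploy"]
    print(build_prompt(incident, chat))
```

The sketch scrubs before the prompt is assembled, so nothing sensitive reaches the model call, and it leaves the model call itself out because the summary does not say which model or API was used.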