Workflow-Level Resilience in Orkes Conductor: Timeouts and Failure Workflows
Blog post from Orkes
Orkes Conductor offers two workflow-level features for building resilient, production-grade workflows: workflow timeouts and failure workflows. A workflow timeout caps how long an entire workflow execution may run, which keeps stalled executions from lingering and helps meet service-level agreements (SLAs). In an e-commerce checkout scenario, for example, a 30-minute timeout keeps shopping carts from stalling indefinitely. A failure workflow is a contingency plan: a separate workflow that is triggered when the primary workflow fails, whether because of a timeout or an unexpected error. In a hotel booking case, for example, a failure workflow handles refunds and customer notifications if a booking cannot be completed. Together, these features let a workflow bound its own run time and recover from disruptions while keeping the user experience intact.
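To make the two settings concrete, here is a minimal sketch of how a checkout workflow definition might declare a 30-minute timeout and a failure workflow. The field names `timeoutSeconds`, `timeoutPolicy`, and `failureWorkflow` follow the Conductor workflow definition schema as we understand it. The workflow names, task names, and the `build_workflow_def` helper are hypothetical and exist only for illustration. The sketch builds the definition as plain data and does not register it with a Conductor server.

```python
import json


def build_workflow_def(name, tasks, timeout_minutes, failure_workflow):
    """Return a Conductor-style workflow definition as a plain dict.

    Hypothetical helper for illustration; it only assembles data and
    makes no calls to a Conductor server.
    """
    return {
        "name": name,
        "version": 1,
        "schemaVersion": 2,
        "tasks": tasks,
        # Upper bound on the whole workflow execution, in seconds.
        "timeoutSeconds": timeout_minutes * 60,
        # TIME_OUT_WF marks the workflow as timed out when the limit is hit;
        # ALERT_ONLY would only raise an alert and let it keep running.
        "timeoutPolicy": "TIME_OUT_WF",
        # Name of the workflow to start if this one fails or times out.
        "failureWorkflow": failure_workflow,
    }


# Hypothetical task list for an e-commerce checkout.
checkout_tasks = [
    {"name": "reserve_inventory", "taskReferenceName": "reserve_inventory_ref", "type": "SIMPLE"},
    {"name": "charge_payment", "taskReferenceName": "charge_payment_ref", "type": "SIMPLE"},
]

checkout_def = build_workflow_def(
    name="checkout_workflow",                  # hypothetical workflow name
    tasks=checkout_tasks,
    timeout_minutes=30,                        # the 30-minute limit from the example
    failure_workflow="checkout_compensation",  # hypothetical failure workflow name
)

print(json.dumps(checkout_def, indent=2))
```

The failure workflow is an ordinary workflow defined separately under the referenced name. In the hotel booking case, that is where the refund and customer notification tasks would live.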