Measuring reliability risk is crucial because many organizations have little insight into how their services will react to failures and rely solely on QA tests and engineers' expertise. Reliability Scores address this gap with a metric derived from the results of regularly run resilience tests; those results highlight reliability risks and yield actionable insights without creating busywork. A valid reliability metric should be actionable, accountable without assigning blame, and accurate without noise, so that teams can trust the data and use it effectively. By running standardized test suites and focusing on addressing risks rather than assigning blame, teams can systematically improve reliability and prevent customer-impacting outages. Gremlin's automated reliability platform exemplifies this approach, offering tools to identify and mitigate availability risks proactively.
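To make the idea of a score built from test results concrete, here is a minimal Python sketch. It is not Gremlin's scoring formula, which the text above does not specify. It assumes each service runs the same standardized suite, that every test counts equally, and that the score is simply the percentage of tests passed. The names `ResilienceResult`, `reliability_score`, and `open_risks`, as well as the service and test names in the usage note, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResilienceResult:
    """One outcome from a standardized resilience test (hypothetical shape)."""
    service: str
    test_name: str
    passed: bool


def reliability_score(results):
    """Return a 0-100 score per service: the percentage of tests passed.

    Assumption: all tests are weighted equally. A real platform may
    weight tests differently or factor in how recent each run is.
    """
    totals = {}
    passes = {}
    for r in results:
        totals[r.service] = totals.get(r.service, 0) + 1
        passes[r.service] = passes.get(r.service, 0) + (1 if r.passed else 0)
    return {s: round(100 * passes[s] / totals[s], 1) for s in totals}


def open_risks(results):
    """List failed tests as (service, test_name) pairs.

    The output names the risk to address, not a person, which keeps
    the metric actionable without assigning blame.
    """
    return [(r.service, r.test_name) for r in results if not r.passed]
```

A possible usage, with made-up data: build a list such as `[ResilienceResult("checkout", "zone_failover", False), ...]`, call `reliability_score(results)` to get one number per service, and call `open_risks(results)` to see which failed tests to work on next. Keeping the score and the risk list separate means the number can be tracked over time while the list drives the actual remediation work.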