GitHub's webhooks system is the likely cause of recent downtime incidents, with widespread webhook usage generating high load. The service gives no visibility into which webhooks were sent or which failed, and it does not retry failed deliveries, so a webhook that fails once is lost permanently. This highlights the importance of making webhooks more resilient through features such as retries and observability. Some experts suggest running webhook delivery as a system separate from the main application, so that it can scale independently and a surge in webhook traffic has minimal impact on the rest of the system.
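
To make the two resilience features named above concrete, here is a minimal sketch of a sender that retries failed deliveries with exponential backoff and keeps a per-webhook log of every attempt. It is an illustration under stated assumptions, not a description of how any real provider works: `deliver_with_retries`, `DeliveryRecord`, `Attempt`, and the parameters `max_attempts` and `base_delay` are hypothetical names, and the actual HTTP call is passed in as a `send` callable so the sketch stays independent of any particular client library.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Attempt:
    number: int   # 1-based attempt counter
    ok: bool      # True if the receiver answered with a 2xx status
    detail: str   # status code or error text, kept for later inspection


@dataclass
class DeliveryRecord:
    """Observability: the full attempt history for one webhook."""
    webhook_id: str
    attempts: List[Attempt] = field(default_factory=list)

    @property
    def delivered(self) -> bool:
        return bool(self.attempts) and self.attempts[-1].ok


def deliver_with_retries(
    webhook_id: str,
    send: Callable[[], int],           # hypothetical: performs the POST, returns the HTTP status
    max_attempts: int = 4,             # assumed default, tune per system
    base_delay: float = 0.5,           # seconds before the first retry (assumed)
    sleep: Callable[[float], None] = time.sleep,
) -> DeliveryRecord:
    record = DeliveryRecord(webhook_id)
    for n in range(1, max_attempts + 1):
        try:
            status = send()
            ok = 200 <= status < 300
            detail = f"HTTP {status}"
        except Exception as exc:       # network errors count as failed attempts
            ok = False
            detail = f"error: {exc}"
        record.attempts.append(Attempt(n, ok, detail))
        if ok:
            break
        if n < max_attempts:
            # Exponential backoff: base_delay, 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** (n - 1))
    return record
```

For example, if `send` returns 503 twice and then 200, the returned record has `delivered == True` and three logged attempts, so the two failures remain visible instead of disappearing silently. In a design where webhooks run as a separate system, a function like this would typically be called by a worker that pulls events from a queue rather than from the main application's request path.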