Replication failures
## Problem Description, Impact, and Resolution At 19:50 UTC on June 10, 2026, we observed a small fraction of publishes originating from US-EAST-1 failing to replicate to subscribers globally. We removed the degraded publisher pod from service and the issue was resolved at 21:21 UTC on June 10, 2026. The root cause of the incident was triggered by a single process that fell into a degraded state where it continued receiving inbound traffic and passing health checks, but traffic sent outbound from the process was failing at an abnormally high rate. Our automated health check/recovery system did not auto-detect and replace the degraded process because its health check API reported itself as healthy. ## Mitigation Steps and Recommended Future Preventative Measures To prevent a similar issue from occurring in the future, we are improving the data within our health check APIs to return more complete performance metrics over a rolling time window. We are enhancing the issue detection logic to detect more patterns that infer process failure, even if the process itself is reporting as healthy.