Slough - spurious 503 errors

Incident Report for Simwood

Resolved

As discussed in the community Slack channel, we had credible reports of unexpected 503 errors on some calls routed to our Slough Availability Zone. Our own telemetry also showed call volumes lower than normal.

On investigation, we found one of the edge routing containers in an unusual state in which it was unable to route calls onwards internally, although it was up and otherwise responding normally. Restarting the container rectified this.

We did not force a DNS change on this occasion as the issue was resolved quickly once credible reports were received, but appropriately configured customer equipment should have respected published SRV records and retried via another site.
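For illustration, the retry behaviour expected of customer equipment can be sketched as follows. This is a minimal sketch, not a description of any particular vendor's implementation: the hostnames, the `SrvRecord` type, and the `order_srv`, `place_call` and `send` names are all hypothetical, and the ordering follows the priority and weight selection described in RFC 2782, with a 503 treated as a cue to try the next target.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SrvRecord:
    """One SRV answer (hypothetical container type for this sketch)."""
    priority: int
    weight: int
    port: int
    target: str


def order_srv(records: Sequence[SrvRecord],
              rng: Optional[random.Random] = None) -> List[SrvRecord]:
    """Lowest priority first; weighted-random order within each priority."""
    rng = rng or random.Random()
    ordered: List[SrvRecord] = []
    for priority in sorted({r.priority for r in records}):
        # Zero-weight records are placed first, as RFC 2782 suggests.
        group = sorted((r for r in records if r.priority == priority),
                       key=lambda r: r.weight != 0)
        while group:
            pick = rng.randint(0, sum(r.weight for r in group))
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    ordered.append(group.pop(i))
                    break
    return ordered


def place_call(records: Sequence[SrvRecord],
               send: Callable[[str, int], int],
               rng: Optional[random.Random] = None) -> Tuple[Optional[str], int]:
    """Try each target in SRV order, moving on whenever one answers 503."""
    status = 503  # reported if there are no records or every target fails
    for rec in order_srv(records, rng):
        status = send(rec.target, rec.port)  # assumed to return a SIP status code
        if status != 503:
            return rec.target, status
    return None, status


# Hypothetical records: a preferred site and a fallback site.
records = [
    SrvRecord(priority=10, weight=100, port=5060, target="sip-a.example.com"),
    SrvRecord(priority=20, weight=100, port=5060, target="sip-b.example.com"),
]


def fake_send(target: str, port: int) -> int:
    """Stand-in for a real SIP request: the preferred site returns 503."""
    return 503 if target == "sip-a.example.com" else 200


print(place_call(records, fake_send))  # ('sip-b.example.com', 200)
```

In this sketch the call succeeds via the second site without any DNS change, which is the behaviour the published SRV records are intended to allow.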

We will log a non-conformance report internally, as our own monitoring should have detected this situation and will need further work before it can do so.
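As a rough sketch of the kind of check that would catch this state, a monitor can separate "is the container responding?" from "can it actually route a call onwards?". Everything here is an assumption for illustration: the `Health` states, the `classify` function and the two probe callables are hypothetical and do not describe our actual monitoring.

```python
from enum import Enum
from typing import Callable


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"  # up and responding, but not routing onwards
    DOWN = "down"


def classify(liveness_probe: Callable[[], bool],
             routing_probe: Callable[[], bool]) -> Health:
    """Combine a liveness probe with an end-to-end routing probe."""
    if not liveness_probe():
        return Health.DOWN
    # A liveness-only check would stop here and report the container as fine.
    if not routing_probe():
        return Health.DEGRADED
    return Health.HEALTHY


# The state seen in this incident: responding normally, but unable to route.
print(classify(lambda: True, lambda: False))  # Health.DEGRADED
```

The routing probe would need to exercise the internal onward path, for example with a synthetic test call, rather than only checking that the container answers.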

Posted Aug 05, 2019 - 18:02 UTC

This incident affected: Voice.