Metrics Ingress Serving 500

Incident Report for Apollo Graph, Inc.

Resolved

Everything has returned to normal. The downstream lag caused by the surge of ingested metrics reports also appears to have cleared.

Customer impact: POST requests for your metrics reports would have received a 500 response. In Apollo Server, reporting runs in the background and should be retried, so there should be no impact on traffic and no data loss.

As always, we appreciate your patience.
Posted Dec 05, 2022 - 17:20 UTC

Monitoring

A fix has been rolled out to resolve this issue. We are seeing ingestion return to normal levels. We will leave this incident open for 15 minutes to ensure it is actually resolved for everyone.
Posted Dec 05, 2022 - 17:02 UTC

Update

We are continuing to investigate this issue.
Posted Dec 05, 2022 - 17:01 UTC

Investigating

We are currently investigating an issue with our Kubernetes routing in which our Metrics Ingress has no healthy pods available to serve traffic.
Posted Dec 05, 2022 - 16:45 UTC
This incident affected: Metrics Ingestion.