DataDog metrics not populating
Incident Report for Apollo Graph, Inc.
Lag is happy, and we are all caught up.

Thanks as always for your patience!
Posted Aug 31, 2022 - 03:54 PDT
We've identified that the issues do not stem solely from DataDog revoking keys. Our logs were cluttered with keys that had, in fact, been revoked, and this masked the underlying lag issues on our services. At this time we are seeing the lag dropping across our infrastructure, and we will update and close this incident in 20 minutes if the trend continues.

We apologize for the inconvenience, and for the red herring of a fix presented earlier.
Posted Aug 31, 2022 - 03:23 PDT
We've confirmed that, while we are seeing elevated error rates talking with DataDog for many graphs, metrics are still being sent successfully for most graphs. At this time we are investigating whether our own internal monitoring is simply finicky, or whether we are seeing evidence of a number of client API tokens expiring or being revoked by DataDog.

If you are experiencing a delay in your DataDog metrics from Studio, please re-enable your API keys as a mitigation strategy. Docs for how to do this can be found here:
Posted Aug 31, 2022 - 03:06 PDT
We are continuing to investigate this issue at this time.
Posted Aug 31, 2022 - 03:01 PDT
We are currently seeing an elevated level of errors when forwarding Apollo Studio insights metrics to DataDog. At this time we're still investigating why this is occurring and what steps can be taken on our end to mitigate it.

We will update again in 20 minutes.
Posted Aug 31, 2022 - 02:39 PDT
This incident affected: Notifications.