Metrics Ingestion: High error rates & processing lag
Incident Report for Apollo
Resolved
All systems appear to be operational.

We are continuing to monitor for performance regressions.

We anticipate a large load event around 8pm PT, but are not expecting any degraded performance as a result.

Any updates will be posted here on the status page.

As always, thank you for your patience while we work to resolve issues.
Posted Sep 20, 2021 - 17:50 PDT
Update
Metrics ingestion and processing are currently available.

We are still monitoring the situation as we make changes to our infrastructure to accommodate additional scale.

The Launches UI will remain disabled until further notice as we improve the stability of the infrastructure.
Posted Sep 20, 2021 - 16:29 PDT
Update
We have disabled the new Launches UI as part of the effort to keep the rest of the system stable.

Until further notice, the Launches page will not be updated to reflect new launches, but new launches will continue to happen.
Posted Sep 20, 2021 - 15:54 PDT
Update
We are continuing to see some intermittent errors with metrics ingestion.

We are continuing to scale our infrastructure. We are not seeing permanent data loss, but we are seeing processing lag.

We will continue to keep this page updated with any developments.
Posted Sep 20, 2021 - 15:23 PDT
Monitoring
Metrics ingestion is generally available again.

We've scaled up our replicas and are continuing to monitor the situation.
Posted Sep 20, 2021 - 15:06 PDT
Investigating
Metrics ingestion has degraded performance, and we are seeing some lag in processing ingested metrics and traces.

We're currently investigating the issue.
Posted Sep 20, 2021 - 14:47 PDT
This incident affected: Metrics Ingestion.