Resolved
We are confident that the issues with degraded performance have been resolved and we have not seen any further latency spikes since 1 PM Pacific time.
Monitoring
We have implemented improvements to our infrastructure that seem to have mitigated issues with high latency. We are continuing to monitor for any further disruptions.
Investigating
Our GraphQL API is currently experiencing higher than expected latency for a portion of requests. This may cause certain CLI operations (such as checks and publishes) to experience client-side timeouts while making GraphQL requests to our backend.
We are in the midst of investigating the cause. In the interim, there are two potential workarounds:
Increase the client timeout of your CLI. The Apollo Rover client timeout is 30 seconds by default; we recommend increasing it to 150 seconds. We have observed that this configuration change allows enough time for the overwhelming majority of requests to complete. The steps for doing so are described in our docs at https://www.apollographql.com/docs/rover/configuring/#increasing-request-timeouts (this requires Rover version 0.3.0 or later).
If you cannot use Rover 0.3.0 or later, or cannot alter your client timeout, we recommend you retry the CLI command. For checks, you should be able to retry immediately. For publishes, we recommend you wait 4 minutes before retrying.
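For teams running these commands from CI, the two workarounds can be combined in a small wrapper. The following is a minimal sketch, not an official Apollo tool: the helper name `run_with_retry`, the limit of three attempts, and the graph ref and schema path in the usage comment are all illustrative assumptions. It retries on any non-zero exit code, so it does not distinguish a timeout from a genuine check failure. Please confirm the exact timeout option against the docs linked above before relying on it.

```python
import subprocess
import time

# Wait times taken from the guidance above: checks can be retried
# immediately, publishes after 4 minutes (240 seconds).
RETRY_WAIT_SECONDS = {"check": 0, "publish": 240}


def run_with_retry(command, operation, max_attempts=3,
                   run=subprocess.run, sleep=time.sleep):
    """Run a CLI command, retrying whenever it exits with a non-zero code.

    `max_attempts` is an illustrative choice, not an official recommendation.
    `run` and `sleep` are injectable so the retry logic can be exercised
    without invoking the real CLI or actually waiting.
    Returns the number of the attempt that succeeded.
    """
    wait = RETRY_WAIT_SECONDS[operation]
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            sleep(wait)
    raise RuntimeError(f"{operation} failed after {max_attempts} attempts")


# Example usage (hypothetical graph ref and schema path):
# run_with_retry(
#     ["rover", "graph", "check", "my-graph@current",
#      "--schema", "schema.graphql", "--client-timeout", "150"],
#     "check",
# )
```

If your Rover version does not support a longer client timeout, drop the timeout option from the command and rely on the retry alone.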
We apologize for the inconvenience, and we thank you for your patience during this incident.