Error Diagnostics in GraphOS
Understand where and when errors are happening in your graph
Errors can occur in different parts of your graph. They might originate in a subgraph or Connector, or during communication between your router and upstream services. Errors may also arise from security features such as safelisting, demand control, operation limits, and authentication. Pinpointing the location and type of errors is essential for effective diagnosis and resolution.
GraphOS Studio includes tools to help you identify and classify errors across your graph. You can view error insights from the Errors tab of a variant's Insights page.
Common use cases
GraphOS Studio's error insights help you detect and analyze issues in your graph. Use them to:
Spot trends in error behavior: Viewing error metrics over time can help reveal downtime (spikes in errors) or other unexpected health problems in your graph.
Diagnose service-specific issues: By grouping your graph's metrics by underlying service, you can identify unhealthy subgraphs and Connectors. Grouping by error code reveals what kind of errors are happening. Reviewing representative traces for the error can expose execution path issues.
Enterprise organizations can navigate from error metrics to representative traces to debug individual requests with errors. Traces help you identify specific situations where errors occur, surfacing details such as specific error messages and execution pathways leading to the error. Learn more.
Diagnose client-specific issues: By filtering your graph's metrics by client (and by client version with Enterprise), you can identify when a specific client contributes to or is responsible for a high failure rate for an operation. This information can help you isolate the underlying cause and push an update for the affected client.
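The client-specific use case comes down to comparing failure rates (errors divided by requests) across clients. The following is a minimal sketch of that comparison, assuming you have per-client counts available in your own tooling. The `ClientOperationStats` shape, the `findFailingClients` helper, and the 5% threshold are hypothetical and are not part of any GraphOS API.

```typescript
// Hypothetical per-client counts for a single operation (not a GraphOS API).
interface ClientOperationStats {
  clientName: string;
  requestCount: number;
  errorCount: number;
}

// Returns clients whose failure rate (errors / requests) is at or above
// the threshold, sorted from the highest rate to the lowest.
function findFailingClients(
  stats: ClientOperationStats[],
  threshold: number
): { clientName: string; failureRate: number }[] {
  return stats
    .filter((s) => s.requestCount > 0) // skip clients with no traffic
    .map((s) => ({
      clientName: s.clientName,
      failureRate: s.errorCount / s.requestCount,
    }))
    .filter((r) => r.failureRate >= threshold)
    .sort((a, b) => b.failureRate - a.failureRate);
}

// Illustrative data only.
const failing = findFailingClients(
  [
    { clientName: "web", requestCount: 1000, errorCount: 10 },
    { clientName: "ios", requestCount: 400, errorCount: 100 },
    { clientName: "android", requestCount: 0, errorCount: 0 },
  ],
  0.05
);
console.log(failing); // [ { clientName: 'ios', failureRate: 0.25 } ]
```

In this example only the `ios` client crosses the threshold, which is the kind of signal that would point you toward a client-side fix rather than a subgraph problem.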
Error categorization
The error insights histogram visualizes the number of errors over time. You can group and filter error metrics by the dimensions available to GraphOS.
Depending on your graph and how you've configured it to report metrics, these dimensions are:
Path: The schema location where a specific error occurred
Service: The underlying subgraph or Connector the error originated from
(Error) Code: The specific error type
See router error reference for error codes emitted by the router
Subgraphs may convey their own codes
Note: If the code dimension's only value is unknown, either the code isn't available to your graph or extended error reporting hasn't been enabled.
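To make the grouping behavior concrete, here is a minimal sketch of counting errors per dimension value. It is an illustration under stated assumptions, not how Studio implements grouping: the `ErrorRecord` shape, the `countByDimension` helper, and the sample path, service, and code values are all hypothetical. Records with no value for the chosen dimension fall under `unknown`, mirroring the note above.

```typescript
// Hypothetical flat record for one reported error. Field names mirror the
// path, service, and code dimensions; this is not a GraphOS API.
interface ErrorRecord {
  path: string;
  service?: string;
  code?: string;
}

type Dimension = "path" | "service" | "code";

// Counts errors per value of the chosen dimension. Records that lack a
// value for that dimension are counted under "unknown".
function countByDimension(
  records: ErrorRecord[],
  dimension: Dimension
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of records) {
    const key = record[dimension] ?? "unknown";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Illustrative data only.
const records: ErrorRecord[] = [
  { path: "user.orders", service: "orders", code: "SUBREQUEST_HTTP_ERROR" },
  { path: "user.orders", service: "orders" },
  { path: "user.profile", service: "accounts", code: "UNAUTHENTICATED" },
];

console.log(countByDimension(records, "service")); // { orders: 2, accounts: 1 }
console.log(countByDimension(records, "code"));
// { SUBREQUEST_HTTP_ERROR: 1, unknown: 1, UNAUTHENTICATED: 1 }
```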
Metric dimensions availability
The dimensions available for grouping and filtering error metrics depend on your graph type:
Graph Type | Available Dimensions |
---|---|
Monographs | Only the path dimension is available for grouping and filtering. |
Supergraphs (using GraphOS Router or @apollo/gateway ) | Both path and service dimensions are available. |
Supergraphs (using GraphOS Router v2.1.2 or later) | Path and service dimensions are available. Enable extended error reporting to also access the code dimension. |
Setup
To take advantage of GraphOS error insights, ensure your router or server is properly set up to report error data.
From the Apollo Router
Once your cloud or self-hosted router is configured to send operation metrics to GraphOS, error metrics are automatically reported.
Extended error reporting (since 2.1.2)
You can enable richer error reporting via the preview_extended_error_metrics and redaction_policy router configurations:
telemetry:
  apollo:
    errors:
      preview_extended_error_metrics: enabled # (default: disabled)
      subgraph:
        all:
          # By default, subgraphs should report errors to GraphOS
          send: true # (default: true)
          redaction_policy: extended # (default: strict)
        subgraphs:
          account: # Override the default behavior for the "account" subgraph
            send: false # Optional - excludes reporting of errors for this subgraph
When enabled, the router sends metrics to Studio with additional attributes, including the service and code found in GraphQL error extensions (errors[].extensions.service and errors[].extensions.code, respectively).
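As an illustration of where these attributes live, the sketch below reads `extensions.service` and `extensions.code` from entries in a GraphQL response's `errors` array. It assumes the standard GraphQL error shape (`message`, optional `path`, optional `extensions`). The `extractDimensions` helper, the sample error messages, and the choice to fall back to `"unknown"` for missing values are assumptions of this example, not router behavior.

```typescript
// Minimal shape of one entry in a GraphQL response's `errors` array.
interface GraphQLErrorLike {
  message: string;
  path?: (string | number)[];
  extensions?: { [key: string]: unknown };
}

// Hypothetical helper: pulls the path, service, and code out of one error.
// Missing or non-string values fall back to "unknown" (an assumption here).
function extractDimensions(error: GraphQLErrorLike): {
  path: string;
  service: string;
  code: string;
} {
  const service = error.extensions?.["service"];
  const code = error.extensions?.["code"];
  return {
    path: (error.path ?? []).join("."),
    service: typeof service === "string" ? service : "unknown",
    code: typeof code === "string" ? code : "unknown",
  };
}

// Illustrative errors only; the messages and values are made up.
const sampleErrors: GraphQLErrorLike[] = [
  {
    message: "Request to the accounts service failed",
    path: ["me", "orders", 0],
    extensions: { service: "accounts", code: "SUBREQUEST_HTTP_ERROR" },
  },
  { message: "Something went wrong" },
];

console.log(sampleErrors.map((e) => extractDimensions(e)));
// [
//   { path: 'me.orders.0', service: 'accounts', code: 'SUBREQUEST_HTTP_ERROR' },
//   { path: '', service: 'unknown', code: 'unknown' }
// ]
```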
For additional configuration options, including extended error reporting details, see the router guide for errors.
From Apollo Server
Apollo Server automatically reports error metrics as long as you meet these prerequisites:
You must first configure your server to send operation metrics to GraphOS.
To report errors:
Your GraphQL server must run Apollo Server v3.6 or later.
If you have a federated graph, your gateway must run Apollo Server v3.6 or later, but there are no requirements for your subgraphs.
To report trace-level error details:
Your GraphQL server can run any recent version of Apollo Server 2.x or 3.x.
If you have a federated graph, your subgraphs must support federated tracing. See the federated tracing item in this table to check for compatible libraries.