Upgrading from Versions 1.x
Upgrade from version 1.x to 2.x of GraphOS Router
GraphOS Router v2.x includes various breaking changes when upgrading from v1.x, including removing deprecated features and renaming public interfaces to be more future-proof.
This upgrade guide describes the steps to upgrade your GraphOS Router deployment from version 1.x to 2.x. It describes breaking changes and how to resolve them. It also recommends new features to use.
Upgrade strategy
Before making any changes, auto-upgrade your configuration. This will remove options that already have no effect in v1.x, and make the rest of the upgrade easier.
Check the changes that will be applied using:
router config upgrade --diff router.yaml
Then apply the changes using:
router config upgrade router.yaml > router.next.yaml
mv router.next.yaml router.yaml
Resource utilization changes
The 2.x release includes significant architectural improvements to enable support for backpressure. The router now starts rejecting requests when it is busy, instead of queueing them in memory. This can change resource utilization, including increased CPU usage, because the router can handle more requests.
During upgrade, carefully monitor logs and resource consumption to ensure that your router has upgraded successfully and has enough resources to perform as expected.
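To make the difference between queueing and rejecting concrete, here is a minimal sketch using a bounded standard-library channel as a stand-in for a busy service. It is not the router's implementation; the function name `admit` and the capacity and request counts are hypothetical and exist only for illustration.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Offers `incoming` requests to a bounded queue of size `capacity`
/// that nobody is draining, and returns (accepted, rejected).
/// An unbounded queue would instead accept everything and grow in memory.
pub fn admit(capacity: usize, incoming: usize) -> (usize, usize) {
    // Keep the receiver alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    let mut rejected = 0;
    for id in 0..incoming {
        match tx.try_send(id) {
            Ok(()) => accepted += 1,
            // The queue is full: shed the request instead of buffering it.
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    (accepted, rejected)
}
```

With a capacity of 2 and 5 incoming requests, 2 are accepted and 3 are rejected immediately, which is the kind of behavior to watch for in client-facing error rates during the upgrade.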
Removals and deprecations
The following headings describe features that have been removed or deprecated in router v2.x. Alternatives to the removed or deprecated features are described, if available.
Removed metrics
Multiple metrics have been removed in router v2.x as part of evolving towards OpenTelemetry metrics and conventions. Each of the removed metrics listed below has a replacement metric or a method for deriving its value:
- Removed apollo_router_http_request_retry_total. This is replaced by the http.client.request.duration metric's http.request.resend_count attribute. Set default_requirement_level to recommended to make the router emit this attribute.
- Removed apollo_router_timeout. This metric conflated timed-out requests from clients to the router, and requests from the router to subgraphs. Timed-out requests have HTTP status code 504. Use the http.response.status_code attribute on the http.server.request.duration metric to identify timed-out router requests, and the same attribute on the http.client.request.duration metric to identify timed-out subgraph requests.
- Removed apollo_router_http_requests_total. This is replaced by the http.server.request.duration metric for requests from clients to the router and http.client.request.duration for requests from the router to subgraphs.
- Removed apollo_router_http_request_duration_seconds. This is replaced by the http.server.request.duration metric for requests from clients to the router and http.client.request.duration for requests from the router to subgraphs.
- Removed apollo_router_session_count_total. This does not have an equivalent in 2.0.0, though one may be introduced in a point release.
- Removed apollo_router_session_count_active. This is replaced by http.server.active_requests.
- Removed apollo_require_authentication_failure_count. Use the http.server.request.duration metric's http.response.status_code attribute. Requests with authentication failures have HTTP status code 401.
- Removed apollo_authentication_failure_count. Use the apollo.router.operations.authentication.jwt metric's authentication.jwt.failed attribute.
- Removed apollo_authentication_success_count. Use the apollo.router.operations.authentication.jwt metric instead. If the authentication.jwt.failed attribute is absent or false, the authentication succeeded.
- Removed apollo_router_deduplicated_subscriptions_total. Use the apollo.router.operations.subscriptions metric's subscriptions.deduplicated attribute.
- Removed apollo_router_cache_miss_count. Cache miss count can be derived from apollo.router.cache.miss.time.
- Removed apollo_router_cache_hit_count. Cache hit count can be derived from apollo.router.cache.hit.time.
Removed processing time metrics
Calculating the overhead of injecting the router into your service stack when making multiple downstream calls is a complex task. Because the router cannot calculate this overhead reliably, the metrics apollo_router_span and apollo_router_processing_time have been removed.
Upgrade step: test your workloads with the router and validate that its latency meets your requirements.
Removed custom instrumentation selectors
The subgraph_response_body selector is removed in favor of subgraph_response_data and subgraph_response_errors.
Upgrade step: replace subgraph_response_body with subgraph_response_data and subgraph_response_errors. For example:
telemetry:
  instrumentation:
    instruments:
      subgraph:
        http.client.request.duration:
          attributes:
            http.response.status_code:
              subgraph_response_status: code
            my_data_value:
              # Previously:
              # subgraph_response_body: .data.test
              subgraph_response_data: $.test # The data object is the root object of this selector
            my_error_code:
              # Previously:
              # subgraph_response_body: .errors[*].extensions.extra_code
              subgraph_response_errors: $[*].extensions.extra_code # The errors object is the root object of this selector
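The rewrite in the example above is mechanical, so it can be scripted if you have many selectors. The following is a minimal sketch, assuming every old path starts with .data or .errors as in the example; the names `UpgradedSelector`, `strip_root`, and `upgrade_body_selector` are hypothetical helpers, not part of the router.

```rust
/// Which v2.x selector a rewritten path belongs under.
#[derive(Debug, PartialEq)]
pub enum UpgradedSelector {
    /// Use the path with `subgraph_response_data`.
    Data(String),
    /// Use the path with `subgraph_response_errors`.
    Errors(String),
}

/// Strips `prefix` from `path` only when it ends at a path boundary,
/// so `.data` matches `.data.test` but not `.database`.
fn strip_root<'a>(path: &'a str, prefix: &str) -> Option<&'a str> {
    let rest = path.strip_prefix(prefix)?;
    if rest.is_empty() || rest.starts_with('.') || rest.starts_with('[') {
        Some(rest)
    } else {
        None
    }
}

/// Converts an old `subgraph_response_body` path into the new selector
/// and a path rooted at `$`. Returns `None` for paths this sketch
/// does not recognize, which then need a manual rewrite.
pub fn upgrade_body_selector(path: &str) -> Option<UpgradedSelector> {
    if let Some(rest) = strip_root(path, ".data") {
        Some(UpgradedSelector::Data(format!("${}", rest)))
    } else if let Some(rest) = strip_root(path, ".errors") {
        Some(UpgradedSelector::Errors(format!("${}", rest)))
    } else {
        None
    }
}
```

Paths that return `None` are deliberately left alone rather than guessed at, since a wrong selector would silently produce an empty attribute.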
Scaffold no longer supported for Rust plugin code generation
Support for the cargo-scaffold command to generate boilerplate source code for a Rust plugin has been removed in router v2.x.
Upgrade step: Source code generated using Scaffold will continue to compile, so existing Rust plugins will be unaffected by this change.
Removed configurable poll interval for Apollo Uplink
The configurable poll interval of Apollo Uplink has been removed in router v2.x.
Upgrade step: remove uses of both the --apollo-uplink-poll-interval command-line argument and the APOLLO_UPLINK_POLL_INTERVAL environment variable.
Removed hot reloading of supergraph URLs
Hot reloading is no longer supported for supergraph URLs configured via either the --supergraph-urls command-line argument or the APOLLO_ROUTER_SUPERGRAPH_URLS environment variable. In router v1.x, if hot reloading was enabled, the router would repeatedly fetch the URLs on the interval specified by --apollo-uplink-poll-interval. This poll interval has been removed in v2.x.
Upgrade step: if you want to hot reload from a remote URL, try running a script that downloads the supergraph URL at a periodic interval, then point the router to the downloaded file on the filesystem.
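One way to structure such a script is sketched below, using only the Rust standard library. The download itself is left to a caller-supplied closure (`fetch` is hypothetical, as are the function names), and the write goes through a temporary file followed by a rename so that a reader is unlikely to see a half-written schema. This is an assumption-laden sketch, not a tool shipped with the router.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `contents` to a temporary sibling file, then renames it over
/// `target`. On most platforms a rename within one directory replaces
/// the file in a single step.
pub fn replace_file(target: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp = target.with_extension("tmp");
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, target)
}

/// Runs one refresh cycle. `fetch` stands in for whatever downloads the
/// supergraph schema. Returns `true` when the file on disk was updated.
pub fn refresh_once<F>(target: &Path, fetch: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    let new = fetch()?;
    // Skip the write when nothing changed, to avoid needless reloads.
    if fs::read(target).map(|old| old == new).unwrap_or(false) {
        return Ok(false);
    }
    replace_file(target, &new)?;
    Ok(true)
}
```

To poll periodically, call `refresh_once` in a loop with `std::thread::sleep` between iterations, and point the router at the target file.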
Removed busy timer for request processing duration
In context::Context, which is typically used for router customizations, methods and structs related to request processing duration have been removed, because request processing duration is already included in the spans sent by the router. Users customizing the router with Rhai scripts, Rust plugins, or coprocessors don't need to track this information manually.
Upgrade step: remove calls and uses of the following methods and structs from context::Context:
- context::Context::busy_time()
- context::Context::enter_active_request()
- context::BusyTimer struct
- context::BusyTimerGuard struct
Removed OneShotAsyncCheckpointLayer and .oneshot_checkpoint_async()
Both OneShotAsyncCheckpointLayer and .oneshot_checkpoint_async() are removed as part of architectural optimizations in router v2.x.
Upgrade step:
- Replace uses of apollo_router::layers::ServiceBuilderExt::oneshot_checkpoint_async with the checkpoint_async method.
- Replace uses of OneShotAsyncCheckpointLayer with AsyncCheckpointLayer. For example:
Previous plugin code using OneShotAsyncCheckpointLayer:
OneShotAsyncCheckpointLayer::new(move |request: execution::Request| {
    let request_config = request_config.clone();
    // ...
})
.service(service)
.boxed()
New plugin code using AsyncCheckpointLayer:
use apollo_router::layers::async_checkpoint_layer::AsyncCheckpointLayer;

AsyncCheckpointLayer::new(move |request: execution::Request| {
    let request_config = request_config.clone();
    // ...
})
.buffered()
.service(service)
.boxed()
The buffered() method is provided by the apollo_router::layers::ServiceBuilderExt trait and ensures that your service may be cloned.
Removed deprecated methods of Rust plugins
The following deprecated methods are removed from the public crate API available to Rust plugins:
- services::router::Response::map()
- SchemaSource::File.delay field
- ConfigurationSource::File.delay field
- context::extensions::sync::ExtensionsMutex::lock(). Use ExtensionsMutex::with_lock() instead.
- test_harness::TestHarness::build(). Use TestHarness::build_supergraph() instead.
- PluginInit::new(). Use PluginInit::builder() instead.
- PluginInit::try_new(). Use PluginInit::try_builder() instead.
Removed Jaeger tracing exporter
The jaeger exporter has been removed, as Jaeger now fully supports the OTLP format.
Upgrade step:
- Change your router config to use the otlp exporter:
telemetry:
  exporters:
    tracing:
      propagation:
        jaeger: true
      otlp:
        enabled: true
- Ensure that you have enabled OTLP support in your Jaeger instance using COLLECTOR_OTLP_ENABLED=true and exposing ports 4317 and 4318 for gRPC and HTTP, respectively.
Emitting custom metrics
Rust plugins can no longer use the router's internal metrics system via tracing macros. Consequently, tracing field names that start with the following strings aren't interpreted as macros for router metrics:
- counter.
- histogram.
- monotonic_counter.
- value.
Upgrade step: instead of using tracing macros, use OpenTelemetry crates. You can use the new apollo_router::metrics::meter_provider() API to access the router's global meter provider to register your instruments.
Removed --schema CLI argument
The deprecated --schema command-line argument is removed in router v2.x.
Upgrade step: replace uses of --schema with router config schema to print the configuration schema.
Removed automatically updating configuration at runtime
The ability to automatically upgrade configurations at runtime is removed. Previously, during configuration parsing and validation, the router's upgrade migrations were applied automatically to generate a valid runtime representation of the config for the life of the executing process.
Automatic configuration upgrades can still be applied explicitly.
Upgrade step: use the router config commands as shown at the top of the upgrade guide.
Configuration changes
The following describes changes to router configuration, including renamed options and changed default values.
Renamed metrics
Various metrics in router v2.x have been renamed to conform to the OpenTelemetry convention of using . as the namespace separator, instead of _.
Upgrade step: use the updated names for the following metrics:
Previous metric | Renamed metric |
---|---|
apollo_router_opened_subscriptions | apollo.router.opened.subscriptions |
apollo_router_cache_hit_time | apollo.router.cache.hit.time |
apollo_router_cache_size | apollo.router.cache.size |
apollo_router_cache_miss_time | apollo.router.cache.miss.time |
apollo_router_state_change_total | apollo.router.state.change.total |
apollo_router_span_lru_size | apollo.router.exporter.span.lru.size * |
apollo_router_session_count_active | apollo.router.session.count.active |
apollo_router_uplink_fetch_count_total | apollo.router.uplink.fetch.count.total |
apollo_router_uplink_fetch_duration_seconds | apollo.router.uplink.fetch.duration.seconds |
* apollo.router.exporter.span.lru.size now also has an additional exporter prefix.
Changed trace default
In router v2.x, the telemetry.instrumentation.spans.mode option has a default value of spec_compliant. Previously in router v1.x, its default value was deprecated.
Changed defaults of GraphOS reporting metrics
Default values of some GraphOS reporting metrics have been changed from v1.x to the following in v2.x:
- telemetry.apollo.signature_normalization_algorithm now defaults to enhanced. (In v1.x the default is legacy.)
- telemetry.apollo.metrics_reference_mode now defaults to extended. (In v1.x the default is standard.)
Renamed configuration for Apollo operation usage reporting via OTLP
The router supports reporting operation usage metrics to GraphOS via OpenTelemetry Protocol (OTLP).
Prior to version 1.49.0 of the router, all GraphOS reporting was performed using a private tracing format. In v1.49.0, we introduced support for using OTel to perform this reporting. In v1.x, this is controlled using the otlp_tracing_sampler flag (named experimental_otlp_tracing_sampler prior to v1.61), and it's off by default.
In v2.x, this flag is named otlp_tracing_sampler, and it's enabled by default.
Upgrade step: in your router config, replace uses of experimental_otlp_tracing_sampler with otlp_tracing_sampler.
Learn more about configuring usage reporting via OTLP.
Renamed context keys
The router request context is used to share data across stages of the request pipeline. The keys have been renamed to prevent conflicts and to better indicate which pipeline stage or plugin populates the data.
To keep sending the previous key names to coprocessors, you can set context: deprecated in your router configuration. For details, see Context configuration.
Upgrade step: if you access context entries in a custom plugin, Rhai script, coprocessor, or telemetry selector, update your context keys to account for the new names:
Previous context key name | New context key name |
---|---|
apollo_authentication::JWT::claims | apollo::authentication::jwt_claims |
apollo_authorization::authenticated::required | apollo::authorization::authentication_required |
apollo_authorization::scopes::required | apollo::authorization::required_scopes |
apollo_authorization::policies::required | apollo::authorization::required_policies |
apollo_operation_id | apollo::supergraph::operation_id |
apollo_override::unresolved_labels | apollo::progressive_override::unresolved_labels |
apollo_override::labels_to_override | apollo::progressive_override::labels_to_override |
apollo_router::supergraph::first_event | apollo::supergraph::first_event |
apollo_telemetry::client_name | apollo::telemetry::client_name |
apollo_telemetry::client_version | apollo::telemetry::client_version |
apollo_telemetry::studio::exclude | apollo::telemetry::studio_exclude |
apollo_telemetry::subgraph_ftv1 | apollo::telemetry::subgraph_ftv1 |
cost.actual | apollo::demand_control::actual_cost |
cost.estimated | apollo::demand_control::estimated_cost |
cost.result | apollo::demand_control::result |
cost.strategy | apollo::demand_control::strategy |
experimental::expose_query_plan.enabled | apollo::expose_query_plan::enabled |
experimental::expose_query_plan.formatted_plan | apollo::expose_query_plan::formatted_plan |
experimental::expose_query_plan.plan | apollo::expose_query_plan::plan |
operation_kind | apollo::supergraph::operation_kind |
operation_name | apollo::supergraph::operation_name |
persisted_query_hit | apollo::apq::cache_hit |
persisted_query_register | apollo::apq::registered |
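If you maintain custom code that reads these keys, a lookup table can make the rename easier to apply consistently. The sketch below copies the pairs from the table above into a constant and offers lookups in both directions; `CONTEXT_KEY_RENAMES`, `to_v2_key`, and `to_v1_key` are hypothetical names, and this is not how the router itself performs the translation.

```rust
/// Renames taken from the table above: (v1.x key, v2.x key).
pub const CONTEXT_KEY_RENAMES: &[(&str, &str)] = &[
    ("apollo_authentication::JWT::claims", "apollo::authentication::jwt_claims"),
    ("apollo_authorization::authenticated::required", "apollo::authorization::authentication_required"),
    ("apollo_authorization::scopes::required", "apollo::authorization::required_scopes"),
    ("apollo_authorization::policies::required", "apollo::authorization::required_policies"),
    ("apollo_operation_id", "apollo::supergraph::operation_id"),
    ("apollo_override::unresolved_labels", "apollo::progressive_override::unresolved_labels"),
    ("apollo_override::labels_to_override", "apollo::progressive_override::labels_to_override"),
    ("apollo_router::supergraph::first_event", "apollo::supergraph::first_event"),
    ("apollo_telemetry::client_name", "apollo::telemetry::client_name"),
    ("apollo_telemetry::client_version", "apollo::telemetry::client_version"),
    ("apollo_telemetry::studio::exclude", "apollo::telemetry::studio_exclude"),
    ("apollo_telemetry::subgraph_ftv1", "apollo::telemetry::subgraph_ftv1"),
    ("cost.actual", "apollo::demand_control::actual_cost"),
    ("cost.estimated", "apollo::demand_control::estimated_cost"),
    ("cost.result", "apollo::demand_control::result"),
    ("cost.strategy", "apollo::demand_control::strategy"),
    ("experimental::expose_query_plan.enabled", "apollo::expose_query_plan::enabled"),
    ("experimental::expose_query_plan.formatted_plan", "apollo::expose_query_plan::formatted_plan"),
    ("experimental::expose_query_plan.plan", "apollo::expose_query_plan::plan"),
    ("operation_kind", "apollo::supergraph::operation_kind"),
    ("operation_name", "apollo::supergraph::operation_name"),
    ("persisted_query_hit", "apollo::apq::cache_hit"),
    ("persisted_query_register", "apollo::apq::registered"),
];

/// Returns the v2.x name for a v1.x key, or the input unchanged when
/// the key is not in the table (for example, a custom key).
pub fn to_v2_key(key: &str) -> &str {
    CONTEXT_KEY_RENAMES
        .iter()
        .find(|(old, _)| *old == key)
        .map(|(_, new)| *new)
        .unwrap_or(key)
}

/// The reverse lookup: returns the v1.x name for a v2.x key.
pub fn to_v1_key(key: &str) -> &str {
    CONTEXT_KEY_RENAMES
        .iter()
        .find(|(_, new)| *new == key)
        .map(|(old, _)| *old)
        .unwrap_or(key)
}
```

Unknown keys pass through unchanged in both directions, so custom keys set by your own plugins are not affected.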
Context keys for coprocessors
The context key renames may impact your coprocessor logic. It can be tricky to update all context key usage together with the router upgrade. To help with this, the context option for coprocessors has been extended.
You can specify context: deprecated to send all context with the old names, compatible with v1.x. Context keys are translated to their v1.x names before being sent to the coprocessor, and translated back to the v2.x names after being received from the coprocessor.
You can now also specify exactly which context keys you wish to send to a coprocessor by listing them under the selective key. This will reduce the size of the request/response and may improve performance.
Upgrade step: either upgrade your coprocessor to use the new context keys, or add context: deprecated to your coprocessor configuration.
Example:
coprocessor:
  url: http://127.0.0.1:3000 # mandatory URL which is the address of the coprocessor
  router:
    request:
      context: false # Do not send any context entries
  supergraph:
    request:
      headers: true
      context: # Send only these 2 context keys to your coprocessor
        selective:
          - apollo::supergraph::operation_name
          - apollo::demand_control::actual_cost
      body: true
    response:
      headers: true
      context: all # Send all context keys with new names (2.x version)
      body: true
  subgraph:
    all:
      request:
        context: deprecated # Send all the context keys with deprecated names (1.x version)
The selective context keys feature cannot be used together with deprecated names.
Updated syntax for configuring supergraph endpoint path
The syntax for configuring the router to receive GraphQL requests at a specific URL path has been updated:
The syntax for named parameters was changed from a colon to braces:
supergraph:
  # Previously:
  # path: /foo/:bar/baz
  path: /foo/{bar}/baz
The syntax for wildcards was changed to require braces and a name:
supergraph:
  # Previously:
  # path: /foo/*
  path: /foo/{*rest}
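Both rewrites above are mechanical, so they can be applied with a small helper if you manage many configurations. The following is a minimal sketch that handles only the two forms shown (a :name segment and a bare * segment); the function name `upgrade_path` is hypothetical, and the wildcard name rest is an arbitrary choice.

```rust
/// Rewrites a v1.x-style path into the v2.x syntax described above:
/// a `:name` segment becomes `{name}` and a bare `*` segment becomes
/// `{*rest}`. Any other segment is kept as-is.
pub fn upgrade_path(path: &str) -> String {
    path.split('/')
        .map(|segment| {
            if let Some(name) = segment.strip_prefix(':') {
                // `{{` and `}}` are literal braces in a format string.
                format!("{{{}}}", name)
            } else if segment == "*" {
                "{*rest}".to_string()
            } else {
                segment.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join("/")
}
```

Paths that use neither form, such as /graphql, come back unchanged, so the helper is safe to run over every configured path.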
Changed syntax for header propagation path
In router v2.x, the path used for selecting data from a client request body for header propagation must comply with the JSONPath spec. This means a $ is now required to select the root element.
Upgrade step: in your router config, prefix your paths with a $ when selecting root elements. For example:
headers:
  all:
    request:
      - insert:
          name: from_app_name
          # Previously:
          # path: .extensions.metadata[0].app_name
          path: $.extensions.metadata[0].app_name
Functionality changes
Updated tower service pipeline
In router v1.x, a brand new tower::Service pipeline was built for every request, so Rust plugin hooks were called for every request. Now in router v2.x, the tower::Service pipeline is built once and cloned for every request.
Upgrade step: carefully audit how your Rust plugins store state in any tower services you add to the pipeline, because the tower service is now cloned for every request.
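To illustrate what to look for during that audit, here is a minimal sketch using plain structs as stand-ins for a service added by a plugin. It does not use the tower crate or any router API; the type and function names are hypothetical. It contrasts state stored directly in the service, which is copied on every clone, with state stored behind an Arc, which all clones share.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// State held directly in the struct is duplicated on every clone.
#[derive(Clone, Default)]
pub struct PerCloneCounter {
    seen: u64,
}

impl PerCloneCounter {
    pub fn call(&mut self) -> u64 {
        self.seen += 1;
        self.seen
    }
}

/// State held behind an `Arc` is shared by every clone.
#[derive(Clone, Default)]
pub struct SharedCounter {
    seen: Arc<AtomicU64>,
}

impl SharedCounter {
    pub fn call(&mut self) -> u64 {
        self.seen.fetch_add(1, Ordering::Relaxed) + 1
    }
}

/// Simulates "built once, cloned for every request" and returns the
/// count observed by the last request.
pub fn last_count_per_clone(requests: usize) -> u64 {
    let template = PerCloneCounter::default();
    let mut last = 0;
    for _ in 0..requests {
        let mut service = template.clone();
        last = service.call();
    }
    last
}

/// Same simulation with the shared counter.
pub fn last_count_shared(requests: usize) -> u64 {
    let template = SharedCounter::default();
    let mut last = 0;
    for _ in 0..requests {
        let mut service = template.clone();
        last = service.call();
    }
    last
}
```

After three simulated requests, the per-clone counter still reports 1 while the shared counter reports 3. Which behavior is correct depends on whether the state is meant to be per request or global.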
New capabilities
The following lists new capabilities in router v2.x that we recommend you use. These capabilities don't introduce breaking changes.
More granular logging with custom telemetry
Previously, router v1.x had an experimental experimental_when_header
feature to log requests and responses if a request header was set to a specific value. This feature provided very limited control:
telemetry:
  exporters:
    logging:
      # If one of these headers matches we will log supergraph and subgraphs requests/responses
      experimental_when_header: # NO LONGER SUPPORTED
        - name: apollo-router-log-request
          value: my_client
          headers: true # default: false
          body: true # default: false
In router v2.x, you can achieve much more granular logging using custom telemetry. The example below logs requests and responses at every stage of the request pipeline:
telemetry:
  instrumentation:
    events:
      router:
        request: # Display router request log
          level: info
          condition:
            eq:
              - request_header: apollo-router-log-request
              - my_client
        response: # Display router response log
          level: info
          condition:
            eq:
              - request_header: apollo-router-log-request
              - my_client
      supergraph:
        request: # Display supergraph request log
          level: info
          condition:
            eq:
              - request_header: apollo-router-log-request
              - my_client
        response: # Display supergraph response log
          level: info
          condition:
            eq:
              - request_header: apollo-router-log-request
              - my_client
      subgraph:
        request: # Display subgraph request log
          level: info
          condition:
            eq:
              - supergraph_request_header: apollo-router-log-request
              - my_client
        response: # Display subgraph response log
          level: info
          condition:
            eq:
              - supergraph_request_header: apollo-router-log-request
              - my_client
Improved traffic shaping
Traffic shaping has been improved significantly in router v2.x. We've added a new mechanism, concurrency control, and we've improved the router's ability to observe timeout and traffic shaping restrictions correctly. As a result, clients of the router may see an increase in errors as traffic shaping constraints are enforced.
We recommend that users experiment with their configuration in order to arrive at the right combination of timeout, concurrency and rate limit controls for their particular use case.
For more information, see the documentation on configuring traffic shaping.
Enforce introspection depth limit
To protect against abusive requests, the router enforces a depth limit on introspection queries by default.
Because the schema-introspection schema is recursive, a client can query fields of the types of some other fields at unbounded nesting levels, and this can produce responses that grow much faster than the size of the request. Consequently, the router by default refuses to execute introspection queries that nest list fields too deep and instead returns an error.
- The criteria match MaxIntrospectionDepthRule in graphql-js, but may change in future versions.
- In rare cases where the router rejects legitimate queries, you can configure the router to disable the limit by setting limits.introspection_max_depth: false. For example:
# Do not enable introspection in production!
supergraph:
  introspection: true # Without this, schema introspection is entirely disabled by default
limits:
  introspection_max_depth: false # Defaults to true
Enforce valid CORS configuration
Previously in router v1.x, invalid values in the CORS configuration, such as malformed regexes, were ignored with an error logged.
Now in router v2.x, such invalid values in the CORS configuration prevent the router from starting up and result in errors like the following:
could not create router: CORS configuration error:
Upgrade step: validate your CORS configuration. For details, go to the CORS configuration documentation.
Deploy your router
Make sure that you are referencing the correct router release: v2.0.0.
Reporting upgrade issues
If you encounter an upgrade issue that isn't resolved by this article, please search for existing Apollo Community posts and start a new post if you don't find what you're looking for.