Upgrading from Versions 1.x

Upgrade from version 1.x to 2.x of GraphOS Router


GraphOS Router v2.x includes various breaking changes when upgrading from v1.x, including removing deprecated features and renaming public interfaces to be more future-proof.

This upgrade guide describes the steps to upgrade your GraphOS Router deployment from version 1.x to 2.x. It describes breaking changes and how to resolve them. It also recommends new features to use.

Upgrade strategy

Before making any changes, auto-upgrade your configuration. This will remove options that already have no effect in v1.x, and make the rest of the upgrade easier.

Check the changes that will be applied using:

Bash
1router config upgrade --diff router.yaml

Then apply the changes using:

Bash
1router config upgrade router.yaml > router.next.yaml
2mv router.next.yaml router.yaml

Resource utilization changes

The 2.x release includes significant architectural improvements to enable support for backpressure. The router will now start rejecting requests when it is busy, instead of queueing them in memory. This change can cause changes in resource utilization, including increased CPU usage because the router can handle more requests.

During upgrade, carefully monitor logs and resource consumption to ensure that your router has successfully upgraded and that your router has enough resources to perform as expected.

Removals and deprecations

The following headings describe features that have been removed or deprecated in router v2.x. Alternatives to the removed or deprecated features are described, if available.

Removed metrics

Multiple metrics have been removed in router v2.x as part of evolving towards OpenTelemetry metrics and conventions. Each of the removed metrics listed below has a replacement metric or a method for deriving its value:

  • Removed apollo_router_http_request_retry_total. This is replaced by http.client.request.duration metric's http.request.resend_count attribute. Set default_requirement_level to recommended to make the router emit this attribute.

  • Removed apollo_router_timeout. This metric conflated timed-out requests from client to the router, and requests from the router to subgraphs. Timed-out requests have HTTP status code 504. Use the http.response.status_code attribute on the http.server.request.duration metric to identify timed-out router requests, and the same attribute on the http.client.request.duration metric to identify timed-out subgraph requests.

  • Removed apollo_router_http_requests_total. This is replaced by http.server.request.duration metric for requests from clients to router and http.client.request.duration for requests from router to subgraphs.

  • Removed apollo_router_http_request_duration_seconds. This is replaced by http.server.request.duration metric for requests from clients to router and http.client.request.duration for requests from router to subgraphs.

  • Removed apollo_router_session_count_total. This does not have an equivalent in 2.0.0, though one may be introduced in a point release.

  • Removed apollo_router_session_count_active. This is replaced by http.server.active_requests.

  • Removed apollo_require_authentication_failure_count. Use the http.server.request.duration metric's http.response.status_code attribute. Requests with authentication failures have HTTP status code 401.

  • Removed apollo_authentication_failure_count. Use the apollo.router.operations.authentication.jwt metric's authentication.jwt.failed attribute.

  • Removed apollo_authentication_success_count. Use the apollo.router.operations.authentication.jwt metric instead. If the authentication.jwt.failed attribute is absent or false, the authentication succeeded.

  • Removedapollo_router_deduplicated_subscriptions_total. Use the apollo.router.operations.subscriptions metric's subscriptions.deduplicated attribute.

  • Removed apollo_router_cache_miss_count. Cache miss count can be derived from apollo.router.cache.miss.time.

  • Removed apollo_router_cache_hit_count. Cache hit count can be derived from apollo.router.cache.hit.time.

Removed processing time metrics

Calculating the overhead of injecting the router into your service stack when making multiple downstream calls is a complex task. Due to the router being unable to get reliable calculations, the metrics apollo_router_span and apollo_router_processing_time have been removed.

Upgrade step: test your workloads with the router and validate that its latency meets your requirements.

Removed custom instrumentation selectors

The subgraph_response_body selector is removed in favor of subgraph_response_data and subgraph_response_errors.

Upgrade step: replace subgraph_response_body with subgraph_response_data and subgraph_response_errors. For example:

YAML
1telemetry:
2  instrumentation:
3    instruments:
4      subgraph:
5        http.client.request.duration:
6          attributes:
7            http.response.status_code:
8              subgraph_response_status: code
9            my_data_value:
10              # Previously:
11              # subgraph_response_body: .data.test
12              subgraph_response_data: $.test # The data object is the root object of this selector
13            my_error_code:
14              # Previously:
15              # subgraph_response_body: .errors[*].extensions.extra_code
16              subgraph_response_errors: $[*].extensions.extra_code # The errors object is the root object of this selector

Scaffold no longer supported for Rust plugin code generation

Support for the cargo-scaffold command to generate boilerplate source code for a Rust plugin has been removed in router v2.x.

Upgrade step: Source code generated using Scaffold will continue to compile, so existing Rust plugins will be unaffected by this change.

The configurable poll interval of Apollo Uplink has been removed in router v2.x.

Upgrade step: remove uses of both the --apollo-uplink-poll-interval command-line argument and the APOLLO_UPLINK_POLL_INTERVAL environment variable.

Removed hot reloading of supergraph URLs

Hot reloading is no longer supported for supergraph URLs configured via either the --supergraph-urls command-line argument or the APOLLO_ROUTER_SUPERGRAPH_URLS environment variable. In router v1.x, if hot reloading was enabled, the router would repeatedly fetch the URLs on the interval specified by --apollo-uplink-poll-interval. This poll interval has been removed in v2.x.

Upgrade step: if you want to hot reload from a remote URL, try running a script that downloads the supergraph URL at a periodic interval, then point the router to the downloaded file on the filesystem.

Removed busy timer for request processing duration

In context::Context that's typically used for router customizations, methods and structs related to request processing duration have been removed, because request processing duration is already included as part of spans sent by the router. Users customizing the router with Rhai scripts, Rust scripts, or coprocessors don't need to track this information manually.

Upgrade step: remove calls and uses of the following methods and structs from context::Context:

  • context::Context::busy_time()

  • context::Context::enter_active_request()

  • context::BusyTimer struct

  • context::BusyTimerGuard struct

Removed OneShotAsyncCheckpointLayer and .oneshot_checkpoint_async()

Both OneShotAsyncCheckpointLayer and .oneshot_checkpoint_async() are removed as part of architectural optimizations in router v2.x.

Upgrade step:

  • Replace uses of apollo_router::layers::ServiceBuilderExt::oneshot_checkpoint_async with the checkpoint_async method.

  • Replace uses of OneShotAsyncCheckpointLayer with AsyncCheckpointLayer. For example:

Previous plugin code using OneShotAsyncCheckpointLayer:

Rust
1OneShotAsyncCheckpointLayer::new(move |request: execution::Request| {
2    let request_config = request_config.clone();
3    // ...
4})
5.service(service)
6.boxed()

New plugin code using AsyncCheckpointLayer:

Rust
1use apollo_router::layers::async_checkpoint_layer::AsyncCheckpointLayer;
2
3AsyncCheckpointLayer::new(move |request: execution::Request| {
4    let request_config = request_config.clone();
5    // ...
6})
7.buffered()
8.service(service)
9.boxed()
note
The buffered() method is provided by the apollo_router::layers::ServiceBuilderExt trait and ensures that your service may be cloned.

Removed deprecated methods of Rust plugins

The following deprecated methods are removed from the public crate API available to Rust plugins:

  • services::router::Response::map()

  • SchemaSource::File.delay field

  • ConfigurationSource::File.delay field

  • context::extensions::sync::ExtensionsMutex::lock(). Use ExtensionsMutex::with_lock() instead.

  • test_harness::TestHarness::build(). Use TestHarness::build_supergraph() instead.

  • PluginInit::new(). Use PluginInit::builder() instead.

  • PluginInit::try_new(). Use PluginInit::try_builder() instead.

Removed Jaeger tracing exporter

The jaeger exporter has been removed, as Jaeger now fully supports the OTLP format.

Upgrade step:

  • Change your router config to use the otlp exporter:

YAML
router.yaml
1telemetry:
2  exporters:
3    tracing:
4      propagation:
5        jaeger: true
6      otlp:
7        enabled: true
  • Ensure that you have enabled OTLP support in your Jaeger instance using COLLECTOR_OTLP_ENABLED=true and exposing ports 4317 and 4318 for gRPC and HTTP, respectively.

Emitting custom metrics

Rust plugins can no longer use the router's internal metrics system via tracing macros. Consequently, tracing field names that start with the following strings aren't interpreted as macros for router metrics:

  • counter.

  • histogram.

  • monotonic_counter.

  • value.

Upgrade step: instead of using tracing macros , use OpenTelemetry crates. You can use the new apollo_router::metrics::meter_provider() API to access the router's global meter provider to register your instruments.

note
The router v2.x logs an error for each legacy metric field it detects in a tracing event.

Removed --schema CLI argument

The deprecated --schema command-line argument is removed in router v2.x

Upgrade step: replace uses of --schema with router config schema to print the configuration supergraph.

Removed automatically updating configuration at runtime

The ability to automatically upgrade configurations at runtime is removed. Previously, during configuration parsing/validation, the router 'upgrade migrations' would be applied automatically to generate a valid runtime representation of a config for the life of the executing process.

Automatic configuration upgrades can still be applied explicitly.

Upgrade step: use the router config commands as shown at the top of the upgrade guide.

Configuration changes

The following describes changes to router configuration, including renamed options and changed default values.

Renamed metrics

Various metrics in router 2.x have been renamed to conform to the OpenTelemetry convention of using . as the namespace separator, instead of _.

Update step: use the updated names for the following metrics:

Previous metricRenamed metric
apollo_router_opened_subscriptionsapollo.router.opened.subscriptions
apollo_router_cache_hit_timeapollo.router.cache.hit.time
apollo_router_cache_sizeapollo.router.cache.size
apollo_router_cache_miss_timeapollo.router.cache.miss.time
apollo_router_state_change_totalapollo.router.state.change.total
apollo_router_span_lru_sizeapollo.router.exporter.span.lru.size *
apollo_router_session_count_activeapollo.router.session.count.active
apollo_router_uplink_fetch_count_totalapollo.router.uplink.fetch.count.total
apollo_router_uplink_fetch_duration_secondsapollo.router.uplink.fetch.duration.seconds
note
* apollo.router.exporter.span.lru.size now also has an additional exporter prefix.

Changed trace default

In router v2.x, the trace telemetry.instrumentation.spans.mode has a default value of spec_compliant. Previously in router 1.x, its default value was deprecated.

Changed defaults of GraphOS reporting metrics

Default values of some GraphOS reporting metrics have been changed from v1.x to the following in v2.x:

  • telemetry.apollo.signature_normalization_algorithm now defaults to enhanced. (In v1.x the default is legacy.)

  • telemetry.apollo.metrics_reference_mode now defaults to extended. (In v1.x the default is standard.)

Renamed configuration for Apollo operation usage reporting via OTLP

The router supports reporting operation usage metrics to GraphOS via OpenTelemetry Protocol (OTLP).

Prior to version 1.49.0 of the router, all GraphOS reporting was performed using a private tracing format. In v1.49.0, we introduced support for using OTel to perform this reporting. In v1.x, this is controlled using the otlp_tracing_sampler (or experimental_otlp_tracing_sampler prior to v1.61) flag, and it's off by default.

Now in v2.x, this flag is renamed to otlp_tracing_sampler, and it's enabled by default.

Upgrade step: in your router config, replace uses of experimental_otlp_tracing_sampler to otlp_tracing_sampler.

Learn more about configuring usage reporting via OTLP.

Renamed context keys

The router request context is used to share data across stages of the request pipeline. The keys have been renamed to prevent conflicts and to better indicate which pipeline stage or plugin populates the data.

note
You can continue using deprecated context key names from router 1.x if you configure context: deprecated in your router. For details, see Context configuration.

Upgrade step: if you access context entries in a custom plugin, Rhai script, coprocessor, or telemetry selector, you can update your context keys to account for the new names:

Previous context key nameNew context key name
apollo_authentication::JWT::claimsapollo::authentication::jwt_claims
apollo_authorization::authenticated::requiredapollo::authorization::authentication_required
apollo_authorization::scopes::requiredapollo::authorization::required_scopes
apollo_authorization::policies::requiredapollo::authorization::required_policies
apollo_operation_idapollo::supergraph::operation_id
apollo_override::unresolved_labelsapollo::progressive_override::unresolved_labels
apollo_override::labels_to_overrideapollo::progressive_override::labels_to_override
apollo_router::supergraph::first_eventapollo::supergraph::first_event
apollo_telemetry::client_nameapollo::telemetry::client_name
apollo_telemetry::client_versionapollo::telemetry::client_version
apollo_telemetry::studio::excludeapollo::telemetry::studio_exclude
apollo_telemetry::subgraph_ftv1apollo::telemetry::subgraph_ftv1
cost.actualapollo::demand_control::actual_cost
cost.estimatedapollo::demand_control::estimated_cost
cost.resultapollo::demand_control::result
cost.strategyapollo::demand_control::strategy
experimental::expose_query_plan.enabledapollo::expose_query_plan::enabled
experimental::expose_query_plan.formatted_planapollo::expose_query_plan::formatted_plan
experimental::expose_query_plan.planapollo::expose_query_plan::plan
operation_kindapollo::supergraph::operation_kind
operation_nameapollo::supergraph::operation_name
persisted_query_hitapollo::apq::cache_hit
persisted_query_registerapollo::apq::registered

Context Keys for Coprocessors

The context key renames may impact your coprocessor logic. It can be tricky to update all context key usage together with the router upgrade. To aid this, the context option for Coprocessors has been extended.

You can specify context: deprecated to send all context with the old names, compatible with v1.x. Context keys are translated to their v1.x names before being sent to the coprocessor, and translated back to the v2.x names after being received from the coprocessor.

You can now also specify exactly which context keys you wish to send to a coprocessor by listing them under the selective key. This will reduce the size of the request/response and may improve performance.

Upgrade step: Either upgrade your coprocessor to use the new context keys, or add context: deprecated to your coprocessor configuration.

Example:

YAML
1coprocessor:
2  url: http://127.0.0.1:3000 # mandatory URL which is the address of the coprocessor
3  router:
4    request:
5      context: false # Do not send any context entries
6  supergraph:
7    request:
8      headers: true
9      context: # Send only these 2 context keys to your coprocessor
10        selective:
11          - apollo::supergraph::operation_name
12          - apollo::demand_control::actual_cost
13      body: true
14    response:
15      headers: true
16      context: all # Send all context keys with new names (2.x version)
17      body: true
18  subgraph:
19    all:
20      request:
21        context: deprecated # Send all the context keys with deprecated names (1.x version)
note
The selective context keys feature can not be used together with deprecated names.

Updated syntax for configuring supergraph endpoint path

The syntax for configuring the router to receive GraphQL requests at a specific URL path has been updated:

  • The syntax for named parameters was changed from a colon to braces:

YAML
1supergraph:
2  # Previously:
3  # path: /foo/:bar/baz
4  path: /foo/{bar}/baz
  • The syntax for wildcards was changed to require braces and a name:

YAML
1supergraph:
2  # Previously:
3  # path: /foo/*
4  path: /foo/{*rest}
note
No syntax changes are required when using the default endpoint path or a path without wildcards.

Changed syntax for header propagation path

In router v2.x, the path used for selecting data from a client request body for header propagation must comply with the JSONPath spec. This means a $ is now required to select the root element.

Upgrade step: in your router config, prefix your paths with a $ when selecting root elements. For example:

YAML
1headers:
2  all:
3    request:
4      - insert:
5          name: from_app_name
6          # Previously:
7          # path: .extensions.metadata[0].app_name
8          path: $.extensions.metadata[0].app_name

Functionality changes

Updated tower service pipeline

In router v1.x, a brand new tower::Service pipeline was built for every request, so Rust plugin hooks were called for every request. Now in router v2.x, the tower::Service pipeline is built once and cloned for every request.

Upgrade step: carefully audit how your Rust plugins store state in any tower services you add to the pipeline, because the tower service is now cloned for every request.

New capabilities

The following lists new capabilities in router v2.x that we recommend you use. These capabilities don't introduce breaking changes.

More granular logging with custom telemetry

Previously, router v1.x had an experimental experimental_when_header feature to log requests and responses if a request header was set to a specific value. This feature provided very limited control:

YAML
router.previous.yaml
1telemetry:
2  exporters:
3    logging:
4      # If one of these headers matches we will log supergraph and subgraphs requests/responses
5      experimental_when_header: # NO LONGER SUPPORTED
6        - name: apollo-router-log-request
7          value: my_client
8          headers: true # default: false
9          body: true # default: false

In router v2.x, you can achieve much more granular logging using custom telemetry. The example below logs requests and responses at every stage of the request pipeline:

YAML
router.yaml
1telemetry:
2  instrumentation:
3    events:
4      router:
5        request: # Display router request log
6          level: info
7          condition:
8            eq:
9              - request_header: apollo-router-log-request
10              - my_client
11        response: # Display router response log
12          level: info
13          condition:
14            eq:
15              - request_header: apollo-router-log-request
16              - my_client
17      supergraph:
18        request: # Display supergraph request log
19          level: info
20          condition:
21            eq:
22              - request_header: apollo-router-log-request
23              - my_client
24        response:
25          level: info
26          condition:
27            eq:
28              - request_header: apollo-router-log-request
29              - my_client
30      subgraph:
31        request: # Display subgraph request log
32          level: info
33          condition:
34            eq:
35              - supergraph_request_header: apollo-router-log-request
36              - my_client
37        response: # Display subgraph response log
38          level: info
39          condition:
40            eq:
41              - supergraph_request_header: apollo-router-log-request
42              - my_client

Improved traffic shaping

Traffic shaping has been improved significantly in router v2.x. We've added a new mechanism, concurrency control, and we've improved the router's ability to observe timeout and traffic shaping restrictions correctly. These improvements do mean that clients of the router may see an increase in errors as traffic shaping constraints are enforced:

We recommend that users experiment with their configuration in order to arrive at the right combination of timeout, concurrency and rate limit controls for their particular use case.

For more information about configuring traffic shaping.

Enforce introspection depth limit

To protect against abusive requests, the router enforces a depth limit on introspection queries by default.

Because the schema-introspection schema is recursive, a client can query fields of the types of some other fields at unbounded nesting levels, and this can produce responses that grow much faster than the size of the request. Consequently, the router by default refuses to execute introspection queries that nest list fields too deep and instead returns an error.

note
  • The criteria matches MaxIntrospectionDepthRule in graphql-js, but may change in future versions.
  • In rare cases where the router rejects legitimate queries, you can configure the router to disable the limit by setting limits.introspection_max_depth: false. For example:
YAML
1# Do not enable introspection in production!
2supergraph:
3  introspection: true # Without this, schema introspection is entirely disabled by default
4limits:
5  introspection_max_depth: false # Defaults to true

Enforce valid CORS configuration

Previously in router v1.x, invalid values in the CORS configuration, such as malformed regexes, were ignored with an error logged.

Now in router 2.x, such invalid values in the CORS configuration prevent the router from starting up and result in errors like the following:

Text
1could not create router: CORS configuration error:

Upgrade step**: Validate your CORS configuration. For details, go to CORS configuration documentation.

Deploy your router

Make sure that you are referencing the correct router release: v2.0.0

Reporting upgrade issues

If you encounter an upgrade issue that isn't resolved by this article, please search for existing Apollo Community posts and start a new post if you don't find what you're looking for.

Feedback

Edit on GitHub

Forums