Traffic Shaping
Tune the performance and reliability of traffic to and from the router
The GraphOS Router and Apollo Router Core provide various features to improve the performance and reliability of the traffic between the client and router and between the router and subgraphs.
Configuration
By default, the traffic_shaping plugin is enabled with preset values. To override presets, add traffic_shaping to your YAML config file like so:
traffic_shaping:
  router: # Rules applied to requests from clients to the router
    global_rate_limit: # Accept a maximum of 10 requests per 5 secs. Excess requests must be rejected.
      capacity: 10
      interval: 5s # Must not be greater than 18_446_744_073_709_551_615 milliseconds and not less than 0 milliseconds
    timeout: 50s # If a request to the router takes more than 50secs then cancel the request (30 sec by default)
  all:
    deduplicate_query: true # Enable query deduplication for all subgraphs.
    compression: br # Enable brotli compression for all subgraphs.
  subgraphs: # Rules applied to requests from the router to individual subgraphs
    products:
      deduplicate_query: false # Disable query deduplication for the products subgraph.
      compression: gzip # Enable gzip compression only for the products subgraph.
      global_rate_limit: # Accept a maximum of 10 requests per 5 secs from the router. Excess requests must be rejected.
        capacity: 10
        interval: 5s # Must not be greater than 18_446_744_073_709_551_615 milliseconds and not less than 0 milliseconds
      timeout: 50s # If a request to the subgraph 'products' takes more than 50secs then cancel the request (30 sec by default)
      experimental_retry:
        min_per_sec: 10 # minimal number of retries per second (`min_per_sec`, default is 10 retries per second)
        ttl: 10s # for each successful request, we register a token, that expires according to this option (default: 10s)
        retry_percent: 0.2 # defines the proportion of available retries to the current number of tokens
        retry_mutations: false # allows retries on mutations. This should only be enabled if mutations are idempotent
      experimental_http2: enable # Configures HTTP/2 usage. Can be 'enable' (default), 'disable' or 'http2only'
Preset values
The preset values of traffic_shaping that are enabled by default:
- timeout: 30s for all timeouts
- experimental_http2: enable
Client side traffic shaping
Rate limiting
The router can apply rate limiting on client requests, as follows:
traffic_shaping:
  router: # Rules applied to requests from clients to the router
    global_rate_limit: # Accept a maximum of 10 requests per 5 secs. Excess requests must be rejected.
      capacity: 10
      interval: 5s # Must not be greater than 18_446_744_073_709_551_615 milliseconds and not less than 0 milliseconds
This rate limit applies to all requests; there is no filtering per IP or other criteria.
Timeouts
The router applies a default timeout of 30 seconds for all requests, including the following:
- Requests the client makes to the router
- Requests the router makes to subgraphs
- Initial requests subgraphs make to the router for subscription callbacks
For subscription callbacks, the timeout only applies to the initial request to the router. Once the subscription has been established, the request duration can exceed the timeout.
You can change the default timeout for client requests to the router like so:
traffic_shaping:
  router:
    timeout: 50s # If client requests to the router take more than 50 seconds, cancel the request (30 seconds by default)
You can change the default timeout for all requests between the router and subgraphs like so:
traffic_shaping:
  all:
    timeout: 50s # If subgraph requests take more than 50 seconds, cancel the request (30 seconds by default)
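The full configuration example earlier on this page also sets a timeout for a single subgraph under subgraphs. The fragment below is a minimal sketch of combining the two levels. It assumes a subgraph named products and illustrative durations, and it assumes that the per-subgraph value takes precedence over the value under all, in the same way as the other per-subgraph overrides:

```yaml
traffic_shaping:
  all:
    timeout: 10s # Illustrative value applied to requests to every subgraph
  subgraphs:
    products: # Assumed subgraph name
      timeout: 50s # Illustrative override: allow slower responses from products only
```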
Compression
Compression is automatically supported on the client side, depending on the Accept-Encoding header provided by the client.
Query batching
The router has support for receiving client query batches:
batching:
  enabled: true
  mode: batch_http_link
For details, see query batching for the router.
Subgraph traffic shaping
The router supports various options affecting traffic destined for subgraphs. Each option can be defined for all subgraphs or overridden per subgraph:
traffic_shaping:
  all:
    deduplicate_query: true # Enable query deduplication for all subgraphs.
  subgraphs: # Rules applied to requests from the router to individual subgraphs
    products:
      deduplicate_query: false # Disable query deduplication for the products subgraph.
Compression
The router can compress request bodies to subgraphs (along with response bodies to clients).
It currently supports these algorithms: gzip, br, and deflate.
traffic_shaping:
  all:
    compression: gzip # Enable gzip compression for all subgraphs.
Subgraph response decompression is always supported for these algorithms: gzip, br, and deflate.
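Request compression can also differ between subgraphs. The fragment below is a sketch of a hypothetical setup in which most subgraph servers accept br but one subgraph, assumed here to be named products, needs gzip instead:

```yaml
traffic_shaping:
  all:
    compression: br # Compress request bodies with Brotli for every subgraph
  subgraphs:
    products: # Assumed subgraph name; its server is assumed not to accept br
      compression: gzip # Override: use gzip for this subgraph only
```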
Brotli (br) compression is not supported by Apollo Server, because its underlying Express.js does not support it out of the box. Therefore, don't configure br compression for traffic shaping when using Apollo Server as a subgraph server with the router.
Rate limiting
Subgraph request rate limiting uses the same configuration as client rate limiting, and is calculated per subgraph, not per backend host.
traffic_shaping:
  all:
    global_rate_limit: # Accept a maximum of 10 requests per 5 secs. Excess requests must be rejected.
      capacity: 10
      interval: 5s # Must not be greater than 18_446_744_073_709_551_615 milliseconds and not less than 0 milliseconds
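Because the limit is calculated per subgraph, a limit set under all is expected to apply to each subgraph separately rather than to all subgraph traffic combined. The fragment below sketches a tighter limit for one subgraph. The subgraph name products and all numbers are illustrative assumptions:

```yaml
traffic_shaping:
  all:
    global_rate_limit: # Illustrative: at most 100 requests per second to each subgraph
      capacity: 100
      interval: 1s
  subgraphs:
    products: # Assumed subgraph name
      global_rate_limit: # Illustrative: at most 10 requests per 5 seconds to products
        capacity: 10
        interval: 5s
```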
Experimental request retry
On failure, subgraph requests can be retried automatically. Retries are deactivated by default for mutations. The router uses Finagle's RetryBudget algorithm, in which every successful request adds an expirable token to a bucket and every retry consumes a number of those tokens. In addition, a minimum number of retries per second is always available, so the router can still probe the subgraph when the retry budget is entirely consumed, or on startup when very few requests have been sent. Because tokens expire, the budget offers many retries when a lot of recent requests succeeded but shrinks quickly under frequent failures, which avoids sending too much traffic to the subgraph.
It is configurable as follows:
traffic_shaping:
  all:
    experimental_retry:
      min_per_sec: 10 # minimal number of retries per second (`min_per_sec`, default is 10 retries per second)
      ttl: 10s # for each successful request, we register a token, that expires according to this option (default: 10s)
      retry_percent: 0.2 # defines the proportion of available retries to the current number of tokens
      retry_mutations: false # allows retries on mutations. This should only be enabled if mutations are idempotent
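Retries can also be configured for an individual subgraph, as the full example at the top of this page does. The fragment below is a sketch that assumes a subgraph named products whose mutations are all known to be idempotent. The values are illustrative, not recommendations. Under a Finagle-style budget, retry_percent: 0.2 would allow roughly one retry for every five recent successful requests, in addition to the min_per_sec floor:

```yaml
traffic_shaping:
  subgraphs:
    products: # Assumed subgraph name
      experimental_retry:
        min_per_sec: 10 # Floor of retries per second, available even with an empty budget
        ttl: 10s # Tokens from successful requests expire after 10 seconds
        retry_percent: 0.2 # Roughly one retry per five unexpired tokens
        retry_mutations: true # Only safe if every mutation on this subgraph is idempotent
```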
Variable deduplication
When the router sends entity requests to subgraphs using the _entities field, the same entity (identified by a unique @key constraint) is often requested multiple times within the execution of a single federated query. For example, an author's name might need to be fetched multiple times when accessing a list of reviews for a product for which the author has written multiple reviews.
To reduce the size of subgraph requests and the amount of work they might perform, the list of entities sent can be deduplicated. This is always active.
Query deduplication
If the router is simultaneously processing similar queries, it may produce multiple identical requests to a subgraph. With deduplicate_query enabled (it is disabled by default), the router avoids sending the same query multiple times: it buffers the dependent queries pending the result of the first, then reuses that result to fulfill all of the initial queries. This reduces the overall traffic to the subgraph and the overall client request latency. To qualify for deduplication, the subgraph queries must have the same HTTP path, headers, and body:
traffic_shaping:
  all:
    deduplicate_query: true # Enable query deduplication for all subgraphs.
HTTP/2
The router supports subgraph connections over:
- HTTP/1.1
- HTTP/1.1 with TLS
- HTTP/2 with TLS
- HTTP/2 Cleartext protocol (h2c). This uses HTTP/2 over plaintext connections.
Use the table below to look up the resulting protocol of a subgraph connection, based on the subgraph URL and the experimental_http2 configuration:
Configuration | URL with http:// | URL with https:// |
---|---|---|
experimental_http2: disable | HTTP/1.1 | HTTP/1.1 with TLS |
experimental_http2: enable | HTTP/1.1 | Either HTTP/1.1 or HTTP/2 with TLS, as determined by the TLS handshake |
experimental_http2: http2only | h2c | HTTP/2 with TLS |
experimental_http2 not set | HTTP/1.1 | Either HTTP/1.1 or HTTP/2 with TLS, as determined by the TLS handshake |
Configuring experimental_http2: http2only for a subgraph that doesn't support HTTP/2 results in a failed subgraph connection.
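As a sketch of forcing HTTP/2 for a single subgraph, the fragment below assumes a subgraph named products whose server is known to accept HTTP/2. Per the table above, the connection would use h2c if the subgraph URL starts with http:// and HTTP/2 with TLS if it starts with https://:

```yaml
traffic_shaping:
  subgraphs:
    products: # Assumed subgraph name; its server must accept HTTP/2
      experimental_http2: http2only # Connections fail if the server only speaks HTTP/1.1
```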
Ordering
Traffic shaping always executes these steps in the same order, to ensure a consistent behaviour. Declaration order in the configuration will not affect the runtime order:
- preparing the subgraph request
- variable deduplication
- query deduplication
- timeout
- request retry
- rate limiting
- compression
- sending the request to the subgraph