Overview
We've determined the main issue in our system is high traffic and latency in subgraphs (especially accounts). But without deploying many more instances of our services, or equipping them with faster CPUs, how do we improve performance?
In this lesson, we will:
- Learn about query deduplication and how it aids performance
- Enable query deduplication in our config file
Query deduplication
The router has a lot of configuration options, and some of them can be used to improve performance without touching the subgraphs. The most effective way to improve subgraph performance is to reduce the number of requests the subgraph will receive.
Rate limiting is one solution, but it should be a last resort because it increases the error rate clients will experience.
The router has a smarter mechanism we can employ: query deduplication. Query deduplication is the process of reducing the number of identical queries (removing the duplicates) while still resolving the data requested.
In a federated graph, the large majority of subgraph queries will be read-only, through the _entities query. And some subgraphs tend to receive the same queries over and over. The router uses this fact to aggregate multiple identical subgraph queries into a single query, whose result the router copies to all the original requests.
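The aggregation described above can be illustrated with a minimal Python sketch. This is not the router's implementation: the `DedupingClient` class, the `fake_subgraph` function, and the choice to key in-flight work on the query string alone are all assumptions made for illustration.

```python
import asyncio


class DedupingClient:
    """Hypothetical helper: coalesces identical in-flight queries into one upstream call."""

    def __init__(self, fetch):
        self._fetch = fetch    # async callable: query string -> result
        self._in_flight = {}   # query string -> asyncio.Task currently running

    async def execute(self, query):
        task = self._in_flight.get(query)
        if task is None:
            # No identical query is running: start one and remember it.
            task = asyncio.ensure_future(self._fetch(query))
            self._in_flight[query] = task
            # Forget the entry once it finishes, so later requests trigger a fresh call.
            task.add_done_callback(lambda _: self._in_flight.pop(query, None))
        # The first caller and every duplicate await the same task.
        return await task


async def demo():
    calls = 0

    async def fake_subgraph(query):
        # Stand-in for a subgraph that needs 100 ms to answer.
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.1)
        return {"data": {"topProducts": []}}

    client = DedupingClient(fake_subgraph)
    query = "query GetTopProducts { topProducts { id name } }"

    async def delayed(delay_s):
        await asyncio.sleep(delay_s)
        return await client.execute(query)

    # Three clients, 30 ms apart, all asking for the same data.
    results = await asyncio.gather(delayed(0), delayed(0.03), delayed(0.06))
    return calls, results


calls, results = asyncio.run(demo())
print(calls)  # number of calls that actually reached the fake subgraph
```

In this sketch all three callers receive the same result while the fake subgraph is invoked only once. Error handling and cancellation are left out to keep the idea visible.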
Let's say we have the following subgraph query to the products subgraph:

    query GetTopProducts {
      topProducts {
        id
        name
      }
    }

Let's assume that this query takes 100 milliseconds to execute in total, and our products subgraph executes queries one-by-one in order.
If three clients send this request to our router one after another (with, for example, 30 milliseconds separating each request), then the products subgraph will receive three queries to execute. The subgraph starts executing the first query as soon as it arrives; the second query arrives 30 milliseconds later and the third 30 milliseconds after that, and each one waits until the queries ahead of it have finished.
Without query deduplication, we would see:
Stage | Result | Total latency |
---|---|---|
The first client sends the query at time T | Client receives the response at T+100ms | Latency of 100ms |
The second client sends the query at time T+30ms | Client receives the response at T+200ms | Latency of 170ms |
The third client sends the query at time T+60ms | Client receives the response at T+300ms | Latency of 240ms |
In this case, we'll see increased latency over time when multiple clients need data from a single subgraph.
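The latencies in the table above follow from simple queueing arithmetic. Here is a short Python sketch that reproduces them, assuming (as in the example) a subgraph that handles one query at a time in arrival order; the function name `fifo_latencies` is made up for illustration.

```python
def fifo_latencies(arrivals_ms, service_ms):
    """Latency per request when the subgraph runs one query at a time, in arrival order."""
    latencies = []
    free_at = 0  # time at which the subgraph finishes the work it already has
    for arrival in arrivals_ms:
        start = max(arrival, free_at)  # wait if the subgraph is still busy
        free_at = start + service_ms
        latencies.append(free_at - arrival)
    return latencies


print(fifo_latencies([0, 30, 60], 100))  # [100, 170, 240]
```

Each extra duplicate request adds a full 100 ms of work to the queue, which is why latency keeps climbing.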
With query deduplication, the router performs an extra step before sending a query to the subgraph: it first checks whether the subgraph is already executing the same query as part of an earlier request. If so, the router waits for the subgraph to return the results of the earlier query. The router then copies the results to the subsequent query and returns them.
With this process, we would instead see:
Stage | Result | Total latency |
---|---|---|
The first client sends the query at time T | Client receives the response at T+100ms | Latency of 100ms |
The second client sends the query at time T+30ms | Client receives the response at T+100ms | Latency of 70ms |
The third client sends the query at time T+60ms | Client receives the response at T+100ms | Latency of 40ms |
Latency decreases, and more importantly, our subgraph service received just one request.
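The same arithmetic can be written out for the deduplicated case. The Python sketch below assumes all requests carry an identical query and arrive in time order; `dedup_latencies` is a hypothetical name, and the model only captures the timing described in this lesson, not the router's internals.

```python
def dedup_latencies(arrivals_ms, service_ms):
    """Return (latencies, subgraph_request_count) for identical queries with deduplication."""
    latencies = []
    subgraph_requests = 0
    done_at = None  # completion time of the query currently in flight, if any
    for arrival in arrivals_ms:
        if done_at is None or arrival >= done_at:
            # Nothing identical is in flight: send a new query to the subgraph.
            subgraph_requests += 1
            done_at = arrival + service_ms
        # Duplicates simply wait for the in-flight query to complete.
        latencies.append(done_at - arrival)
    return latencies, subgraph_requests


print(dedup_latencies([0, 30, 60], 100))  # ([100, 70, 40], 1)
```

A request that arrives after the in-flight query has completed, for example at 150 ms, is sent to the subgraph as a new query, because there is nothing left to piggyback on.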
Enabling query deduplication
We'll start by activating query deduplication for our accounts subgraph. Hop back over into your router.yaml file.
We'll continue working under the traffic_shaping property in our configuration. Under the config we've added for accounts, add a new property called deduplicate_query at the same level of indentation as global_rate_limit. We'll set this property to true.

    traffic_shaping:
      router:
        timeout: 1s
        global_rate_limit:
          capacity: 800
          interval: 1s
      all:
        timeout: 500ms
      subgraphs:
        accounts:
          timeout: 100ms
          global_rate_limit:
            capacity: 500
            interval: 1s
          deduplicate_query: true

We now see the same effect as when we added the timeout on the accounts subgraph: latency increases for other subgraphs (and client request latency increases too).
But now, all of the requests to accounts succeed with status 200.
Maybe we could activate it on more subgraphs? Let's apply it to products too.

    traffic_shaping:
      router:
        timeout: 1s
        global_rate_limit:
          capacity: 800
          interval: 1s
      all:
        timeout: 500ms
      subgraphs:
        accounts:
          timeout: 100ms
          global_rate_limit:
            capacity: 500
            interval: 1s
          deduplicate_query: true
        products:
          deduplicate_query: true

We see p50, p75 and p90 latencies go down for requests to the products subgraph, but also for reviews and inventory.
This is looking much more stable. The highest latency for accounts and products is still around 1s, though. Some subgraph requests can still take a long time, even with deduplication.
Key takeaways
- Another way of reducing the number of requests a subgraph receives is to deduplicate identical requests.
- By deduplicating requests, the router can send a single query to the subgraph and copy its response to the duplicate queries.
Up next
Deduplicating identical requests has helped a lot, but there's more to the story when a query reaches the router. Next up, we'll talk about the query planning process and where we can make some tweaks.