7. Query deduplication

Overview

We've determined that the main issue in our system is high traffic and latency in our subgraphs (especially accounts). But without deploying many more instances of our services, or equipping them with faster CPUs, how do we improve performance?

In this lesson, we will:

  • Learn about deduplication and how it aids performance
  • Enable deduplication in our config file

Query deduplication

The router has a lot of configuration options, and some of them can be used to improve performance without touching the subgraphs. The most effective way to improve subgraph performance is to reduce the number of requests the subgraph will receive.

Rate limiting is one solution, but it should be a last resort as it increases the error rate clients will experience.

The router has a smarter mechanism we can employ: query deduplication. Query deduplication is the process of reducing the number of identical queries (removing the duplicates) while still resolving the data requested.

In a federated graph, the large majority of queries will be read-only, through the _entities field. And some subgraphs tend to receive the same queries over and over. The router uses this fact to aggregate multiple identical subgraph queries into a single query, whose result the router copies to all the original requests.

Let's say we have the following query to the products subgraph:

query GetTopProducts {
  topProducts {
    id
    name
  }
}

Let's assume that this query takes 100 milliseconds to execute in total, and our products subgraph executes queries one-by-one in order.

If three clients send this request to our router one after another (with, for example, 30 milliseconds separating each request), then the products subgraph will receive three queries to execute. Let's say that products receives the first query at time T, the second 30 milliseconds later, and the third 30 milliseconds after that. Because it executes queries one at a time, each query has to wait for the ones ahead of it to finish.

Without deduplication, we would see:

| Stage | Result | Total latency |
| --- | --- | --- |
| The first client sends the query at time T | Client receives the response at T+100ms | Latency of 100ms |
| The second client sends the query at time T+30ms | Client receives the response at T+200ms | Latency of 170ms |
| The third client sends the query at time T+60ms | Client receives the response at T+300ms | Latency of 240ms |

In this case, we'll see increased latency over time when multiple clients need data from a single subgraph.
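The arithmetic behind these numbers can be checked with a short calculation. This is a minimal Python sketch of the scenario, assuming a subgraph that handles exactly one query at a time; the function name serial_latencies and the variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical model of the example scenario: one subgraph that executes
# queries strictly one at a time, each taking 100 ms.
EXECUTION_MS = 100
ARRIVALS_MS = [0, 30, 60]  # offsets from time T for the three clients


def serial_latencies(arrivals, execution_ms):
    """Return each client's latency when queries run one-by-one in arrival order."""
    latencies = []
    free_at = 0  # time at which the subgraph can start its next query
    for arrival in arrivals:
        start = max(arrival, free_at)  # wait if the subgraph is still busy
        free_at = start + execution_ms
        latencies.append(free_at - arrival)
    return latencies


print(serial_latencies(ARRIVALS_MS, EXECUTION_MS))  # [100, 170, 240]
```

Each later client pays for its own execution plus whatever is left of the queries ahead of it, which is why latency keeps growing as requests pile up.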

With deduplication, the router performs an extra step before sending a query to the subgraph: it first checks whether the subgraph is already executing the same query as part of an earlier request. If so, the router waits for the subgraph to return the results of the earlier query. The router then copies the results to the subsequent query and returns them.
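To make the idea concrete, here is a minimal Python sketch of this kind of in-flight check. It is not the router's actual implementation: the class name InFlightDeduplicator, the fake_subgraph function, and the choice to identify a query by its raw text alone are all assumptions made for illustration.

```python
import asyncio


class InFlightDeduplicator:
    """Illustrative only: share one in-flight call among identical queries."""

    def __init__(self, fetch):
        self._fetch = fetch    # coroutine function: query text -> result
        self._in_flight = {}   # query text -> asyncio.Task currently running

    async def execute(self, query):
        task = self._in_flight.get(query)
        if task is None:
            # No identical query is running: send it and remember the task.
            task = asyncio.ensure_future(self._fetch(query))
            self._in_flight[query] = task
            # Forget the task once it finishes so later queries run fresh.
            task.add_done_callback(lambda _: self._in_flight.pop(query, None))
        # Every caller, first or duplicate, waits on the same task.
        return await asyncio.shield(task)


async def demo():
    calls = 0

    async def fake_subgraph(query):  # hypothetical stand-in for a subgraph
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.1)  # pretend the query takes 100 ms
        return {"data": {"topProducts": []}}

    dedup = InFlightDeduplicator(fake_subgraph)

    async def client(delay_s):
        await asyncio.sleep(delay_s)
        return await dedup.execute("{ topProducts { id name } }")

    # Three clients, 30 ms apart, all asking for the same thing.
    results = await asyncio.gather(client(0), client(0.03), client(0.06))
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(demo())
    print(calls, len(results))
```

In this sketch the second and third callers find a task already in flight and simply wait on it, so the fake subgraph is called only once while all three callers get a response.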

With this process, we would instead see:

| Stage | Result | Total latency |
| --- | --- | --- |
| The first client sends the query at time T | Client receives the response at T+100ms | Latency of 100ms |
| The second client sends the query at time T+30ms | Client receives the response at T+100ms | Latency of 70ms |
| The third client sends the query at time T+60ms | Client receives the response at T+100ms | Latency of 40ms |

Latency decreases for the second and third clients, and more importantly, our products subgraph receives just one request instead of three.
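The same figures fall out of a small calculation. This is a minimal Python sketch under the assumptions of the example (every client sends the identical query, and a duplicate that arrives while a query is in flight shares its result); the function name deduplicated_latencies is hypothetical.

```python
def deduplicated_latencies(arrivals, execution_ms):
    """Return each client's latency when identical in-flight queries are shared.

    Assumes arrivals are sorted and that every client sends the same query.
    """
    latencies = []
    done_at = None  # completion time of the query currently in flight
    for arrival in arrivals:
        if done_at is None or arrival >= done_at:
            # Nothing in flight: this request triggers a new subgraph query.
            done_at = arrival + execution_ms
        # Otherwise the request waits for the query that is already running.
        latencies.append(done_at - arrival)
    return latencies


print(deduplicated_latencies([0, 30, 60], 100))  # [100, 70, 40]
```

A duplicate that arrives after the in-flight query has finished starts a new one, so deduplication only helps while an identical query is still running.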

Enabling query deduplication

We'll start by activating deduplication for our accounts subgraph. Hop back over into your router.yaml file.

We'll continue working under the traffic_shaping property in our configuration. Under the config we've added for accounts, add a new property called deduplicate_query at the same level of indentation as global_rate_limit. We'll set this property to true.

router.yaml
traffic_shaping:
  router:
    timeout: 1s
    global_rate_limit:
      capacity: 800
      interval: 1s
  all:
    timeout: 500ms
  subgraphs:
    accounts:
      timeout: 100ms
      global_rate_limit:
        capacity: 500
        interval: 1s
      deduplicate_query: true

We now see the same effect as when we added the timeout on the accounts subgraph: latency increases for other subgraphs (and client request latency increases too).

Effect of query deduplication on subgraph latency in inventory and reviews

But now, all of the requests to accounts succeed with status 200.

Effect of query deduplication on subgraph requests to accounts

Maybe we could activate it on more subgraphs? Let's apply it to products too.

traffic_shaping:
  router:
    timeout: 1s
    global_rate_limit:
      capacity: 800
      interval: 1s
  all:
    timeout: 500ms
  subgraphs:
    accounts:
      timeout: 100ms
      global_rate_limit:
        capacity: 500
        interval: 1s
      deduplicate_query: true
    products:
      deduplicate_query: true

We see p50, p75 and p90 latencies go down for requests to the products subgraph, but also for reviews and inventory.

Effect of query deduplication on subgraph requests to products

This is looking much more stable. The highest latency for accounts and products is still around 1s, though. Some requests can still take a long time, even with deduplication.

Key takeaways

  • Another way of reducing the number of requests a subgraph receives is to deduplicate identical requests.
  • By deduplicating requests, the router can send a single query to the subgraph and copy its response to the duplicate queries.

Up next

Deduplicating identical requests has helped a lot, but there's more to the story when a query reaches the router. Next up, we'll talk about that process and where we can make some tweaks.
