Query deduplication

Overview

We've determined that the main issue in our system is high traffic and latency in subgraphs (especially accounts). But without deploying many more instances of our services, or equipping them with faster CPUs, how do we improve performance?

In this lesson, we will:

Learn about query deduplication and how it aids performance
Enable query deduplication in our config file

Query deduplication

The router has a lot of configuration options, and some of them can be used to improve performance without touching the subgraphs. The most effective way to improve subgraph performance is to reduce the number of requests the subgraph will receive.

Rate limiting is one solution, but it should be a last resort because it increases the error rate clients will experience.

The router has a smarter mechanism we can employ: query deduplication. Query deduplication is the process of reducing the number of identical queries (removing the duplicates) while still resolving the data requested.

In a federated graph, the large majority of subgraph queries will be read-only, sent through the _entities query, and some subgraphs tend to receive the same queries over and over. The router uses this fact to aggregate multiple identical subgraph queries into a single query, whose result the router copies to all the original requests.
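To make "identical" more concrete, here is a minimal Python sketch of how a deduplication key could be built. It is an illustration only, not the router's actual implementation: the dedup_key function, its parameters, and the choice to hash the subgraph name, query text, and variables are all assumptions made for this example.

```python
import hashlib
import json


def dedup_key(subgraph, query, variables=None, operation_kind="query"):
    """Return a key for grouping identical requests, or None if the request must not be shared.

    Hypothetical helper: the fields that make two requests "identical" are assumed here.
    """
    if operation_kind != "query":
        # Only read-only operations are safe to merge; mutations have side effects.
        return None
    payload = json.dumps(
        {"subgraph": subgraph, "query": query, "variables": variables or {}},
        sort_keys=True,  # key order in the variables must not change the key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


entities_query = "query($representations: [_Any!]!) { _entities(representations: $representations) { ... on User { name } } }"
user_1 = {"representations": [{"__typename": "User", "id": "1"}]}
user_2 = {"representations": [{"__typename": "User", "id": "2"}]}

print(dedup_key("accounts", entities_query, user_1) == dedup_key("accounts", entities_query, user_1))  # True
print(dedup_key("accounts", entities_query, user_1) == dedup_key("accounts", entities_query, user_2))  # False
```

Two requests only share a key when the subgraph, query text, and variables all match, so a request for a different user is never served with someone else's data.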

Let's say we have the following subgraph query to the products subgraph:

query GetTopProducts {
  topProducts {
    id
    name
  }
}

Let's assume that this query takes 100 milliseconds to execute in total, and our products subgraph executes queries one-by-one in order.

If three clients send this request to our router one after another (with, for example, 30 milliseconds separating each request), then the products subgraph will receive three queries to execute. It receives the first query at time T, the second 30 milliseconds later, and the third 30 milliseconds after that. Because it executes them one at a time, each query has to wait for the ones ahead of it to finish.

Without query deduplication, we would see:

Stage	Result	Total latency
The first client sends the query at time T	Client receives the response at T+100ms	Latency of 100ms
The second client sends the query at time T+30ms	Client receives the response at T+200ms	Latency of 170ms
The third client sends the query at time T+60ms	Client receives the response at T+300ms	Latency of 240ms

In this case, we'll see increased latency over time when multiple clients need data from a single subgraph.
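The numbers in the table can be reproduced with a short Python sketch. It assumes, as in the example above, a subgraph that executes one query at a time in arrival order; the sequential_latencies function name is made up for this illustration.

```python
def sequential_latencies(arrivals_ms, service_ms):
    """Latency per request when a subgraph runs queries one at a time, in arrival order."""
    latencies = []
    free_at = 0  # time at which the subgraph finishes the query it is working on
    for arrival in arrivals_ms:
        start = max(arrival, free_at)  # wait until the subgraph is free
        free_at = start + service_ms
        latencies.append(free_at - arrival)
    return latencies


# Three clients arriving 30ms apart, each query taking 100ms to execute.
print(sequential_latencies([0, 30, 60], 100))  # [100, 170, 240]
```

Each additional client adds a full 100ms of execution time minus the 30ms head start, which is why latency keeps climbing as requests pile up.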

With query deduplication, the router performs an extra step before sending a query to the subgraph: it first checks whether the subgraph is already executing the same query as part of an earlier request. If so, the router waits for the subgraph to return the results of the earlier query. The router then copies the results to the subsequent query and returns them.
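The check-wait-copy steps described above can be sketched in Python with asyncio. This is a simplified illustration of the general technique, not the router's code: the InFlightDeduplicator class, the fake_subgraph function, and the use of the query string as the key are all hypothetical, and the sketch ignores cancellation and error-handling details.

```python
import asyncio


class InFlightDeduplicator:
    """Share the result of one in-flight request among identical requests (illustrative only)."""

    def __init__(self, fetch):
        self._fetch = fetch    # coroutine function: key -> result
        self._in_flight = {}   # key -> asyncio.Task for the request being executed

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # No identical request is running: send one and remember it.
            task = asyncio.create_task(self._run(key))
            self._in_flight[key] = task
        # Later callers wait on the same task and receive the same result.
        return await task

    async def _run(self, key):
        try:
            return await self._fetch(key)
        finally:
            # Once the response arrives, the next identical request starts fresh.
            self._in_flight.pop(key, None)


async def demo():
    calls = 0

    async def fake_subgraph(query):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.1)  # the query takes 100ms to execute
        return {"data": {"topProducts": []}}

    dedup = InFlightDeduplicator(fake_subgraph)

    async def client(delay_s):
        await asyncio.sleep(delay_s)
        return await dedup.get("query GetTopProducts { topProducts { id name } }")

    # Three clients, 30ms apart, all sending the same query.
    results = await asyncio.gather(client(0), client(0.03), client(0.06))
    return calls, results


calls, results = asyncio.run(demo())
print(calls)  # 1
```

All three clients get a response, but the fake subgraph is only called once because the second and third requests arrive while the first is still in flight.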

With this process, we would instead see:

Stage	Result	Total latency
The first client sends the query at time T	Client receives the response at T+100ms	Latency of 100ms
The second client sends the query at time T+30ms	Client receives the response at T+100ms	Latency of 70ms
The third client sends the query at time T+60ms	Client receives the response at T+100ms	Latency of 40ms

Latency decreases, and more importantly, our subgraph service received just one request.
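As a quick check on these figures, here is a small Python sketch that computes the latencies and the number of subgraph requests under deduplication. It assumes every request carries the same query, and the deduplicated_latencies function name is made up for this illustration.

```python
def deduplicated_latencies(arrivals_ms, service_ms):
    """Latencies and subgraph request count when identical in-flight queries are shared."""
    latencies = []
    subgraph_requests = 0
    in_flight_done = None  # completion time of the query currently executing, if any
    for arrival in arrivals_ms:
        if in_flight_done is None or arrival >= in_flight_done:
            # Nothing identical is in flight: send a new request to the subgraph.
            in_flight_done = arrival + service_ms
            subgraph_requests += 1
        # Otherwise the request waits for the in-flight query and reuses its result.
        latencies.append(in_flight_done - arrival)
    return latencies, subgraph_requests


print(deduplicated_latencies([0, 30, 60], 100))  # ([100, 70, 40], 1)
```

A request that arrives after the in-flight query has finished, for example at T+150ms, would trigger a second subgraph request, because deduplication only applies to queries that overlap in time.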

Enabling query deduplication

We'll start by activating query deduplication for our accounts subgraph. Hop back over into your router.yaml file.

We'll continue working under the traffic_shaping property in our configuration. Under the config we've added for accounts, add a new property called deduplicate_query at the same level of indentation as global_rate_limit. We'll set this property to true.

router.yaml

traffic_shaping:
  router:
    timeout: 1s
    global_rate_limit:
      capacity: 800
      interval: 1s
  all:
    timeout: 500ms
  subgraphs:
    accounts:
      timeout: 100ms
      global_rate_limit:
        capacity: 500
        interval: 1s
      deduplicate_query: true

We now see the same effect as when we added the timeout on the accounts subgraph: latency increases for other subgraphs (and client request latency increases too).

Effect of query deduplication on subgraph latency in inventory and reviews

But now, all of the requests to accounts succeed with status 200.

Effect of query deduplication on subgraph requests to accounts

Maybe we could activate it on more subgraphs? Let's apply it to products too.

traffic_shaping:
  router:
    timeout: 1s
    global_rate_limit:
      capacity: 800
      interval: 1s
  all:
    timeout: 500ms
  subgraphs:
    accounts:
      timeout: 100ms
      global_rate_limit:
        capacity: 500
        interval: 1s
      deduplicate_query: true
    products:
      deduplicate_query: true

We see p50, p75 and p90 latencies go down for requests to the products subgraph, but also for reviews and inventory.

Effect of query deduplication on subgraph requests to products

This is looking much more stable. The highest latency for accounts and products is still around 1s, though. Some subgraph requests can still take a long time, even with deduplication.

Practice

Which of the following is the most effective way to improve subgraph performance?

Reducing the number of requests a subgraph will receive.
Configuring the router to run faster.
Hiring more engineers.
Taking the subgraph offline.

Which of the following are steps that occur as part of query deduplication?

The router checks to see if a query it received is already being executed as part of an earlier request.
The router copies the same results to each of the original identical queries.
The router handles each of the identical queries in order, but only returns a single response.
The router aggregates identical queries into a single query.

Key takeaways

Another way of reducing the number of requests a subgraph receives is to deduplicate identical requests.
By deduplicating requests, the router can send a single query to the subgraph and copy its response to the duplicate queries.

Up next

Deduplicating identical requests has helped a lot, but there's more to the story when a query reaches the router. Next up, we'll talk about the query planning process and where we can make some tweaks.
