Overload Protection
Implement overload protection for high-traffic scenarios
A GraphQL server can implement overload protection to help it remain available while under high load. With overload protection, a server monitors its resource usage and begins shedding incoming traffic whenever that usage approaches a performance-degrading limit (such as running out of memory).
As you add capabilities and users to your supergraph, you might introduce new usage patterns that add unexpectedly high load. Overload protection helps reduce the impact of these spikes while you optimize your supergraph to eliminate them entirely.
Example scenarios
A common source of overload in a system is the thundering herd problem, where a large number of processes or clients attempt to access a limited resource at the same time. Many scenarios can cause this. For example:
Pod failures in Kubernetes cause a smaller number of pods to handle the same amount of traffic.
A marketing campaign or viral social media post drives high traffic to an application in a short period.
A newly deployed feature introduces more load on the graph than expected.
A more graph-specific problem is adding an entity relationship to the schema that causes a significant increase in traffic. For example, in the Star Wars schema, imagine that there was no link from Person to Film (through PersonFilmsConnection) and it was added today. Until usage of that new connection levels off, every deployment or traffic-generating event could create a large amount of new load for the owner of the Film entity, all of it directly attributable to the change.
Implementing in Express
Overload protection packages are available for most popular languages and server frameworks. As an example, we'll look at using the overload-protection package with the @apollo/server package. This drop-in package enables your server to return a 503 based on any of the following:
The current event loop delay
The number of bytes used by the heap
The number of bytes in the Resident Set Size (RSS)
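To make these three signals concrete, here is a minimal sketch of how a Node.js process could sample them and decide whether to shed load. It is not the overload-protection package's implementation. The names limits, isOverloaded, and startSampler are hypothetical, the threshold values are assumptions chosen for illustration, and the sketch relies on the global performance object available in recent Node.js versions.

```javascript
// Hypothetical limits for illustration only; a value of 0 disables that check.
const limits = {
  maxEventLoopDelayMs: 42,
  maxHeapUsedBytes: 0,
  maxRssBytes: 1024 * 1024 * 1024, // 1 GiB (assumed value)
};

// Pure decision: is any enabled limit exceeded by the current sample?
function isOverloaded(sample, limits) {
  const over = (value, max) => max > 0 && value > max;
  return (
    over(sample.eventLoopDelayMs, limits.maxEventLoopDelayMs) ||
    over(sample.heapUsedBytes, limits.maxHeapUsedBytes) ||
    over(sample.rssBytes, limits.maxRssBytes)
  );
}

// Sampler: estimates event loop delay from timer drift and reads
// heap and RSS usage from process.memoryUsage().
function startSampler(intervalMs, onSample) {
  let last = performance.now();
  const timer = setInterval(() => {
    const now = performance.now();
    const { heapUsed, rss } = process.memoryUsage();
    onSample({
      // How much later than scheduled the timer fired.
      eventLoopDelayMs: Math.max(0, now - last - intervalMs),
      heapUsedBytes: heapUsed,
      rssBytes: rss,
    });
    last = now;
  }, intervalMs);
  timer.unref(); // don't keep the process alive just for sampling
  return () => clearInterval(timer); // call to stop sampling
}
```

A middleware built on this would keep the most recent verdict from isOverloaded and respond with a 503 for as long as it is true.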
To use overload-protection, include it in your Express startup like so:
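import express from 'express';
import protect from 'overload-protection';

const app = express();
// protect('express') returns the Express middleware; register that return value,
// not the factory function itself.
app.use(protect('express'));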
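The thresholds behind the three signals listed above are configurable. As a sketch, the options object below uses option names as documented in the overload-protection README at the time of writing, so check them against the version you install. The values shown are illustrative assumptions, not recommendations. The object would be passed as the second argument to the factory, as in protect('express', protectConfig).

```javascript
// Illustrative values only; tune them against your own load tests.
const protectConfig = {
  // When true, detailed error messages are not exposed to clients.
  production: process.env.NODE_ENV === 'production',
  clientRetrySecs: 1, // Retry-After header sent with the 503, in seconds (0 disables it)
  sampleInterval: 5, // how often to sample, in milliseconds
  maxEventLoopDelay: 42, // shed load above this event loop delay, in milliseconds
  maxHeapUsedBytes: 0, // heap threshold in bytes (0 disables the check)
  maxRssBytes: 1024 * 1024 * 1024, // RSS threshold in bytes (1 GiB, an assumed value)
};
```

Leaving the memory thresholds at 0 means only the event loop delay triggers load shedding, which is a reasonable starting point until you know your service's normal memory footprint.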
If you're using the startStandaloneServer function, you'll need to swap to expressMiddleware before adding overload protection.
If you're using @apollo/server's Express integration (that is, expressMiddleware), you can add overload-protection via Express middleware by adding the lines that import and register protect to your server creation:
import {ApolloServer} from '@apollo/server';
import {expressMiddleware} from '@apollo/server/express4';
import {ApolloServerPluginDrainHttpServer} from '@apollo/server/plugin/drainHttpServer';
import express from 'express';
import http from 'http';
import cors from 'cors';
import {typeDefs, resolvers} from './schema';
import protect from 'overload-protection';

// Overload protection: the factory returns the Express middleware.
const protection = protect('express');

const app = express();
// Overload protection: register it first so overloaded requests are rejected before any other work.
app.use(protection);
const httpServer = http.createServer(app);
const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [ApolloServerPluginDrainHttpServer({httpServer})]
});

// Note the top-level `await` calls below!
await server.start();
app.use('/graphql', cors(), express.json(), expressMiddleware(server));
await new Promise(resolve => httpServer.listen({port: 4000}, resolve));
console.log(`🚀 Server ready at http://localhost:4000/graphql`);
This approach also works if you use the @apollo/subgraph library to create your subgraphs in a similar way.
Overload protection is not specific to GraphQL, so it's best to handle it outside of Apollo software.
Protecting a supergraph
When adding overload protection to a supergraph, a reasonable question is, "Do I add protection to my gateway/router or to my individual subgraphs?" The short answer is "both":
Protecting the router protects the availability of the supergraph as a whole.
Protecting a subgraph reduces the error rate for queries that request data from that subgraph.
Gateway/router
The main concern with the gateway is a buildup of requests that cause partial or cascading failures. If the gateway can't shed excessive load, its performance starts to degrade.
A single request to the gateway usually transforms into multiple requests to subgraphs, which can increase load more than expected for complex queries. In such cases, overload protection in the gateway can save it from falling over entirely. To clients, this looks like a temporary dip in availability instead of a total outage.
Subgraphs
A failure in a subgraph can cause a backup in the gateway. If this backup is due to load, overload protection in the subgraph short-circuits the request and returns an error immediately instead of leaving the gateway waiting. This relieves the pressure on both the gateway and the subgraph by allowing the gateway to return errors faster.
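As a rough illustration of that fail-fast behavior, the sketch below shows how a caller of a protected subgraph could recognize a load-shedding response and give up immediately rather than waiting for a timeout. It does not describe how the Apollo gateway or router works internally. The function name classifyResponse is hypothetical, and the sketch assumes the 503 may carry a Retry-After header expressed in seconds.

```javascript
// Decide how a caller should treat a subgraph response.
// Only the delta-seconds form of Retry-After is handled; an HTTP-date value
// is treated as "no hint".
function classifyResponse(status, retryAfterHeader) {
  if (status !== 503) {
    return { shed: false, retryAfterMs: null };
  }
  const text = String(retryAfterHeader ?? '').trim();
  const retryAfterMs = /^\d+$/.test(text) ? Number(text) * 1000 : null;
  return { shed: true, retryAfterMs };
}
```

A caller that sees shed set to true can return an error for that part of the operation right away, and use retryAfterMs (when present) to decide how long to wait before sending more traffic.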
Additional protections
To further protect your graph and system against overload, you should also limit query depth and enable rate limiting.
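Rate limiting can take many forms. One common technique is a token bucket, sketched below under the assumption of a single process with in-memory state. The class name TokenBucket and its parameters are hypothetical, and a production setup would more likely rely on an existing middleware or a limit enforced at the load balancer.

```javascript
// Token bucket: allows bursts of up to `capacity` requests and a sustained
// rate of `refillPerSecond` requests per second.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.last = now; // timestamp of the last refill, in milliseconds
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryRemove(now = Date.now()) {
    const elapsedSecs = Math.max(0, now - this.last) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSecs * this.refillPerSecond
    );
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

In a server, you would keep one bucket per client key (for example, per API key) and reject the request, typically with a 429, whenever tryRemove returns false.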