Demand Control
Protect your graph from high-cost GraphQL operations
What is demand control?
Demand control provides a way to secure your supergraph from overly complex operations, based on the IBM GraphQL Cost Directive specification.
Application clients can send overly costly operations that overload your supergraph infrastructure. These operations may be costly due to their complexity and/or their need for expensive resolvers. In either case, demand control can help you protect your infrastructure from these expensive operations. When your router receives a request, it calculates a cost for that operation. If the cost is greater than your configured maximum, the operation is rejected.
Calculating cost
When calculating the cost of an operation, the router sums the costs of the sub-requests that it plans to send to your subgraphs.
For each operation, the cost is the sum of its base cost plus the costs of its fields.
For each field, the cost is defined recursively as its own base cost plus the cost of its selections. In the IBM specification, this is called field cost.
The cost of each operation type:
Cost component | Mutation | Query | Subscription |
---|---|---|---|
Operation type | 10 | 0 | 0 |
The cost of each GraphQL element type, per operation type:
Element type | Mutation | Query | Subscription |
---|---|---|---|
Object | 1 | 1 | 1 |
Interface | 1 | 1 | 1 |
Union | 1 | 1 | 1 |
Scalar | 0 | 0 | 0 |
Enum | 0 | 0 | 0 |
Using these defaults, the following operation would have a cost of 4.
query BookQuery {
  book(id: 1) {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
Example query's cost calculation
1 Query (0) + 1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (1) = 4 total cost
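The recursive definition above can be sketched in a few lines of Python. This is an illustrative model only, not the router's implementation: the nested-dictionary representation of a selection set and the names OPERATION_COST, COMPOSITE_COST, LEAF_COST, field_cost, and operation_cost are assumptions made for this example.

```python
# Default weights taken from the tables above.
OPERATION_COST = {"query": 0, "mutation": 10, "subscription": 0}
COMPOSITE_COST = 1  # objects, interfaces, unions
LEAF_COST = 0       # scalars, enums


def field_cost(selections):
    """Cost of one field: its own weight plus the cost of its selections.

    Hypothetical representation: a composite field is a dict of sub-fields,
    and a scalar or enum leaf is None.
    """
    if selections is None:
        return LEAF_COST
    return COMPOSITE_COST + sum(field_cost(s) for s in selections.values())


def operation_cost(operation_type, root_selections):
    """Cost of an operation: its base cost plus the cost of its root fields."""
    return OPERATION_COST[operation_type] + sum(
        field_cost(s) for s in root_selections.values()
    )


# The selection set of BookQuery from the example above.
book_query = {
    "book": {
        "title": None,
        "author": {"name": None},
        "publisher": {"name": None, "address": {"zipCode": None}},
    }
}

print(operation_cost("query", book_query))  # 4
```

Under this model, sending the same selection set as a mutation would add the mutation base cost of 10.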
Customizing cost
Since version 1.53.0, the router supports customizing the cost calculation with the @cost directive. The @cost directive has a single argument, weight, which overrides the default weights from the table above.
The @cost directive differs from the IBM specification in that the weight argument is of type Int! instead of String!.
Annotating your schema with the @cost directive customizes how the router scores operations. For example, imagine that the Address resolver for an example query is particularly expensive. We can annotate the schema with the @cost directive with a larger weight:
type Query {
  book(id: ID): Book
}

type Book {
  title: String
  author: Author
  publisher: Publisher
}

type Author {
  name: String
}

type Publisher {
  name: String
  address: Address
}

type Address
  @cost(weight: 5) {
  zipCode: Int!
}
This increases the cost of BookQuery from 4 to 8.
Example query's updated cost calculation
1 Query (0) + 1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5) = 8 total cost
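To see how a weight override changes the total, the sketch below models each field as a (type name, selections) pair and looks up a per-type weight. The data model, the WEIGHTS dictionary, and the function name field_cost are hypothetical; only the weight values come from the text above. The query base cost of 0 is left out because it does not change the result.

```python
DEFAULT_WEIGHT = 1  # default for objects, interfaces, and unions

# Hypothetical per-type overrides, mirroring @cost(weight: 5) on Address.
WEIGHTS = {"Address": 5}


def field_cost(node, weights):
    """node is (type_name, selections); selections is None for scalars and enums."""
    type_name, selections = node
    if selections is None:
        # Leaves cost 0 unless a weight has been assigned explicitly.
        return weights.get(type_name, 0)
    own = weights.get(type_name, DEFAULT_WEIGHT)
    return own + sum(field_cost(child, weights) for child in selections.values())


# The book field of BookQuery, annotated with the type each field returns.
book = ("Book", {
    "title": ("String", None),
    "author": ("Author", {"name": ("String", None)}),
    "publisher": ("Publisher", {
        "name": ("String", None),
        "address": ("Address", {"zipCode": ("Int", None)}),
    }),
})

print(field_cost(book, {}))       # 4 with default weights
print(field_cost(book, WEIGHTS))  # 8 with Address weighted at 5
```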
Handling list fields
During the static analysis phase of demand control, the router doesn't know the size of the list fields in a given query. It must use estimates for list sizes. The closer the estimated list size is to the actual list size for a field, the closer the estimated cost will be to the actual cost.
There are two ways to indicate the expected list sizes to the router:
Set the global maximum in your router configuration file (see Configuring demand control).
Use the Apollo Federation @listSize directive.
The @listSize directive supports field-level granularity in setting list size. By using its assumedSize argument, you can set a statically defined list size for a field. If you are using paging parameters which control the size of the list, use the slicingArguments argument.
Continuing with our example above, let's add two queryable fields. First, we will add a field that returns the top five best-selling books:
type Query {
  book(id: ID): Book
  bestsellers: [Book] @listSize(assumedSize: 5)
}
With this schema, the following query has a cost of 40:
query BestsellersQuery {
  bestsellers {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
Cost of bestsellers query
1 Query (0) + 5 book objects (5 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 40 total cost
The second field we will add is a paginated resolver. It returns the latest additions to the inventory:
type Query {
  book(id: ID): Book
  bestsellers: [Book] @listSize(assumedSize: 5)
  newestAdditions(after: ID, limit: Int!): [Book]
    @listSize(slicingArguments: ["limit"])
}
The number of books returned by this resolver is determined by the limit argument.
query NewestAdditions {
  newestAdditions(limit: 3) {
    title
    author {
      name
    }
    publisher {
      name
      address {
        zipCode
      }
    }
  }
}
The router will estimate the cost of this query as 24. If the limit was increased to 7, then the cost would increase to 56.
When requesting 3 books:
1 Query (0) + 3 book objects (3 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 24 total cost
When requesting 7 books:
1 Query (0) + 7 book objects (7 * (1 book object (1) + 1 author object (1) + 1 publisher object (1) + 1 address object (5))) = 56 total cost
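The list-size rules above can be summarized in a small Python sketch. It is a simplified model, not the router's code: the dictionary form of the @listSize directive and the function names are hypothetical, and taking the largest value when several slicing arguments are present is an assumption of this example rather than something stated above.

```python
# Per-book cost from the example: book + author + publisher + address (weight 5).
PER_BOOK_COST = 1 + 1 + 1 + 5


def estimated_list_size(directive, arguments, global_list_size):
    """Pick the list size used for estimation.

    directive models @listSize as a dict, for example {"assumedSize": 5}
    or {"slicingArguments": ["limit"]}, or None when the field has none.
    """
    if directive:
        if "assumedSize" in directive:
            return directive["assumedSize"]
        sizes = [
            arguments[name]
            for name in directive.get("slicingArguments", [])
            if name in arguments
        ]
        if sizes:
            return max(sizes)  # assumption: use the largest slicing argument
    # Fall back to the global list size from the router configuration.
    return global_list_size


def list_field_cost(per_item_cost, directive, arguments, global_list_size=10):
    """Estimated cost of a list field: estimated size times per-item cost."""
    return estimated_list_size(directive, arguments, global_list_size) * per_item_cost


paged = {"slicingArguments": ["limit"]}
print(list_field_cost(PER_BOOK_COST, {"assumedSize": 5}, {}))  # 40
print(list_field_cost(PER_BOOK_COST, paged, {"limit": 3}))     # 24
print(list_field_cost(PER_BOOK_COST, paged, {"limit": 7}))     # 56
```

In this model, a list field without a @listSize directive falls back to the global list size, so with a global size of 10 the same selection would be estimated at 80.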
Configuring demand control
To enable demand control in the router, configure the demand_control option in router.yaml:
demand_control:
  enabled: true
  mode: measure
  strategy:
    static_estimated:
      list_size: 10
      max: 1000
When demand_control is enabled, the router measures the cost of each operation and can enforce operation cost limits, based on additional configuration.
Customize demand_control with the following settings:
Option | Valid values | Default value | Description |
---|---|---|---|
enabled | boolean | false | Set true to measure operation costs or enforce operation cost limits. |
mode | measure, enforce | -- | measure collects information about the cost of operations; enforce rejects operations exceeding configured cost limits. |
strategy | static_estimated | -- | static_estimated estimates the cost of an operation before it is sent to a subgraph |
static_estimated.list_size | integer | -- | The assumed maximum size of a list for fields that return lists. |
static_estimated.max | integer | -- | The maximum cost of an accepted operation. An operation with a higher cost than this is rejected. |
When enabling demand_control for the first time, set it to measure mode. This will allow you to observe the cost of your operations before setting your maximum cost.
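The difference between the two modes can be pictured with the following Python sketch. It is a conceptual model with hypothetical names (check_operation and its parameters), not the router's logic. It assumes that in measure mode the router still records a too-expensive result but lets the operation through; the result strings COST_OK and COST_ESTIMATED_TOO_EXPENSIVE are the ones used in the telemetry sections of this page.

```python
def check_operation(estimated_cost, max_cost, mode):
    """Return (result_code, rejected) for one operation.

    An operation is over the limit only when its cost is strictly greater
    than max_cost, matching "greater than your configured maximum".
    """
    if estimated_cost > max_cost:
        result = "COST_ESTIMATED_TOO_EXPENSIVE"
    else:
        result = "COST_OK"
    # Assumption: only enforce mode actually rejects; measure mode just records.
    rejected = mode == "enforce" and result != "COST_OK"
    return result, rejected


print(check_operation(56, 1000, "enforce"))    # ('COST_OK', False)
print(check_operation(1500, 1000, "measure"))  # ('COST_ESTIMATED_TOO_EXPENSIVE', False)
print(check_operation(1500, 1000, "enforce"))  # ('COST_ESTIMATED_TOO_EXPENSIVE', True)
```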
Telemetry for demand control
You can define router telemetry to gather cost information and gain insights into the cost of operations sent to your router:
Generate histograms of operation costs by operation name, where the estimated cost is greater than an arbitrary value.
Attach cost information to spans.
Generate log messages whenever the cost delta between estimated and actual is greater than an arbitrary value.
Instruments
Instrument | Description |
---|---|
cost.actual | The actual cost of an operation, measured after execution. |
cost.estimated | The estimated cost of an operation before execution. |
cost.delta | The difference between the actual and estimated cost. |
Attributes
Attributes for cost can be applied to instruments, spans, and events, anywhere supergraph attributes are used.
Attribute | Value | Description |
---|---|---|
cost.actual | boolean | The actual cost of an operation, measured after execution. |
cost.estimated | boolean | The estimated cost of an operation before execution. |
cost.delta | boolean | The difference between the actual and estimated cost. |
cost.result | boolean | The return code of the cost calculation. COST_OK or an error code |
Selectors
Selectors for cost can be applied to instruments, spans, and events, anywhere supergraph attributes are used.
Key | Value | Default | Description |
---|---|---|---|
cost | estimated, actual, delta, result | | The estimated, actual, or delta cost values, or the result string |
Examples
Example instrument
Enable a cost.estimated instrument with the cost.result attribute:
telemetry:
  instrumentation:
    instruments:
      supergraph:
        cost.estimated:
          attributes:
            cost.result: true
            graphql.operation.name: true
Example span
Enable the cost.estimated attribute on supergraph spans:
telemetry:
  instrumentation:
    spans:
      supergraph:
        attributes:
          cost.estimated: true
Example event
Log an error when cost.delta is greater than 1000:
telemetry:
  instrumentation:
    events:
      supergraph:
        COST_DELTA_TOO_HIGH:
          message: "cost delta high"
          on: event_response
          level: error
          condition:
            gt:
              - cost: delta
              - 1000
          attributes:
            graphql.operation.name: true
            cost.delta: true
Filtering by cost result
In router telemetry, you can customize instruments that filter their output based on cost results.
For example, you can record the estimated cost when cost.result is COST_ESTIMATED_TOO_EXPENSIVE:
telemetry:
  instrumentation:
    instruments:
      supergraph:
        # custom instrument
        cost.rejected.operations:
          type: histogram
          value:
            # Estimated cost is used to populate the histogram
            cost: estimated
          description: "Estimated cost per rejected operation."
          unit: delta
          condition:
            eq:
              # Only show rejected operations.
              - cost: result
              - "COST_ESTIMATED_TOO_EXPENSIVE"
          attributes:
            graphql.operation.name: true # Graphql operation name is added as an attribute
Configuring instrument output
When analyzing the costs of operations, if your histograms are not granular enough or don't cover a sufficient range, you can modify the views in your telemetry configuration:
telemetry:
  exporters:
    metrics:
      common:
        views:
          # Define a custom view because cost is different than the default latency-oriented view of OpenTelemetry
          - name: cost.*
            aggregation:
              histogram:
                buckets:
                  - 0
                  - 10
                  - 100
                  - 1000
                  - 10000
                  - 100000
                  - 1000000
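To get a feel for how these bucket boundaries partition cost values, the following Python sketch maps a cost to the upper bound of the bucket that would contain it. It assumes the usual explicit-bucket convention in which each upper bound is inclusive; the names BOUNDS and bucket_label are made up for this example.

```python
from bisect import bisect_left

# Bucket upper bounds from the view above; values beyond the last bound
# land in the implicit +Inf bucket.
BOUNDS = [0, 10, 100, 1000, 10000, 100000, 1000000]


def bucket_label(cost):
    """Return the upper bound of the first bucket whose bound is >= cost."""
    index = bisect_left(BOUNDS, cost)
    return "+Inf" if index == len(BOUNDS) else str(BOUNDS[index])


for cost in (4, 40, 1000, 1001, 2_000_000):
    print(cost, "->", bucket_label(cost))
```

Powers of ten give a coarse overview across several orders of magnitude. If most of your operations fall into one bucket, adding intermediate bounds around your intended maximum cost gives a finer picture there.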
Example histogram of operation costs from a Prometheus endpoint
# TYPE cost_actual histogram
cost_actual_bucket{otel_scope_name="apollo/router",le="0"} 0
cost_actual_bucket{otel_scope_name="apollo/router",le="10"} 3
cost_actual_bucket{otel_scope_name="apollo/router",le="100"} 5
cost_actual_bucket{otel_scope_name="apollo/router",le="1000"} 11
cost_actual_bucket{otel_scope_name="apollo/router",le="10000"} 19
cost_actual_bucket{otel_scope_name="apollo/router",le="100000"} 20
cost_actual_bucket{otel_scope_name="apollo/router",le="1000000"} 20
cost_actual_bucket{otel_scope_name="apollo/router",le="+Inf"} 20
cost_actual_sum{otel_scope_name="apollo/router"} 1097
cost_actual_count{otel_scope_name="apollo/router"} 20
# TYPE cost_delta histogram
cost_delta_bucket{otel_scope_name="apollo/router",le="0"} 0
cost_delta_bucket{otel_scope_name="apollo/router",le="10"} 2
cost_delta_bucket{otel_scope_name="apollo/router",le="100"} 9
cost_delta_bucket{otel_scope_name="apollo/router",le="1000"} 7
cost_delta_bucket{otel_scope_name="apollo/router",le="10000"} 19
cost_delta_bucket{otel_scope_name="apollo/router",le="100000"} 20
cost_delta_bucket{otel_scope_name="apollo/router",le="1000000"} 20
cost_delta_bucket{otel_scope_name="apollo/router",le="+Inf"} 20
cost_delta_sum{otel_scope_name="apollo/router"} 21934
cost_delta_count{otel_scope_name="apollo/router"} 1
# TYPE cost_estimated histogram
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="0"} 0
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="10"} 5
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="100"} 5
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="1000"} 9
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="10000"} 11
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="100000"} 20
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="1000000"} 20
cost_estimated_bucket{cost_result="COST_OK",otel_scope_name="apollo/router",le="+Inf"} 20
cost_estimated_sum{cost_result="COST_OK",otel_scope_name="apollo/router"}
cost_estimated_count{cost_result="COST_OK",otel_scope_name="apollo/router"} 20
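Prometheus histogram buckets are cumulative, so each le line counts every observation at or below that bound. The Python sketch below, using the cost_estimated bucket values from the sample output above, converts them to per-bucket counts and computes the share of operations at or below a candidate maximum. The function names are hypothetical, and the calculation is only exact when the candidate maximum coincides with a bucket bound.

```python
INF = float("inf")

# Cumulative cost_estimated bucket counts copied from the sample output above.
cumulative = {
    0: 0, 10: 5, 100: 5, 1000: 9,
    10000: 11, 100000: 20, 1000000: 20, INF: 20,
}


def per_bucket_counts(cumulative):
    """Convert cumulative counts into the number of observations per bucket."""
    counts = {}
    previous = 0
    for bound in sorted(cumulative):
        counts[bound] = cumulative[bound] - previous
        previous = cumulative[bound]
    return counts


def fraction_within(cumulative, max_cost):
    """Share of operations with cost <= max_cost (max_cost must be a bucket bound)."""
    return cumulative[max_cost] / cumulative[INF]


print(per_bucket_counts(cumulative))
print(fraction_within(cumulative, 1000))  # 0.45
```

With this sample data, a maximum cost of 1000 would accept 9 of the 20 recorded operations.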
An example chart of a histogram:
You can also chart the percentage of operations that would be allowed or rejected with the current configuration: