# Deploying with managed federation
## Best practices
When rolling out changes to a subgraph, we recommend the following workflow:

1. Confirm the backward compatibility of each change by running `rover subgraph check` in your CI pipeline (see the example after this list).
2. Merge backward compatible changes that successfully pass schema checks.
3. Deploy changes to the subgraph in your infrastructure.
4. Wait until all replicas finish deploying.
5. Run `rover subgraph publish` to update your managed federation configuration:

   ```shell
   rover subgraph publish my-supergraph@my-variant \
     --schema ./accounts/schema.graphql \
     --name accounts \
     --routing-url https://my-running-subgraph.com/api
   ```
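For reference, the check in step 1 might look like the following CI command, reusing the hypothetical graph ref and subgraph from the publish example above:

```shell
# Validate the proposed schema against the registered supergraph;
# rover exits non-zero on failure, which fails the CI job.
rover subgraph check my-supergraph@my-variant \
  --schema ./accounts/schema.graphql \
  --name accounts
```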
## Pushing configuration updates safely
Whenever possible, you should update your subgraph configuration in a way that is backward compatible to avoid downtime. As suggested above, the best way to do this is to run `rover subgraph check` before updating. You should also generally seek to minimize the number of breaking changes you make to your schemas.

Additionally, call `rover subgraph publish` for a subgraph only after all replicas of that subgraph are deployed. This ensures that resolvers are in place for all operations that are executable against your graph, and that operations can't attempt to access fields that do not yet exist.
In the rare case where a configuration change is not backward compatible with your gateway's query planner, you should update your registered subgraph schemas before you deploy your updated code.
You should also perform configuration updates that affect query planning prior to (and separately from) other changes. This helps avoid a scenario where the query planner generates queries that fail validation in downstream services or that violate assumptions your resolvers make.
Examples of this include:

- Modifying `@key`, `@requires`, or `@provides` directives (see the example below)
- Removing a type implementation from an interface
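For example, changing an entity's `@key` changes how the query planner resolves references to that entity across subgraphs. A hypothetical `Product` type illustrates the shape of such a change:

```graphql
# Before: other subgraphs reference Product by id
type Product @key(fields: "id") {
  id: ID!
  sku: String!
}

# After: references resolve by sku instead, so every subgraph that
# extends Product (and its reference resolver) must change in step
type Product @key(fields: "sku") {
  id: ID!
  sku: String!
}
```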
In general, always exercise caution when pushing configuration changes that affect your gateway's query planner, and consider how those changes will affect your other subgraphs.
### Example scenario
Let's say we define a `Channel` interface in one subgraph, and we define types that implement `Channel` in two other subgraphs:
```graphql
# channel subgraph
interface Channel @key(fields: "id") {
  id: ID!
}

# web subgraph
type WebChannel implements Channel @key(fields: "id") {
  id: ID!
  webHook: String!
}

# email subgraph
type EmailChannel implements Channel @key(fields: "id") {
  id: ID!
  emailAddress: String!
}
```
To safely remove the `EmailChannel` type from your supergraph schema:

1. Perform a `subgraph publish` of the `email` subgraph that removes the `EmailChannel` type from its schema.
2. Deploy a new version of the subgraph that removes the `EmailChannel` type.
The first step causes the query planner to stop sending fragments `...on EmailChannel`, which would fail validation if sent to a subgraph that isn't aware of the type.
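Concretely, the first step is an ordinary publish of the trimmed schema (the graph ref and schema path here are hypothetical):

```shell
# Publish the email subgraph's schema with EmailChannel removed
rover subgraph publish my-supergraph@my-variant \
  --name email \
  --schema ./email/schema.graphql
```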
If you want to keep `EmailChannel` but remove it from the `Channel` interface, the process is similar. Instead of removing the `EmailChannel` type altogether, only remove the `implements Channel` addendum from the type definition. This is because the query planner expands queries on interfaces or unions into fragments on their implementing types.
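Concretely, the `email` subgraph's schema would keep the type but drop the interface:

```graphql
# email subgraph: EmailChannel remains, but no longer implements Channel
type EmailChannel @key(fields: "id") {
  id: ID!
  emailAddress: String!
}
```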
For example, a query such as...
```graphql
query FindChannel($id: ID!) {
  channel(id: $id) {
    id
  }
}
```
...generates two queries, one to each subgraph, like so:
```graphql
# Generated by the query planner

# To email subgraph
query {
  _entities(...) {
    ...on EmailChannel {
      id
    }
  }
}

# To web subgraph
query {
  _entities(...) {
    ...on WebChannel {
      id
    }
  }
}
```
Currently, the gateway expands all interfaces into implementing types.
## Removing a subgraph
To "de-register" a subgraph with Apollo, call rover subgraph delete
:
This action cannot be reversed!
1rover subgraph delete my-supergraph@my-variant --name accounts
The next time it starts up or polls, your gateway obtains an updated configuration that reflects the removed subgraph.
## Using variants to control rollout
With managed federation, you can control which version of your graph a fleet of gateways is using. In the majority of cases, rolling over all of your gateway instances to a new schema version is safe, assuming you've used schema checks to confirm that your changes are backward compatible.
However, changes at the gateway level might involve a variety of different updates, such as migrating entity ownership from one subgraph to another. If your infrastructure requires a more advanced deployment process, you can use graph variants to manage different fleets of gateways running with different configurations.
### Example
To configure a canary deployment, you might maintain two production graph variants in Apollo Studio, one named `prod` and the other named `prod-canary`. To deploy a change to a subgraph named `launches`, you might perform the following steps:
1. Check the changes in `launches` against both `prod` and `prod-canary`:

   ```shell
   rover subgraph check my-supergraph@prod --name launches --schema ./launches/schema.graphql
   rover subgraph check my-supergraph@prod-canary --name launches --schema ./launches/schema.graphql
   ```

2. Deploy your changes to the `launches` subgraph in your production environment, without running `rover subgraph publish`. This ensures that your production gateway's configuration is not updated yet.

3. Update your `prod-canary` variant's registered schema by running:

   ```shell
   rover subgraph publish my-supergraph@prod-canary --name launches --schema ./launches/schema.graphql
   ```

   If composition fails due to intermediate changes to the canary graph, the canary gateway's configuration is not updated.

4. Wait for health checks to pass against the canary and confirm that operation metrics appear as expected.

5. After the canary is stable, roll out the changes to your production gateways:

   ```shell
   rover subgraph publish my-supergraph@prod --name launches --schema ./launches/schema.graphql
   ```
If you associate metrics with variants as well, you can use Apollo Studio to verify a canary's performance before rolling out changes to the rest of the graph. You can also use variants to support a variety of other advanced deployment workflows, such as blue/green deployments.
## Modifying query-planning logic
Treat migrations of your query-planning logic similarly to how you treat database migrations. Carefully consider the effects on downstream services as the query planner changes, and plan for "double reading" as appropriate.
Consider the following example of a `Products` subgraph and a `Reviews` subgraph:
```graphql
# Products subgraph

type Product @key(fields: "upc") {
  upc: ID!
  nameLowercase: String!
}

# Reviews subgraph

extend type Product @key(fields: "upc") {
  upc: ID! @external
  reviews: [Review]! @requires(fields: "nameLowercase")
  nameLowercase: String! @external
}
```
Let's say we want to deprecate the `nameLowercase` field and replace it with the `name` field, like so:
```graphql
# Products subgraph

type Product @key(fields: "upc") {
  upc: ID!
  nameLowercase: String! @deprecated
  name: String!
}

# Reviews subgraph

extend type Product @key(fields: "upc") {
  upc: ID! @external
  nameLowercase: String! @external
  name: String! @external
  reviews: [Review]! @requires(fields: "name")
}
```
To perform this migration in place:

1. Modify the `Products` subgraph to add the new field. (As usual, first deploy all replicas, then use `rover subgraph publish` to push the new subgraph schema.)
2. Deploy a new version of the `Reviews` subgraph with a resolver that accepts either `nameLowercase` or `name` in the source object (see the sketch after this list).
3. Modify the `Reviews` subgraph's schema in the registry so that it `@requires(fields: "name")`.
4. Deploy a new version of the `Reviews` subgraph with a resolver that only accepts `name` in its source object.
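The transitional resolver in step 2 might look like the following sketch. The field names come from the schemas above; the resolver map shape follows Apollo Server conventions, and `fetchReviewsByProductName` is a hypothetical data-access helper:

```js
// Reviews subgraph: transitional resolver for Product.reviews.
// Depending on which query-planning configuration produced the
// request, the entity representation carries either `name` or
// `nameLowercase`, so accept both during the migration window.
const resolvers = {
  Product: {
    reviews(product) {
      const name = product.name ?? product.nameLowercase;
      return fetchReviewsByProductName(name); // hypothetical lookup
    },
  },
};
```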
Alternatively, you can perform this operation as an atomic migration at the subgraph level, by modifying the subgraph's URL:

1. Modify the `Products` subgraph to add the `name` field (as usual, first deploy all replicas, then use `rover subgraph publish` to push the new subgraph schema).
2. Deploy a new set of `Reviews` replicas to a new URL that reads from `name`.
3. Register the `Reviews` subgraph with the new URL and the schema changes above (see the example after this list).
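Step 3 is then a single publish that updates the schema and the routing URL together (the graph ref, path, and URL here are hypothetical):

```shell
rover subgraph publish my-supergraph@my-variant \
  --name reviews \
  --schema ./reviews/schema.graphql \
  --routing-url https://reviews-v2.example.com/api
```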
With this atomic strategy, the query planner resolves all outstanding requests to the old subgraph URL that relied on `nameLowercase` with the old query-planning configuration, which `@requires` the `nameLowercase` field. All new requests are made to the new subgraph URL using the new query-planning configuration, which `@requires` the `name` field.
## Reliability and security
Your gateway fetches its configuration by polling Apollo Uplink, an Apollo-hosted endpoint specifically for serving supergraph configs. In the event that your updated config is inaccessible due to an outage in Uplink, your gateway continues to serve its most recently fetched configuration.
If you restart a gateway instance or spin up a new instance during an Uplink outage, that instance can't fetch its configuration until Apollo resolves the outage.
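For reference, a gateway running with managed federation needs no hardcoded list of subgraphs; it reads the `APOLLO_KEY` and `APOLLO_GRAPH_REF` environment variables and polls Uplink for its supergraph schema. A minimal sketch using `@apollo/gateway` and Apollo Server:

```js
// With no constructor arguments, ApolloGateway runs in managed mode:
// it fetches its supergraph schema from Apollo Uplink using the
// APOLLO_KEY and APOLLO_GRAPH_REF environment variables.
const { ApolloServer } = require('apollo-server');
const { ApolloGateway } = require('@apollo/gateway');

const gateway = new ApolloGateway();
const server = new ApolloServer({ gateway });

server.listen().then(({ url }) => {
  console.log(`Gateway ready at ${url}`);
});
```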
## The `subgraph publish` lifecycle
Whenever you call `rover subgraph publish` for a particular subgraph, it both updates that subgraph's registered schema and updates the gateway's managed configuration.

Because your graph is dynamically changing and multiple subgraphs might be updated simultaneously, it's possible for changes to cause composition errors even if `rover subgraph check` was successful. For this reason, updating a subgraph re-triggers composition in the cloud, ensuring that all subgraphs still compose to form a complete supergraph before the configuration is updated. The workflow behind the scenes can be summed up as follows:
1. The subgraph schema is uploaded to Apollo and indexed.
2. The subgraph is updated in the registry to use its new schema.
3. All subgraphs are composed in the cloud to produce a new supergraph schema.
4. If composition fails, the command exits and emits errors.
5. If composition succeeds, Apollo Uplink begins serving the updated supergraph schema.
On the other side of the equation sits the gateway. The gateway regularly polls Apollo Uplink for changes to its configuration. The lifecycle of dynamic configuration updates is as follows:
1. The gateway polls for updates to its configuration.
2. On update, the gateway downloads the updated configuration, including the new supergraph schema.
3. The gateway uses the new supergraph schema to update its query planning logic.
4. The gateway continues to resolve in-flight requests with the previous configuration, while using the updated configuration for all new requests.