Reference Architecture

Enterprise deployment of GraphOS

In this guide, learn about the fundamental concepts and configuration underlying Apollo's reference architecture for enterprise deployment of GraphOS with a self-hosted GraphOS Router. Use this overview as a companion reference to the Apollo reference architecture repository.

About Apollo's reference architecture

In a modern cloud-native stack, your components must be scalable with high availability. The Apollo Router Core is built with this in mind. The router is much faster and less resource-intensive than the Apollo Gateway, Apollo's original runtime.

Apollo provides a reference architecture for self-hosting the router and subgraphs in an enterprise cloud environment using Kubernetes and Helm. It demonstrates how to deploy the router and subgraphs in a cloud environment, including the use of Apollo GraphOS and the following enterprise GraphOS Router features:

Authentication/Authorization directives within the schema to secure your supergraph
External coprocessing to enable complex customizations to the router
Persisted Queries to improve performance and security through safelisting only known operations

Furthermore, the reference architecture demonstrates how to use the router with OpenTelemetry to collect and analyze performance and utilization metrics for your supergraph, as well as testing performance using k6.

tip

Check out the Deploying the Apollo Router at Apollo blog post to learn about Apollo's internal use of the router, including the performance improvements and resource utilization reductions.

Getting started

To get started with the reference architecture, follow the README in the reference repository. The README provides a how-to guide that walks you through building a supergraph with the reference architecture.

Architecture overview

The reference architecture uses two Kubernetes clusters, one for a development environment and other for a production environment. Each cluster has pods for:

Hosting the router
Hosting subgraphs
Hosting a client
Collecting traces
Load testing K6 and viewing results with Grafana

For both environments, GraphOS serves as a schema registry. Each subgraph publishes schema updates to the registry via CI/CD, and GraphOS validates and composes them into a supergraph schema. The router regularly polls an endpoint called Apollo Uplink to get the latest supergraph schema and routing configurations from GraphOS.

The router also pushes performance and utilization metrics to GraphOS via Uplink so you can analyze them in GraphOS Studio.

Development environment

The development environment consists of the router and subgraphs hosted in a Kubernetes cluster in either AWS or GCP. GraphOS validates subgraph schemas using schema checks and makes them available to the router via Uplink. The router also reports usage metrics back to GraphOS.

Production environment

The production environment is similar to the development environment with some additions.

The router and subgraphs send their OpenTelemetry data to a collector. You can then view the data in Zipkin.
A K6 load tester sends traffic to the router and stores load test results in InfluxDB for viewing in Grafana.

CI/CD

The reference architecture uses GitHub Actions for its CI/CD. These actions include:

PR-level schema checks
Building containers using Docker
Publishing subgraph schemas to Apollo Uplink
Deployments for:
- Router
- Subgraphs
- Client
- OpenTelemetry Collector
- Grafana
Running load tests using k6

Development actions

When a PR is submitted to one of the subgraphs, GitHub Actions uses GraphOS to validate schema changes using schema checks.

When the PR is merged, GitHub Actions publishes schema updates to Uplink, and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.

Production deploy

When you manually trigger a production deployment, GitHub Actions publishes schema updates to Uplink and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.

Deploy router

When you manually trigger a router deployment, GitHub Actions deploys the router to the Kubernetes cluster.

Deploy OpenTelemetry collector

When you manually trigger an OpenTelemetry deployment, GitHub Actions deploys the OpenTelemetry Collector and Zipkin to the Kubernetes cluster.

Deploy load test infrastructure

When you manually trigger a load test infrastructure deployment, GitHub Actions deploys the K6 Load Tester, Grafana, the load tests, and InfluxDB to the Kubernetes cluster.

Run load tests

When you manually trigger a load test run, GitHub Actions triggers the K6 Load Tester to pull the Load Tests from the environment, run the tests against the router, and store the results in InfluxDB.

Contracts

SAML

OIDC

Subgraph Reference

Development and Tooling

From Federation 1 to 2

Server-Driven UI

Subscription Support

Log Exporters

Metrics Exporters

Trace Exporters

Instrumentation

AWS Lattice