
Viaduct, Five Years On: Modernizing the Data-Oriented Service Mesh



A more powerful engine and a simpler API for our data-oriented mesh

9 min read · Sep 15, 2025

By: Adam Miskiewicz, Raymie Stata

In November 2020 we published a post about Viaduct, our data-oriented service mesh. Today, we’re excited to announce Viaduct is available as open-source software (OSS) at https://github.com/airbnb/viaduct.

Before we talk about OSS, here’s a quick update on Viaduct’s adoption and evolution at Airbnb over the last five years. Since 2020, traffic through Viaduct has grown by a factor of eight. The number of teams hosting code in Viaduct has doubled to 130+ (with hundreds of weekly active developers). The codebase hosted by Viaduct has tripled to over 1.5M lines (plus about the same in test code). We’ve achieved all this while keeping operational overhead constant, halving incident-minutes, and keeping costs growing linearly with QPS.

What’s the same?

Three principles have guided Viaduct since day one and still anchor the project: a central schema served by hosted business logic via a re-entrant API.

**Central schema**
Viaduct serves our central schema: a single, integrated schema connecting all of our domains across the company. While that schema is developed in a *decentralized* manner by many teams, it’s one, highly connected graph. Over 75% of Viaduct requests are internal because Viaduct has become a “one‑stop” data-oriented mesh connecting developers to all of our data and capabilities.

**Hosted business logic**
From the beginning, we’ve encouraged teams to host their business logic directly in Viaduct. This runs counter to what many consider best practice in GraphQL: that GraphQL servers should be a thin layer over microservices that host the real business logic. We’ve created a serverless platform for hosting business logic, allowing our developers to focus on writing that logic rather than on operational issues. As noted by Katie, an engineer on our Media team:

“As we migrate our media APIs into Viaduct, we’re looking forward to retiring a handful of standalone services. Centralizing everything means less overhead, fewer moving parts, and a much smoother developer experience!”

**Re-entrancy**
At the heart of our developer experience is what we call re-entrancy: logic hosted on Viaduct composes with other logic hosted on Viaduct by issuing GraphQL fragments and queries. Re-entrancy has been crucial for maintaining modularity in a large codebase and avoiding classic monolith hazards.

What’s changed?

For most of Viaduct’s history, evolution has been bottom-up and reactive to immediate developer needs. We added capabilities incrementally, which helped us move fast, but also produced multiple ways to accomplish similar tasks (some well‑supported, others not) and created a confusing developer experience, especially for new teams. Another side-effect of this reactive approach has been a lack of architectural integrity. The interfaces between the layers of Viaduct, described in more detail below, are loose and often arbitrary, and the abstraction boundary between the Viaduct framework and the code that it hosts is weak. As a result, it has become increasingly difficult to make changes to Viaduct without disrupting our customer base.

To address these issues, over a year ago we launched a major initiative we call “Viaduct Modern”, a ground-up overhaul of both the developer-facing API and the execution engine.

Tenant API

One driving principle of Viaduct Modern has been to simplify and rationalize the API we provide to developers in Viaduct, which we call the “Tenant API”. The following diagram captures the decision tree one faced when deciding how to implement functionality in the old API:


Viaduct’s original complex programming model

Each oval in this diagram represents a different mechanism for writing code. In contrast, the new API offers just two mechanisms: node resolvers and field resolvers.


Viaduct Modern’s simpler model

The choice between the two is driven by the schema itself, not by ad‑hoc distinctions based on a feature’s behavior. We unified the APIs for both resolver types wherever possible, which simplifies the developer experience. After four years of evolving the API in a use‑case‑driven manner, we distilled the best ideas into a single simple surface (and left the mistakes behind).

Tenant modularity

Strong abstraction boundaries are essential in any large codebase. Microservices achieve this via service definitions and RPC API boundaries; Viaduct achieves it via modules plus re‑entrancy.

Modularity in the central schema and hosted code has evolved. Initially, all we had was a vague set of conventions for organizing code into team-owned directories. There was no formal concept of a module, and schema and code were kept in separate source directories with unenforced naming conventions to connect the two. Over time, we evolved that into a more formal abstraction we call a “tenant module.” A tenant module is a unit of schema together with the code that implements that schema, and crucially, is owned by a single team. While we encourage rich graph‑level connections across modules, we discourage direct code dependencies between modules. Instead, modules compose via GraphQL fragments and queries. Viaduct Modern extends and simplifies these re‑entrancy tools.

Let’s look at an example. Imagine two teams: a “Core User” team that owns and manages the basic profile data of users, and a “Messaging” team that operates a messaging platform for users to interact with each other. In our example, the Messaging team would like to define a displayName field on User for use in their user interface. It would look something like this:

Core User team

    # Schema (GraphQL SDL)
    type User implements Node {
      id: ID!
      firstName: String
      lastName: String
      …
    }

    // Node resolver (Kotlin)
    class UserResolver : Nodes.User() {
      @Inject
      val userClient: UserServiceClient

      @Inject
      val userResponseMapper: UserResponseMapper

      override suspend fun resolve(ctx: Context): User {
        val r = userClient.fetch(ctx.id)
        return userResponseMapper(r)
      }
    }

This is the base definition of the User type that lives in the Core User team’s module. This base definition defines the first- and last-name fields (among many others), and it’s the Core User team’s responsibility to materialize those fields.

Messaging team

    # Schema extension (GraphQL SDL)
    extend type User {
      displayName: String @resolver
    }

    // Field resolver (Kotlin)
    @Resolver("firstName lastName")
    class DisplayNameResolver : UserResolvers.DisplayName() {
      override suspend fun resolve(ctx: Context): String {
        val f = ctx.objectValue.getFirstName()
        val l = ctx.objectValue.getLastName()
        return "$f ${l.first()}."
      }
    }

The Messaging team can then extend the User type with the displayName field and indicate that they intend to provide a resolver for it. The code has an @Resolver annotation that indicates which fields of the User object it needs in order to implement the displayName field. The Messaging team doesn’t need to understand which module these fields come from, and their code doesn’t depend on code from the Core User team. Instead, the Messaging team states their data needs in a declarative fashion.
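To make the declarative idea concrete, here is a minimal, self-contained sketch of how an engine could use a resolver's declared field list to gather data on its behalf. It is not the Viaduct API: `FieldResolver`, `requiredSelections`, `CoreUserModule`, `fetchField`, and `execute` are hypothetical names invented for illustration, and the real engine's types and execution model will differ.

```kotlin
// Hypothetical, simplified model for illustration only. This is NOT the Viaduct API.

// A field resolver declares the sibling fields it needs as a space-separated list.
interface FieldResolver<T> {
    val requiredSelections: String
    fun resolve(objectValue: Map<String, Any?>): T
}

// Stand-in for the Core User module: the only code that knows how user fields are loaded.
class CoreUserModule {
    // Records which fields were requested, so we can see what the engine asked for.
    val fetchLog = mutableListOf<String>()

    private val users = mapOf(
        "u1" to mapOf(
            "firstName" to "Ada",
            "lastName" to "Lovelace",
            "email" to "ada@example.com"
        )
    )

    fun fetchField(id: String, field: String): Any? {
        fetchLog += field
        return users[id]?.get(field)
    }
}

// Stand-in for the Messaging module: it names the fields it needs,
// but has no reference to CoreUserModule.
class DisplayNameResolver : FieldResolver<String> {
    override val requiredSelections = "firstName lastName"

    override fun resolve(objectValue: Map<String, Any?>): String {
        val f = objectValue["firstName"] as String
        val l = objectValue["lastName"] as String
        return "$f ${l.first()}."
    }
}

// Stand-in for the engine: reads the declaration, fetches exactly those fields
// from the owning module, then hands the result to the resolver.
fun <T> execute(resolver: FieldResolver<T>, id: String, owner: CoreUserModule): T {
    val fields = resolver.requiredSelections.split(" ").filter { it.isNotBlank() }
    val objectValue = fields.associateWith { owner.fetchField(id, it) }
    return resolver.resolve(objectValue)
}
```

In this sketch, calling `execute(DisplayNameResolver(), "u1", CoreUserModule())` yields `"Ada L."`, and only `firstName` and `lastName` are fetched. The resolver never touches the owning module directly, so the owner is free to change how those fields are materialized without breaking the Messaging code.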

Framework modularity

[...]

