Organizations embrace micro-services and event-driven APIs in their technology platforms to try to achieve the promise of greater agility, increased innovation, and more autonomy for their development teams. However, after the initial success, it is not unusual for organizations to face difficulties when they try to scale their distributed platforms. At this point, with the increase in scale, questions about API consistency, evolution, discoverability and observability become critical, and, at the same time, increasingly difficult to answer for organizations managing API landscapes with hundreds of APIs and event streams.
Without those critical considerations, the API landscape of an organization can quickly become an entangled, fragile mess of inconsistent interfaces and hidden services that is difficult to change and evolve. The infamous micro-services “death star” diagrams from some of the pioneers in this area are a great way of visually conveying the complexity that these distributed architectures can generate.
Dealing with complexity at scale without impacting team agility is the main strategic challenge faced by architects and program managers of large API portfolios.
In this post we will describe API Federation, an architectural pattern that can help deal with these problems. We will also describe how DataGraph, a new product by MuleSoft, can help API teams introduce API federation in their organizations, using GraphQL to expose the federated API.
So what is API Federation anyway?
In a nutshell, API Federation is the set of design principles, tools, and infrastructure that make it possible to expose a set of services and event streams within a particular bounded context as a unified and consistent API for external customers, while allowing individual services within the bounded context to evolve and change without additional restrictions.
Before diving into the details of the API Federation pattern, let’s state upfront what API Federation is not about:
API Federation is not about creating a single unified canonical model. Canonical models spanning full organizations don’t work. At best, they become failed projects because teams cannot match the supposedly ‘canonical’ schema designed by a centralized team alien to their day-to-day necessities and business requirements; at worst, they can freeze an organization, leaving it to slowly crumble under semantic inconsistencies and an inability to evolve.
API Federation is not about GraphQL. To be precise, it does not require the use of GraphQL. GraphQL is a great way of querying APIs and provides a powerful mechanism to describe the schema of a service, but it is not necessarily the best way of designing and building all your APIs. Federation can work without GraphQL and is a perfect match for REST APIs. In fact, decentralization is at the core of web architecture, and the Linked Data movement has been promoting similar ideas for a long time.
So having clarified these points, let’s focus on describing the main aspects of the API Federation pattern.
1. Bounded context and common model
The first step towards federation is dividing your API landscape into bounded contexts.
The notion of bounded context here is taken directly from the ideas about domain-driven design (DDD) introduced by Eric Evans in his seminal “Domain-Driven Design” book.
By bounded context, we designate a set of services and event streams that consistently expose a set of shared common concepts. This idea is very similar to the concept of “cluster of codesigned services” introduced by Evans when revisiting the notion of bounded context from the micro-services point of view.
Traditionally, the software modules within a bounded context are defined as sharing a “ubiquitous language” describing the shared domain model. From the point of view of API federation, we can be more explicit and capture the notion of a common model through practical rules for three key aspects of the model: semantics, identity and schema:
Common semantics: a particular noun used for a particular entity, attribute or event, like Customer or Account, means exactly the same thing in all services and events within the bounded context. Clashes in meaning must be resolved by renaming the ambiguous concept or introducing some namespacing mechanism.
Unified identifiers: the value identifying an entity, for example, the tuple of attributes forming the primary key for a particular entity type like Customer, is the same among all the services in the bounded context. If two services belong to the same bounded context, the same identifier for the same entity type in both services must designate the same entity.
Compatible schemas: the schema used to expose model information in the services and events of the bounded context must be consistent. This does not mean that every service must share the same schema, but inconsistencies, such as using incompatible data types for the same attribute in a model entity, are not allowed.
The goal of API federation is to expose all the resources, capabilities, and events within a bounded context as a unified API for external customers exposing a common and consistent schema that hides the complexity of the internal set of services and provides a simplified operational interface for the bounded context that can be easily productized.
Ideally, the generation of the federated schema should be completely declarative and rely only on metadata and the introspection capabilities of the source APIs. For example, in MuleSoft’s DataGraph, we consume the API contract of RESTful APIs exposed as OAS or RAML enriched with some additional metadata for federation:
Entity type keys: description of the keys that uniquely identify each entity of an entity type. The key can be a set of properties from the entity schema or it can be a unique global identifier property, like a URI. Entity types might have alternative identifiers.
Typed entity references: standard way of referencing an instance of an entity that can be available in other services within the bounded context. References contain just the type of the entity and the key attributes.
It is worth mentioning that the referencing mechanism is different from traditional hypermedia links, which directly connect two services. References connect them indirectly, through the common entity type and set of key attributes. Unlike links, references do not break if a target service is removed.
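To illustrate how references differ from links, the following Python sketch models entity keys and typed references as plain data. The class names, the `registry` of resolver functions, and the sample customer and order records are assumptions made for this example; they are not DataGraph's metadata format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class EntityKey:
    """The properties that uniquely identify entities of one type."""
    entity_type: str
    properties: Tuple[str, ...]


@dataclass(frozen=True)
class EntityReference:
    """A reference carries only the entity type and its key values."""
    entity_type: str
    key_values: Tuple[Tuple[str, str], ...]


def make_reference(key: EntityKey, record: dict) -> EntityReference:
    """Build a reference from the key properties found in a record."""
    values = tuple((prop, str(record[prop])) for prop in key.properties)
    return EntityReference(key.entity_type, values)


Resolver = Callable[[EntityReference], dict]


def resolve(ref: EntityReference, registry: Dict[str, List[Resolver]]) -> List[dict]:
    """Ask every service currently registered for the type to resolve the reference."""
    return [resolver(ref) for resolver in registry.get(ref.entity_type, [])]


# Hypothetical data: an order that points at a customer by key only.
customer_key = EntityKey("Customer", ("customerId",))
order = {"orderId": "o-1", "customerId": "c-42"}
ref = make_reference(customer_key, order)

registry: Dict[str, List[Resolver]] = {
    "Customer": [lambda r: {**dict(r.key_values), "name": "Ada"}],
}
print(resolve(ref, registry))

# Removing the target service leaves the reference itself valid;
# it simply resolves to nothing until another service provides the type.
registry["Customer"].clear()
print(resolve(ref, registry))
```

Because the reference never stores a service address, nothing in the order record has to change when the service behind `Customer` is replaced.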
Given support for this federation metadata in the source services, the schema of the federated API can be computed using three main federation patterns:
Entity extension: computing the final set of attributes and operations for a type of entity in the federated API schema as the union of all the attributes and operations exposed for that type of entity by all the services in the bounded context sharing the same key.
Entity linking: connecting two types of entities in the federated schema if a reference exists for that type in any of the services of the bounded context.
Entity composition: simple composition of the schema when services in the bounded context expose information about a type of entity but don’t support keys. This is the case, for example, with “value objects,” like “MonetaryAmount”, where the notion of entity identity does not apply. In this case, attributes become optional, and the final set of attributes obtained will depend on the entry-point to the federated API.
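The three patterns can be sketched in a few lines of Python. This is a simplified illustration, not DataGraph's algorithm: the dictionary layout used to describe each service is hypothetical, only attributes (not operations) are merged, and services that declare a key for a type are assumed to declare the same key.

```python
def federate(services):
    """Compute a federated schema from per-service entity descriptions."""
    schema = {}
    for entities in services:
        for entity in entities:
            fed = schema.setdefault(
                entity["type"],
                {"attributes": {}, "optional": set(), "links": set()},
            )
            # Union of attributes: with a shared key this is entity extension.
            fed["attributes"].update(entity["attributes"])
            if not entity.get("key"):
                # Entity composition: without a key the attributes cannot be
                # joined across services, so they are only optionally present.
                fed["optional"].update(entity["attributes"])
            # Entity linking: a typed reference connects two entity types.
            fed["links"].update(entity.get("references", {}).values())
    return schema


# Hypothetical services in one bounded context.
customers = [
    {"type": "Customer", "key": ("customerId",),
     "attributes": {"customerId": "string", "name": "string"}},
]
billing = [
    {"type": "Customer", "key": ("customerId",),
     "attributes": {"customerId": "string", "balance": "MonetaryAmount"}},
    {"type": "MonetaryAmount", "attributes": {"value": "number"}},
]
orders = [
    {"type": "Order", "key": ("orderId",),
     "attributes": {"orderId": "string"},
     "references": {"customer": "Customer"}},
    {"type": "MonetaryAmount", "attributes": {"currency": "string"}},
]

schema = federate([customers, billing, orders])
print(sorted(schema["Customer"]["attributes"]))
print(sorted(schema["Order"]["links"]))
print(sorted(schema["MonetaryAmount"]["optional"]))
```

In this example `Customer` is extended by two services, `Order` is linked to `Customer`, and `MonetaryAmount` is composed from keyless contributions whose attributes are all optional.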
In this way, the bounded context accessible through a federated API becomes the basic organizational unit used to decompose the API landscape of an organization. Techniques like Context Mapping, also originating in the DDD community and successfully embraced by micro-services practitioners, can be used to track the set of bounded contexts within an organization and the flow of information among them.
2. Feature-based API design and federated composition
The notion of bounded context and tools like context mapping help us think effectively about the whole API landscape of an organization, but we still need to understand how API federation affects the design of each of the services that are implemented within a context.
From the point of view of API federation, the key design challenge is to decide the right granularity of the individual services. Since API federation introduces a mechanism to compose data and capabilities in an effective way, API teams can focus on more granular and modular “API features” that can be implemented and versioned independently and delegate the final generation of the interface for the whole bounded context to the federation layer.
An API feature, as defined by Ruben Verborgh et al. in their paper “A Web API Ecosystem through Feature-Based Reuse,” is any type of atomic capability that can be described, discovered, and invoked in a consistent way across multiple APIs.
Instead of designing “monolithic” APIs that expose all types of functionality over a set of data in an inconsistent way across services, feature-based design proposes a bottom-up approach, where interfaces are decomposed into modular “features” that can be versioned independently and exposed with a standard contract across services and teams.
The unified HTTP “CRUD” interface can be considered one of these features, but others, like pagination, sorting and filtering of attributes, free-text search, data-change events, bulk-load, and so on, can also be defined.
API clients can discover these features and depend only on individual feature contracts without becoming coupled to the whole API version, as is the case with non-modular monolithic APIs.
Individual teams can also work on providing different features in an independent way, using, for example, technologies like server-less functions, and the smaller granularity of the features also makes it easier to be consistent on the feature contract across teams.
Provided this modular approach to API design, the API Federation layer can take care of composing the individual API features, described in a consistent way, and expose them combined in the federated schema.
In the case of MuleSoft’s DataGraph, query is the initial federated capability that we are exposing in the Federated API. To achieve it, the individual services must support at least one specific API feature: entity-resolver operations.
Entity-resolver operations are operations in the service interface that, given the key of an entity in the model, return the data or associated capability for that entity.
Using information about the keys and entity-resolver operations in each service, the federation infrastructure can execute a query expressed in GraphQL by extracting the right information from the federated services: for example, fetching the requested underlying resources for an entity, joining them by their primary keys, and filtering the requested fields, or following references encoded in other parts of the federated schema.
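The query flow described above can be sketched as a small recursive function. This is a minimal illustration under several assumptions: resolvers are plain Python callables taking a key, a query is a nested dictionary standing in for a GraphQL selection set, and the `references` table mapping a field to its target type and key attribute is a hypothetical encoding of the federation metadata.

```python
def execute(entity_type, key, selection, resolvers, references):
    """Resolve one entity across services, then project the selected fields."""
    # Fetch the underlying resources and join them on the shared key.
    record = {}
    for resolver in resolvers.get(entity_type, []):
        record.update(resolver(key))

    result = {}
    for field, sub_selection in selection.items():
        if (entity_type, field) in references:
            # Follow a typed reference into another entity type.
            target_type, key_attr = references[(entity_type, field)]
            result[field] = execute(
                target_type, record[key_attr], sub_selection, resolvers, references
            )
        elif field in record:
            # Keep only the fields the client asked for.
            result[field] = record[field]
    return result


# Hypothetical entity resolvers exposed by three services.
resolvers = {
    "Customer": [
        lambda k: {"customerId": k, "name": "Ada"},
        lambda k: {"customerId": k, "balance": 12.5},
    ],
    "Order": [lambda k: {"orderId": k, "customerId": "c-42", "total": 99}],
}
references = {("Order", "customer"): ("Customer", "customerId")}

# Stands in for: { order(orderId: "o-1") { orderId customer { name balance } } }
query = {"orderId": None, "customer": {"name": None, "balance": None}}
print(execute("Order", "o-1", query, resolvers, references))
```

A real implementation would also need batching, error handling, and parallel fetches, which are left out here to keep the join-and-project logic visible.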
Support for additional API features at the source level, like filtering of fields, ‘expansion’ of nested entities, or full query support over the entities can make the support of query at the federated level more efficient, for example, making it possible to push sub-queries directly to the source. In this way, understanding the underlying features supported by the individual services in the bounded context can be used to negotiate dynamically the most efficient way of supporting the feature at the federated level.
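As a rough sketch of this kind of negotiation, the function below pushes the field selection down to a source when that source advertises a field-filtering feature, and otherwise fetches the full record and filters it in the federation layer. The `features` set, the feature name `"field-filtering"`, and the `get` callables are all hypothetical.

```python
DATA = {"c-42": {"customerId": "c-42", "name": "Ada", "email": "ada@example.com"}}


def plain_get(key, fields=None):
    """A source without field filtering: always returns the full record."""
    return dict(DATA[key])


def filtering_get(key, fields=None):
    """A source that supports filtering fields at the source."""
    record = DATA[key]
    if fields is None:
        return dict(record)
    return {f: record[f] for f in fields if f in record}


def fetch(source, key, fields):
    """Push the projection to the source when possible, else filter locally."""
    if "field-filtering" in source["features"]:
        return source["get"](key, fields)
    full = source["get"](key)
    return {f: full[f] for f in fields if f in full}


legacy = {"features": set(), "get": plain_get}
modern = {"features": {"field-filtering"}, "get": filtering_get}

print(fetch(legacy, "c-42", ["name"]))
print(fetch(modern, "c-42", ["name"]))
```

Both calls return the same data to the client; the difference is how much the source has to send to the federation layer.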
The approach is by no means constrained to query support. Other features like search or data change events can also be offered at the federated level, given the right metadata at the source level and runtime support at the federated level.
3. Consistency and evolution
So far we have described how the full API landscape can be split into consistent bounded contexts and how individual API features in each context can be composed as a federated API providing a unified interface for all the services within the domain. The next thing to consider is how each of these federated APIs can change and evolve.
The main notion here is the definition of a “federation protocol,” a set of rules governing the federated API and encapsulating the contract that services within the bounded context must adhere to in order to be part of the federated API.
The federation protocol can be tuned to strike the right balance for an organization between the ability to change the individual APIs and the degree of centralized governance over the federated schema. For example, it can be configured to automatically allow any schema change introduced by a change in an individual API, using mechanisms like auto-generated aliases for fields and entities that would otherwise result in an inconsistent schema, or defaulting to entity composition if a service provides data and capabilities for an entity but cannot provide an entity resolver. At the opposite extreme, it could be tuned to require strict compliance with a centralized canonical schema for the bounded context. Organizations will typically adopt a federated governance model somewhere between these two extremes.
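At the permissive end of that spectrum, auto-generated aliases might look like the following sketch. The alias naming scheme (service name, underscore, attribute name) is an assumption made for illustration only.

```python
def merge_attributes(federated, service_name, attributes):
    """Merge a service's attributes, aliasing any that clash on type."""
    merged = dict(federated)
    for name, attr_type in attributes.items():
        if name in merged and merged[name] != attr_type:
            # Hypothetical alias scheme: keep both definitions side by side.
            merged[f"{service_name}_{name}"] = attr_type
        else:
            merged[name] = attr_type
    return merged


federated = {"customerId": "string", "balance": "number"}
incoming = {"balance": "MonetaryAmount", "status": "string"}
print(merge_attributes(federated, "billing", incoming))
```

The schema stays consistent without any manual step, at the cost of exposing two attributes that clients must tell apart.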
In the case of MuleSoft’s DataGraph, we have defined a flexible federation protocol that encourages a federated governance model that tries to maximize the usage of federation patterns like extension and references in the schema and requires manual intervention to resolve semantic conflicts:
We raise a conflict, rather than defaulting to entity composition, if a service contains an entity that is already federated with a defined key and cannot provide an entity-resolver operation.
We raise a conflict, rather than generating an automatic unique alias for the conflicting attribute, if an attribute is inconsistent with its definition in the federated schema.
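A check in the spirit of these two rules could be sketched as follows. The data layout, the `has_resolver` flag, and the message wording are hypothetical; this is not how DataGraph represents or reports conflicts.

```python
def find_conflicts(federated, service_name, entities):
    """Return human-readable conflicts for a candidate service release."""
    conflicts = []
    for entity in entities:
        fed = federated.get(entity["type"])
        if fed is None:
            continue  # a new entity type cannot clash with anything yet
        # Rule 1: a keyed federated entity needs an entity resolver.
        if fed.get("key") and not entity.get("has_resolver", False):
            conflicts.append(
                f"{service_name}: {entity['type']} is federated with a key "
                "but no entity resolver is provided"
            )
        # Rule 2: attribute definitions must match the federated schema.
        for name, attr_type in entity["attributes"].items():
            expected = fed["attributes"].get(name)
            if expected is not None and expected != attr_type:
                conflicts.append(
                    f"{service_name}: {entity['type']}.{name} is {attr_type}, "
                    f"federated schema expects {expected}"
                )
    return conflicts


federated = {
    "Customer": {
        "key": ("customerId",),
        "attributes": {"customerId": "string", "name": "string"},
    }
}
candidate = [
    {"type": "Customer", "has_resolver": False,
     "attributes": {"customerId": "integer", "segment": "string"}},
]
for conflict in find_conflicts(federated, "marketing", candidate):
    print(conflict)
```

New attributes such as `segment` pass through untouched; only the missing resolver and the mismatched `customerId` type are reported for manual resolution.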
Individual teams can still change their APIs in any other way, introducing new attributes or even introducing breaking changes like removing attributes and entities. The federated schema will adapt automatically to these changes and reflect the new version of the federated API.
It is also important to note that the introduction of API federation can be gradual. It is not necessary to introduce keys, references, and entity extension from the first moment. Entities can initially be added, using aliases if required, and then advanced federation patterns can be introduced progressively as potential inconsistencies among services are detected and resolved.
General rules about API evolution, like the recommendation of maintaining backwards compatibility over the schema, are still good practices even if the API is consumed through the federated API.
Finally, an interesting practice is to federate multiple versions of the service if the technology used at the federated level, like GraphQL, allows for explicit support for deprecation of entities and attributes. This can provide a guided path for clients of the federated schema to understand potential breaking changes in the federated API contract.
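As a sketch of this practice, the function below federates two versions of a service's entity and emits a GraphQL type in which fields dropped by the newer version are kept but marked with GraphQL's built-in `@deprecated` directive. The field maps, type names, and reason text are hypothetical, and the reason string is assumed not to contain double quotes.

```python
def federated_type_sdl(type_name, old_fields, new_fields, reason):
    """Render a GraphQL type that keeps removed fields as deprecated."""
    lines = [f"type {type_name} {{"]
    # Old fields first, then fields that only exist in the new version;
    # for fields present in both, the new definition wins.
    for name, field_type in {**old_fields, **new_fields}.items():
        if name in new_fields:
            lines.append(f"  {name}: {field_type}")
        else:
            lines.append(f'  {name}: {field_type} @deprecated(reason: "{reason}")')
    lines.append("}")
    return "\n".join(lines)


v1 = {"customerId": "ID!", "name": "String", "fax": "String"}
v2 = {"customerId": "ID!", "name": "String", "email": "String"}
sdl = federated_type_sdl("Customer", v1, v2, "Removed in v2 of the source API")
print(sdl)
```

Clients that introspect the federated schema see the deprecation on `fax` before the older service version is retired.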
4. Runtime transparency
Providing an API Federation layer for a bounded context requires additional runtime logic to materialize the federated API features, but it is important for this federation runtime layer to be as transparent as possible and not to break the regular behavior of the remaining API features that are still offered by the individual services.
API runtime and management concerns like authorization, caching, SLAs/SLOs, rate limiting, etc. should still work as intended even if the service is accessed through the federated interface.
Teams owning the individual services must still be able to autonomously pick the right set of functional and non-functional policies for their services, and the API federation layer should interoperate seamlessly with these concerns using web and internet standards. For example, the federation layer should support standard HTTP caching or distributed tracing.
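One way a federation layer could respect the caching policies of its sources is to derive the `Cache-Control` header of a federated response from the headers returned by each service. The following sketch is an assumption about how that might be done, not a description of DataGraph: it only understands the `no-store` and `max-age` directives, assumes well-formed `max-age` values, and falls back to `no-cache` whenever a source gives no explicit lifetime.

```python
from typing import List


def combined_cache_control(headers: List[str]) -> str:
    """Derive a conservative Cache-Control value from the source responses."""
    parsed = [
        [d.strip().lower() for d in header.split(",") if d.strip()]
        for header in headers
    ]
    # If any source forbids storing its response, so does the federated one.
    if any("no-store" in directives for directives in parsed):
        return "no-store"

    ages = []
    for directives in parsed:
        age = next(
            (int(d.split("=", 1)[1]) for d in directives if d.startswith("max-age=")),
            None,
        )
        if age is None:
            return "no-cache"  # no explicit lifetime: always revalidate
        ages.append(age)
    # The combined response is only fresh as long as its shortest-lived part.
    return f"max-age={min(ages)}" if ages else "no-cache"


print(combined_cache_control(["max-age=300", "public, max-age=60"]))
print(combined_cache_control(["max-age=300", "no-store"]))
```

Taking the minimum lifetime keeps each team's caching decision intact: no source response is served from the federated layer for longer than its owner allowed.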
On the other hand, the introduction of federation at the schema and API feature level will require supporting some of the other aspects of the API in a federated way for the federated layer to work. Authentication and data identifiers are two examples.
Client identity and authentication: some kind of federated authentication capable of allowing clients to use a single credential and common authentication mechanism that can work across all the services being federated is required for the API federation runtime to orchestrate requests among them.
Common data identifiers: shared identifiers for data entities within the bounded context are required for the extension and referencing federation patterns to work properly.
Finally, the introduction of the API federation layer should not impact the performance of the clients accessing the services exposed through the federated API. Moreover, the decentralized nature of API federation and the modular approach based on API features can make it possible to efficiently allocate resources to scale the specific services that can become bottlenecks for clients of the federated API.
5. Automation and API Operations
Adding the federation layer on top of the services and events of a bounded context will require adding federation as an additional concern for the release infrastructure in the API Software Development Life-cycle (SDLC) of an organization.
Depending on how the federated API is materialized, for example as a lightweight federation query gateway capturing north-south traffic for the bounded context as is the case in MuleSoft’s DataGraph, or as a sidecar in the client applications, metadata must be fetched and processed and the runtime capability must be deployed. In the same way, the federated API runtime must be updated with each release of one of the federated services, including consistency checks with the federation protocol.
This requires a robust and highly automated API SDLC, covering the CI and CD steps, that must also include support for cataloging and checking the metadata of the API contract of the services being federated.
As an example, MuleSoft’s DataGraph integrates with the rest of the components of the Anypoint API platform to automate the support for API federation as much as possible, making use of the metadata capabilities of our API parsing infrastructure, catalog and governance as well as our integrated deployment, management, and runtime components to offer a fully managed federation layer as a service to our customers. This level of integration and automation allows us to offer DataGraph as a simple clicks-not-code experience that can be used to drive the whole process from the UI, simplifying the governance experience of the federated API.
Additionally, DataGraph offers logging and monitoring capabilities for the federated API that are essential to understanding the behavior of the client applications, measuring feature usage, and debugging and troubleshooting complex integration scenarios.
Closing thoughts
Summing up, API Federation offers a tool to strategically think about how to deal with complex API landscapes in organizations that are adopting micro-services at scale. It builds on tested software engineering ideas about domain modeling and modularization and translates them into a set of design principles and tools that materialize those principles into an actionable interface for a whole bounded context.
API Federation also offers a way of addressing the complex trade-off between team autonomy and the ability to change services in a decentralized way on one hand and consistency and centralized enforcement of constraints on the other through the introduction of an explicit federation protocol governing the federated API.
Finally, API Federation can be used as leverage to advance the degree of automation and the sophistication of the infrastructure supporting the API SDLC of an organization and can be used to transform theoretical discussions on API best practices, design, and reuse into measurable benefits in the form of a consistent and unified federated interface.
If you are interested in learning how we are supporting this architectural pattern in MuleSoft through the introduction of DataGraph and want to give it a try, just dive into the documentation and start playing with the possibilities it opens.
API Federation: growing scalable API landscapes was originally published in Salesforce Engineering on Medium.