Saga Pattern: Choreography, Orchestration, and Compensating Transactions

Saga Pattern

In a monolith, a single database transaction can span multiple operations atomically. In microservices, each service owns its database. There is no distributed transaction that works reliably across services. The saga pattern solves this by breaking a transaction into a sequence of local transactions, each with a corresponding compensating transaction that undoes its work if a later step fails.

The Problem: No Distributed ACID

Consider an order placement that must: (1) reserve inventory, (2) charge payment, (3) create shipment. In a monolith, this is one transaction. In microservices, these are three services with three databases. Two-phase commit (2PC) across them is fragile and slow, and most message brokers and many modern databases do not support it across service boundaries.
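The order flow above can be sketched as an orchestrated saga. This is a minimal in-memory illustration, not a production implementation: `SagaStep`, `run_saga`, and the lambda-based "services" are hypothetical names invented for this example, and the shipment step is made to fail on purpose so that the compensations run.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # local transaction in one service
    compensation: Callable[[], None]  # undoes the action's effect


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step did not commit, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_shipment() -> None:
    # Hypothetical failure used to trigger the compensation path.
    raise RuntimeError("shipment service unavailable")


order_saga = [
    SagaStep("inventory",
             lambda: log.append("reserve inventory"),
             lambda: log.append("release inventory")),
    SagaStep("payment",
             lambda: log.append("charge payment"),
             lambda: log.append("refund payment")),
    SagaStep("shipment",
             fail_shipment,
             lambda: log.append("cancel shipment")),
]

succeeded = run_saga(order_saga)
```

Here the payment is refunded before the inventory is released, because compensations run in the reverse order of the completed steps. A real saga would also need to persist its progress and retry compensations that fail, which this sketch leaves out.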

Event-Driven Architecture for Microservices


In a microservices architecture, services need to communicate. The two fundamental approaches are synchronous (request-response) and asynchronous (event-driven). Most systems use both – the decision is which interactions should be synchronous and which should be event-driven.

Synchronous vs Asynchronous Communication

Synchronous (request-response): Service A calls Service B and waits for a response. Simple, familiar, and works well when A needs the response to continue. The cost is temporal coupling – if B is down, A fails.
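Temporal coupling can be illustrated with a small in-memory simulation. Everything here is an assumption made for the example: `ServiceB`, `call_sync`, `publish`, and `drain` are hypothetical names, and a `deque` stands in for a real message broker.

```python
from collections import deque


class ServiceB:
    """Stand-in for a downstream service that can be up or down."""

    def __init__(self) -> None:
        self.up = True
        self.handled = []

    def handle(self, request: str) -> str:
        if not self.up:
            raise ConnectionError("service B is down")
        self.handled.append(request)
        return f"ok:{request}"


def call_sync(b: ServiceB, request: str) -> str:
    # Synchronous: the caller fails immediately if B is unavailable.
    return b.handle(request)


queue = deque()  # stand-in for a broker queue


def publish(event: str) -> None:
    # Asynchronous: the caller only needs the queue to be available.
    queue.append(event)


def drain(b: ServiceB) -> None:
    # B processes buffered events; an event is removed only after success.
    while queue:
        b.handle(queue[0])
        queue.popleft()


b = ServiceB()
b.up = False

try:
    call_sync(b, "order-1")
    sync_failed = False
except ConnectionError:
    sync_failed = True

publish("order-2")  # accepted even though B is down
b.up = True
drain(b)            # B catches up once it is back
```

The synchronous call is lost when B is down, while the published event waits in the queue and is handled after B recovers. The trade-off is that the publisher gets no response to act on.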

Message Queue Selection and Patterns


Every microservice architecture eventually needs asynchronous communication. Synchronous HTTP calls between services create tight coupling, cascading failures, and latency chains. Message queues decouple producers from consumers, absorb traffic spikes, and enable event-driven workflows. The hard part is picking the right one.

Core Concepts That Apply Everywhere

Before comparing specific systems, understand the delivery guarantees they can offer:

  • At-most-once: The message might be lost, but it is never delivered twice. Fast, no overhead, acceptable for metrics or logs where occasional loss is tolerable.
  • At-least-once: The message is guaranteed to arrive, but might arrive more than once. The consumer must handle duplicates (idempotency). This is the most common choice.
  • Exactly-once: The message arrives exactly once. This is extremely hard to achieve in distributed systems. Kafka offers it within its ecosystem via idempotent, transactional producers and consumers that read only committed messages, but end-to-end exactly-once across system boundaries requires idempotent consumers anyway.
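Handling duplicates under at-least-once delivery usually comes down to an idempotent consumer. The sketch below assumes every message carries a unique `id` field. That field, the `amount` field, and the `IdempotentConsumer` class are hypothetical, chosen only to illustrate the idea.

```python
class IdempotentConsumer:
    """Applies each message at most once by remembering processed ids."""

    def __init__(self) -> None:
        self.seen_ids = set()
        self.balance = 0

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        msg_id = message["id"]
        if msg_id in self.seen_ids:
            return False  # redelivery: skip the side effect
        self.balance += message["amount"]
        self.seen_ids.add(msg_id)
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "amount": 50},
    {"id": "m2", "amount": 20},
    {"id": "m1", "amount": 50},  # the broker redelivers m1
]
results = [consumer.handle(m) for m in deliveries]
```

In a real service the set of seen ids would live in durable storage, and recording the id would need to happen in the same local transaction as the side effect. Otherwise a crash between the two can still apply a message twice or drop it.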

Ordering matters too. Some systems guarantee order within a partition or queue. Others provide no ordering at all. If your consumers process messages out of order, you need to handle that in application logic or choose a system that preserves order.
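One way to handle out-of-order messages in application logic is a resequencing buffer. The sketch below assumes the producer attaches a gap-free, increasing sequence number to each message. The `Resequencer` class and the sample payloads are hypothetical.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq
        self.pending = {}

    def accept(self, seq: int, payload: str) -> list:
        """Store one message; return the payloads that are now deliverable."""
        if seq < self.next_seq:
            return []  # already delivered: stale or duplicate
        self.pending[seq] = payload
        ready = []
        # Release the longest run of consecutive messages starting at next_seq.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


resequencer = Resequencer()
delivered = []
for seq, payload in [(2, "paid"), (1, "created"), (3, "shipped")]:
    delivered.extend(resequencer.accept(seq, payload))
```

Message 2 is held back until message 1 arrives, so the consumer sees `created`, `paid`, `shipped` in that order. A buffer like this stalls if a sequence number never arrives, so a practical version would need a timeout or a gap-handling policy.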