CQRS and Event Sourcing#

CQRS (Command Query Responsibility Segregation) and event sourcing are frequently discussed together but are independent patterns. You can use either one without the other. Understanding each separately is essential before combining them.

CQRS: Separate Read and Write Models#

Most applications use the same data model for reads and writes. The same orders table serves the API that creates orders and the API that lists them. This works until read and write requirements diverge significantly.
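
To make the split concrete, here is a minimal sketch assuming simple in-memory stores: the write model validates and stores commands, while a separate read model keeps a denormalized projection optimized for listing. The names (OrderWriteModel, OrderReadModel, total_cents) are illustrative, not a prescribed API, and in production the projection would be updated asynchronously from events or change data capture rather than called inline.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CreateOrder:
    """Command handled by the write side."""
    order_id: str
    customer_id: str
    total_cents: int


class OrderWriteModel:
    """Write side: validates commands and stores normalized order records."""

    def __init__(self) -> None:
        self._orders: Dict[str, CreateOrder] = {}

    def handle(self, cmd: CreateOrder) -> None:
        if cmd.total_cents <= 0:
            raise ValueError("order total must be positive")
        self._orders[cmd.order_id] = cmd


class OrderReadModel:
    """Read side: a denormalized projection shaped for the listing API."""

    def __init__(self) -> None:
        self._by_customer: Dict[str, List[dict]] = {}

    def project(self, cmd: CreateOrder) -> None:
        # In production this is driven by events or change data capture.
        self._by_customer.setdefault(cmd.customer_id, []).append(
            {"order_id": cmd.order_id, "total_cents": cmd.total_cents}
        )

    def orders_for(self, customer_id: str) -> List[dict]:
        return self._by_customer.get(customer_id, [])


# Usage: the write path and the read path no longer share a schema.
write_model = OrderWriteModel()
read_model = OrderReadModel()
cmd = CreateOrder(order_id="o-1", customer_id="c-9", total_cents=4200)
write_model.handle(cmd)
read_model.project(cmd)
print(read_model.orders_for("c-9"))
```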

Agent Memory and Retrieval#

An agent without memory repeats mistakes, forgets context, and relearns the same facts every session. An agent with too much memory wastes context window tokens on irrelevant history and retrieves noise instead of signal. Effective memory sits between these extremes – storing what matters, retrieving what is relevant, and forgetting what is stale.

This reference covers the concrete patterns for building agent memory systems, from simple file-based approaches to production-grade retrieval pipelines.
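
As a baseline at the file-based end of that spectrum, the sketch below assumes an append-only JSONL file and naive word-overlap scoring. The FileMemory name, default path, and scoring function are illustrative stand-ins for whatever store and retriever a real system uses (embeddings, BM25, a vector database).

```python
import json
from pathlib import Path
from typing import List, Optional


class FileMemory:
    """Append-only JSONL memory with naive keyword retrieval."""

    def __init__(self, path: str = "agent_memory.jsonl") -> None:
        self.path = Path(path)

    def remember(self, text: str, tags: Optional[List[str]] = None) -> None:
        record = {"text": text, "tags": tags or []}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def recall(self, query: str, limit: int = 5) -> List[str]:
        # Score by word overlap; a real retriever would use embeddings or BM25.
        if not self.path.exists():
            return []
        query_words = set(query.lower().split())
        scored = []
        for line in self.path.read_text(encoding="utf-8").splitlines():
            record = json.loads(line)
            overlap = len(query_words & set(record["text"].lower().split()))
            if overlap:
                scored.append((overlap, record["text"]))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


# Usage: store a fact in one session, retrieve it in the next.
memory = FileMemory()
memory.remember("The staging database password rotates every 30 days.")
print(memory.recall("staging database"))
```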

Message Queue Selection and Patterns#

Every microservice architecture eventually needs asynchronous communication. Synchronous HTTP calls between services create tight coupling, cascading failures, and latency chains. Message queues decouple producers from consumers, absorb traffic spikes, and enable event-driven workflows. The hard part is picking the right one.
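
As a rough illustration of that decoupling, the sketch below stands in for a broker with Python's in-process queue.Queue: a bursty producer enqueues work without waiting on the consumer, and a slower consumer drains the backlog at its own pace. A real deployment would use an external broker, but the shape is the same.

```python
import queue
import threading
import time

# A bounded in-process queue stands in for a real broker here; the point is
# only that producer and consumer run at different rates without blocking
# each other.
jobs: "queue.Queue[str]" = queue.Queue(maxsize=100)


def producer() -> None:
    # Bursty producer: enqueues a spike of work immediately.
    for i in range(10):
        jobs.put(f"order-{i}")


def consumer() -> None:
    # Slow consumer: drains the backlog at its own pace.
    while True:
        job = jobs.get()
        time.sleep(0.1)  # simulate work
        print("processed", job)
        jobs.task_done()


threading.Thread(target=consumer, daemon=True).start()
producer()
jobs.join()  # wait until the backlog is drained
```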

Core Concepts That Apply Everywhere#

Before comparing specific systems, understand the delivery guarantees they can offer:

  • At-most-once: The message might be lost, but it is never delivered twice. Fast, no overhead, acceptable for metrics or logs where occasional loss is tolerable.
  • At-least-once: The message is guaranteed to arrive, but might arrive more than once. The consumer must handle duplicates (idempotency). This is the most common choice.
  • Exactly-once: The message arrives exactly once. This is extremely hard to achieve in distributed systems. Kafka offers it within its ecosystem via transactional producers and consumers, but end-to-end exactly-once across system boundaries requires idempotent consumers anyway.

Ordering matters too. Some systems guarantee order within a partition or queue. Others provide no ordering at all. If your consumers process messages out of order, you need to handle that in application logic or choose a system that preserves order.
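
Since at-least-once delivery is the most common choice, duplicate handling deserves a concrete shape. The sketch below assumes each message carries a unique ID and tracks processed IDs in memory; a production consumer would keep that set in a durable store (a database table or Redis set, often with a TTL) so deduplication survives restarts.

```python
from typing import Callable, Dict, Set


class IdempotentConsumer:
    """Drops redundant deliveries by tracking message IDs already processed."""

    def __init__(self, handler: Callable[[Dict], None]) -> None:
        self._handler = handler
        self._processed: Set[str] = set()  # durable store in production

    def on_message(self, message: Dict) -> None:
        message_id = message["id"]  # assumes the producer attaches a unique ID
        if message_id in self._processed:
            return  # duplicate delivery: skip, but still acknowledge upstream
        self._handler(message)
        self._processed.add(message_id)


# Usage: the same payload delivered twice is only handled once.
consumer = IdempotentConsumer(lambda m: print("charging", m["order_id"]))
consumer.on_message({"id": "msg-1", "order_id": "o-42"})
consumer.on_message({"id": "msg-1", "order_id": "o-42"})  # ignored
```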

Rate Limiting Implementation Patterns#

Rate limiting controls how many requests a client can make within a time period. It protects services from overload, ensures fair usage across clients, prevents abuse, and provides a mechanism for graceful degradation under load. Every production API needs rate limiting at some layer.

Algorithm Comparison#

Fixed Window#

The simplest algorithm. Divide time into fixed windows (e.g., 1-minute intervals) and count requests per window. When the count exceeds the limit, reject requests until the next window starts.
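
A minimal in-memory sketch of the fixed-window counter, keyed by client ID: the class and parameter names are illustrative, and a production limiter would keep these counters in a shared store such as Redis so every API instance sees the same counts.

```python
import time
from collections import defaultdict
from typing import Dict, Tuple


class FixedWindowLimiter:
    """Counts requests per client in fixed windows (e.g. 60-second buckets)."""

    def __init__(self, limit: int, window_seconds: int = 60) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # Maps client ID -> (window start timestamp, request count in window).
        self._counters: Dict[str, Tuple[int, int]] = defaultdict(lambda: (0, 0))

    def allow(self, client_id: str) -> bool:
        now = int(time.time())
        window_start = now - (now % self.window_seconds)
        start, count = self._counters[client_id]
        if start != window_start:
            # A new window has begun: reset the counter.
            start, count = window_start, 0
        if count >= self.limit:
            self._counters[client_id] = (start, count)
            return False
        self._counters[client_id] = (start, count + 1)
        return True


# Usage: reject the request once the client exceeds 100 calls in the window.
limiter = FixedWindowLimiter(limit=100, window_seconds=60)
if not limiter.allow("client-abc"):
    print("429 Too Many Requests")
```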