Cost-Per-Pass, Not Cost-Per-Call: The Right Metric for Autonomous Agent Routing

Cost-Per-Pass, Not Cost-Per-Call

Practitioners price LLMs by the per-token rate on the provider’s pricing page. For autonomous agents, that number is misleading. Two layers of indirection sit between the per-token rate and the cost you actually pay to get work done: variable prompt sizes turn per-token into per-call, and variable pass rates turn per-call into per-pass. Each layer can invert the ranking.

For autonomous fleets where failed attempts trigger reviewer cycles, retries, and reputational drag, cost-per-pass is the only metric that ranks models correctly. This article shows how to compute it, when it dominates, and where the cheapest-per-token model becomes the most expensive in production.
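
The two layers of indirection can be sketched in a few lines. This is a minimal illustration, not the article's own calculator: the model names, rates, token counts, and pass rates below are invented inputs, and the per-pass figure assumes independent retries until the first success (so the expected number of attempts is 1 / pass_rate). Reviewer time and other failure costs are left out.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    input_rate: float       # USD per million input tokens (hypothetical)
    output_rate: float      # USD per million output tokens (hypothetical)
    prompt_tokens: int      # average prompt size this model needs per call
    completion_tokens: int  # average completion size per call
    pass_rate: float        # fraction of calls whose output is accepted

    def cost_per_call(self) -> float:
        # Layer 1: per-token rate -> per-call cost, via prompt/completion sizes.
        tokens_cost = (self.prompt_tokens * self.input_rate
                       + self.completion_tokens * self.output_rate)
        return tokens_cost / 1_000_000

    def cost_per_pass(self) -> float:
        # Layer 2: per-call cost -> per-pass cost.
        # Assumes independent retries, so expected attempts = 1 / pass_rate.
        return self.cost_per_call() / self.pass_rate


# Invented example inputs, not measurements.
small = ModelProfile("small", 0.5, 1.5, 12_000, 3_000, 0.25)
large = ModelProfile("large", 3.0, 15.0, 4_000, 1_000, 0.90)

for m in (small, large):
    print(f"{m.name}: per-call ${m.cost_per_call():.4f}, "
          f"per-pass ${m.cost_per_pass():.4f}")
```

With these made-up inputs the small model is cheaper per call but more expensive per pass, which is the kind of ranking inversion the per-token rate hides.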

Temporal Namespaces and Task Queues: Organizing Workflows

Temporal Namespaces and Task Queues

Namespaces and task queues are Temporal’s two primary organizational mechanisms. Namespaces provide isolation – separate history, retention, and access. Task queues route work to specific workers. Together, they determine where workflows run and how long their history is kept.

For the underlying architecture, see Introduction to Temporal.

Namespaces

A namespace is a logical isolation boundary. Every workflow belongs to exactly one namespace. Namespaces provide history isolation (workflows cannot see across boundaries), independent retention policies, per-namespace search attributes, and scoped access control.
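
To make the isolation boundary concrete, here is a toy in-memory model. It is not the Temporal SDK or server API: ToyCluster, register_namespace, start_workflow, and poll are hypothetical names used only to show that a task queue is scoped to its namespace and that retention is set per namespace.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Namespace:
    name: str
    retention_days: int  # each namespace carries its own retention policy
    # task queue name -> pending workflow ids, private to this namespace
    queues: dict = field(default_factory=lambda: defaultdict(deque))


class ToyCluster:
    """Hypothetical stand-in for a cluster; illustrates scoping only."""

    def __init__(self):
        self.namespaces = {}

    def register_namespace(self, name, retention_days):
        self.namespaces[name] = Namespace(name, retention_days)

    def start_workflow(self, namespace, task_queue, workflow_id):
        # A workflow belongs to exactly one namespace and one task queue.
        self.namespaces[namespace].queues[task_queue].append(workflow_id)

    def poll(self, namespace, task_queue):
        # A worker only sees work queued under its own namespace.
        queue = self.namespaces[namespace].queues[task_queue]
        return queue.popleft() if queue else None


cluster = ToyCluster()
cluster.register_namespace("payments", retention_days=30)
cluster.register_namespace("analytics", retention_days=7)
cluster.start_workflow("payments", "default", "order-1")

# Same queue name, different namespace: nothing to pick up.
print(cluster.poll("analytics", "default"))
# The worker in the owning namespace receives the workflow.
print(cluster.poll("payments", "default"))
```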

Alertmanager Configuration and Routing

Routing Tree

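Alertmanager receives alerts from Prometheus and decides where to send them based on a routing tree. Every alert enters at the root route, which matches everything, and is then tested against the child routes in order. The first child that matches takes the alert, and its own children are tested the same way, so the deepest matching route handles it (a route with continue: true also lets later siblings match). If no child matches, the root route’s receiver handles it.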

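# alertmanager.yml
# Uses `matchers`, which replaces the deprecated `match` / `match_re` (Alertmanager 0.22+).
global:
  resolve_timeout: 5m
  slack_api_url: "https://hooks.slack.com/services/T00/B00/xxx"
  pagerduty_url: "https://events.pagerduty.com/v2/enqueue"

route:
  # Root route: handles every alert that no child route claims.
  receiver: "default-slack"
  group_by: ["alertname", "namespace"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  # Child routes are evaluated in order; the first match wins.
  routes:
    - matchers:
        - severity="critical"
      receiver: "pagerduty-oncall"
      group_wait: 10s
      repeat_interval: 1h
      routes:
        # Nested route: critical alerts owned by the database team.
        - matchers:
            - team="database"
          receiver: "pagerduty-dba"
    - matchers:
        - severity="warning"
      receiver: "team-slack"
      repeat_interval: 12h
    # Regex matchers are anchored: matches exactly "staging" or "dev".
    - matchers:
        - namespace=~"staging|dev"
      receiver: "dev-slack"
      repeat_interval: 24h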
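
The traversal of this routing tree can be approximated in a few lines of Python. This is a simplified sketch, not Alertmanager's implementation: the Route class and resolve function are hypothetical, continue: true is not modeled, and only equality and anchored-regex label matching are covered.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    receiver: Optional[str] = None
    equal: Dict[str, str] = field(default_factory=dict)  # label == value
    regex: Dict[str, str] = field(default_factory=dict)  # anchored regex
    routes: List["Route"] = field(default_factory=list)

    def matches(self, labels: Dict[str, str]) -> bool:
        # A missing label is treated as the empty string.
        if any(labels.get(k, "") != v for k, v in self.equal.items()):
            return False
        return all(re.fullmatch(p, labels.get(k, ""))
                   for k, p in self.regex.items())


def resolve(route: Route, labels: Dict[str, str],
            inherited: Optional[str] = None) -> Optional[str]:
    """Return the receiver of the deepest matching route."""
    receiver = route.receiver or inherited
    for child in route.routes:      # children are tried in order
        if child.matches(labels):   # first matching sibling wins
            return resolve(child, labels, receiver)
    return receiver                 # no child matched: this route handles it


# Mirror of the routing tree in the config above.
tree = Route(receiver="default-slack", routes=[
    Route(receiver="pagerduty-oncall", equal={"severity": "critical"}, routes=[
        Route(receiver="pagerduty-dba", equal={"team": "database"}),
    ]),
    Route(receiver="team-slack", equal={"severity": "warning"}),
    Route(receiver="dev-slack", regex={"namespace": "staging|dev"}),
])

print(resolve(tree, {"severity": "critical", "team": "database"}))
print(resolve(tree, {"severity": "warning", "namespace": "staging"}))
```

Note the effect of ordering: a warning from the staging namespace lands on team-slack, because the severity route precedes the namespace route.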

Three timing parameters control how often a group notifies. group_wait is how long Alertmanager waits after receiving the first alert in a new group before sending the first notification, which lets it batch related alerts together. group_interval is how long it waits before sending a notification about changes to a group that has already been notified, such as newly added alerts. repeat_interval is how long it waits before re-sending a notification for a group whose alerts are still firing and unchanged.
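
The interaction of the three timers is easier to see on a timeline. The function below is a simplified model under stated assumptions, not Alertmanager's dispatcher: notification_times is a hypothetical helper, all times are in seconds, alerts never resolve, and the group is re-evaluated every group_interval after the first notification.

```python
def notification_times(arrivals, group_wait, group_interval,
                       repeat_interval, horizon):
    """Return the times at which one alert group would notify.

    arrivals: sorted times at which new alerts join the group.
    horizon:  stop simulating after this time.
    """
    sent = []
    last_sent = None
    notified_count = 0
    t = arrivals[0] + group_wait           # first flush after group_wait
    while t <= horizon:
        current = sum(1 for a in arrivals if a <= t)
        changed = current != notified_count
        repeat_due = (last_sent is not None
                      and t - last_sent >= repeat_interval)
        if last_sent is None or changed or repeat_due:
            sent.append(t)
            last_sent = t
            notified_count = current
        t += group_interval                # later flushes every group_interval
    return sent


# Root-route values from the config: 30s, 5m, 4h; simulate 5 hours.
times = notification_times(
    arrivals=[0, 10, 400],
    group_wait=30,
    group_interval=300,
    repeat_interval=4 * 3600,
    horizon=5 * 3600,
)
print(times)  # [30, 630, 15030]
```

In this model the first two alerts are batched into one notification at 30s, the third alert triggers an update at the next group_interval boundary, and the unchanged group is repeated four hours after that update.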

API Gateway Patterns: Selection, Configuration, and Routing

API Gateway Patterns

An API gateway sits between clients and your backend services. It handles cross-cutting concerns – authentication, rate limiting, request transformation, routing – so your services do not have to. Choosing a gateway and configuring it correctly are among the first decisions in any microservices architecture.

Gateway Responsibilities

Before selecting a gateway, clarify which responsibilities it should own:

  • Routing – directing requests to the correct backend service based on path, headers, or method.
  • Authentication and authorization – validating tokens, API keys, or certificates before requests reach backends.
  • Rate limiting – protecting backends from traffic spikes and enforcing usage quotas.
  • Request/response transformation – modifying headers, rewriting paths, converting between formats.
  • Load balancing – distributing traffic across service instances.
  • Observability – emitting metrics, logs, and traces for every request that passes through.
  • TLS termination – handling HTTPS so backends can speak plain HTTP internally.

No gateway does everything equally well. The right choice depends on which of these responsibilities matter most in your environment.
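
Three of the responsibilities above (authentication, rate limiting, routing) can be sketched as a single request pipeline. This is a toy in-memory model with hypothetical names (Gateway, handle, the route table, and the API key); it opens no sockets and uses a plain per-key counter instead of a real time-windowed limiter.

```python
class Gateway:
    """Toy gateway: authenticate, rate-limit, then route by path prefix."""

    def __init__(self, routes, api_keys, max_requests_per_key):
        self.routes = routes              # path prefix -> backend name
        self.api_keys = set(api_keys)     # accepted API keys
        self.max_requests = max_requests_per_key
        self.counts = {}                  # api key -> requests seen so far

    def handle(self, path, api_key):
        """Return (status_code, backend_name_or_None)."""
        # 1. Authentication: reject unknown keys before anything else.
        if api_key not in self.api_keys:
            return 401, None
        # 2. Rate limiting: a fixed quota per key (no time window here).
        used = self.counts.get(api_key, 0)
        if used >= self.max_requests:
            return 429, None
        self.counts[api_key] = used + 1
        # 3. Routing: the longest matching path prefix wins.
        backend = self._match(path)
        if backend is None:
            return 404, None
        return 200, backend

    def _match(self, path):
        candidates = [p for p in self.routes
                      if path == p or path.startswith(p + "/")]
        if not candidates:
            return None
        return self.routes[max(candidates, key=len)]


gateway = Gateway(
    routes={"/orders": "orders-svc", "/users": "users-svc"},
    api_keys={"key-123"},
    max_requests_per_key=100,
)
print(gateway.handle("/orders/42", "key-123"))
print(gateway.handle("/orders/42", "bad-key"))
```

The ordering is a design choice: authenticating first means unauthenticated traffic never consumes quota, and in this sketch a request to an unknown path still counts against the caller's quota.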

Ingress Controllers and Routing Patterns

An Ingress resource defines HTTP routing rules – which hostnames and paths map to which backend Services. But an Ingress resource does nothing on its own. You need an Ingress controller running in the cluster to watch for Ingress resources and configure the actual reverse proxy.
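
As an illustration of what such routing rules look like, the sketch below builds a networking.k8s.io/v1 Ingress manifest as a Python dictionary and prints it as JSON. The build_ingress helper, the host, and the Service names and ports are hypothetical, and the ingress class name "nginx" is an assumption about how the controller in the cluster is registered.

```python
import json


def build_ingress(name, host, paths, ingress_class="nginx"):
    """Build an Ingress manifest.

    paths: list of (path_prefix, service_name, service_port) tuples.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Selects which Ingress controller should act on this resource.
            "ingressClassName": ingress_class,
            "rules": [{
                "host": host,
                "http": {
                    "paths": [
                        {
                            "path": path,
                            "pathType": "Prefix",
                            "backend": {
                                "service": {
                                    "name": service,
                                    "port": {"number": port},
                                }
                            },
                        }
                        for path, service, port in paths
                    ]
                },
            }],
        },
    }


# Hypothetical host and Services: /api goes to one backend, the rest to another.
manifest = build_ingress(
    name="shop",
    host="shop.example.com",
    paths=[("/api", "api-svc", 8080), ("/", "web-svc", 80)],
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be piped to kubectl apply -f - in a cluster where a controller is watching for Ingress resources.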

Ingress Controllers

The two most common controllers are nginx-ingress and Traefik.

nginx-ingress (ingress-nginx):

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-nginx --create-namespace

Note: there are two different nginx ingress projects: kubernetes/ingress-nginx (community) and nginxinc/kubernetes-ingress (NGINX Inc). The community version is far more common. Make sure you install from https://kubernetes.github.io/ingress-nginx, not the NGINX Inc chart.