---
title: "Knative: Serverless on Kubernetes"
description: "Knative Serving for autoscaling to zero, Knative Eventing for event-driven architectures, traffic splitting, and custom domain configuration on Kubernetes."
url: https://agent-zone.ai/knowledge/serverless/knative-serverless-kubernetes/
section: knowledge
date: 2026-02-22
categories: ["serverless"]
tags: ["knative","kubernetes","serverless","autoscaling","eventing","scale-to-zero","traffic-splitting"]
skills: ["knative-serving-deployment","knative-eventing-configuration","traffic-management","serverless-kubernetes"]
tools: ["kubectl","kn","knative-serving","knative-eventing","istio","kourier"]
levels: ["intermediate"]
word_count: 1244
formats:
  json: https://agent-zone.ai/knowledge/serverless/knative-serverless-kubernetes/index.json
  html: https://agent-zone.ai/knowledge/serverless/knative-serverless-kubernetes/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Knative%3A+Serverless+on+Kubernetes
---


# Knative: Serverless on Kubernetes

Knative brings serverless capabilities to any Kubernetes cluster. Unlike managed serverless platforms, you own the cluster -- Knative adds autoscaling to zero, revision-based deployments, and event-driven invocation on top of standard Kubernetes primitives. This gives you the serverless developer experience without vendor lock-in.

Knative has two independent components: **Serving** (request-driven compute that scales to zero) and **Eventing** (event routing and delivery). You can install either or both.

## Installing Knative

Knative requires a networking layer. The two primary options are Istio (full service mesh, heavier) and Kourier (lightweight, Knative-specific).

```bash
# Install Knative Serving CRDs and core
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.0/serving-core.yaml

# Install Kourier as the networking layer
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.14.0/kourier.yaml

# Configure Knative to use Kourier
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'

# Install Knative Eventing
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.0/eventing-core.yaml
```

Verify the installation:

```bash
kubectl get pods -n knative-serving
kubectl get pods -n knative-eventing
```
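If you use the optional `kn` CLI (listed alongside `kubectl` in most Knative workflows), a quick smoke test confirms your client can talk to the Serving API; this assumes `kn` is installed and your kubeconfig points at the cluster:

```bash
# Print the kn client version and the Knative API versions it supports
kn version

# An empty table on a fresh install still confirms the Serving API answers
kn service list
```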

## Knative Serving

### Services

A Knative Service is the primary resource. It manages the full lifecycle: creating a Configuration, which creates Revisions, which are backed by Kubernetes Deployments. You work with the Service; Knative handles everything underneath.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-app
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
      - image: gcr.io/my-project/hello-app:v1
        ports:
        - containerPort: 8080
        env:
        - name: LOG_LEVEL
          value: "info"
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 3
```

Apply it and Knative creates everything:

```bash
kubectl apply -f hello-app.yaml

# Check the service
kubectl get ksvc hello-app

# Output includes the URL
# NAME        URL                                    LATESTCREATED     LATESTREADY       READY
# hello-app   http://hello-app.default.example.com   hello-app-00001   hello-app-00001   True
```
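The same service can also be created imperatively with the `kn` CLI. This sketch covers only the basics (image, port, environment); the scaling annotations from the YAML above would still be set via annotations or flags:

```bash
# Create the service from the command line instead of YAML
kn service create hello-app \
  --image gcr.io/my-project/hello-app:v1 \
  --port 8080 \
  --env LOG_LEVEL=info

# Print just the service URL once the first revision is ready
kn service describe hello-app -o url
```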

### Revisions

Every change to the Service template creates a new Revision. Revisions are immutable snapshots of your configuration -- the container image, environment variables, resource limits, and scaling annotations at the time of creation.

```bash
# List revisions
kubectl get revisions

# NAME              CONFIG NAME   GENERATION   READY
# hello-app-00001   hello-app     1            True
# hello-app-00002   hello-app     2            True
```

Old revisions are not deleted automatically. They remain available for traffic splitting and rollback. To clean them up, delete them explicitly or configure a retention policy.
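For example, to remove a superseded revision by hand (the revision name here is illustrative; automatic garbage collection of old revisions can be tuned in the `config-gc` ConfigMap in `knative-serving`):

```bash
# Revisions still referenced by a route cannot be deleted
kubectl delete revision hello-app-00001

# Equivalent with the kn CLI
kn revision delete hello-app-00001
```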

### Routes and Traffic Splitting

Routes determine which revisions receive traffic and in what proportion. By default, 100% of traffic goes to the latest ready revision. You can split traffic across revisions for canary deployments or gradual rollouts.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-app
spec:
  template:
    metadata:
      name: hello-app-v2
    spec:
      containers:
      - image: gcr.io/my-project/hello-app:v2
  traffic:
  - revisionName: hello-app-v1
    percent: 80
  - revisionName: hello-app-v2
    percent: 20
```

This sends 80% of traffic to v1 and 20% to v2. Note that the revision names here assume each template was explicitly named via `spec.template.metadata.name` (as `hello-app-v2` is above); otherwise Knative generates names like `hello-app-00001`. To promote v2 fully:

```yaml
  traffic:
  - revisionName: hello-app-v2
    percent: 100
```
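The same split can be applied imperatively; the `kn` CLI's `--traffic` flag takes `revision=percent` pairs (revision names as in the YAML above):

```bash
# 80/20 canary split
kn service update hello-app \
  --traffic hello-app-v1=80 \
  --traffic hello-app-v2=20

# Promote fully; @latest targets the latest ready revision
kn service update hello-app --traffic @latest=100
```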

**Tag-based routing** gives named URLs to specific revisions for testing before routing real traffic:

```yaml
  traffic:
  - revisionName: hello-app-v2
    percent: 0
    tag: staging
  - revisionName: hello-app-v1
    percent: 100
    tag: current
```

This creates `staging-hello-app.default.example.com` pointing to v2 with zero production traffic. QA can validate the new version at that URL, then you adjust percentages to roll it out.
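One way to exercise the tagged revision before DNS for the domain exists is to hit the ingress directly and override the `Host` header; `GATEWAY_IP` below is a placeholder for your ingress gateway's external IP:

```bash
# Tag the revision without shifting any traffic
kn service update hello-app --tag hello-app-v2=staging

# Exercise the tagged URL through the ingress, bypassing DNS
curl -H "Host: staging-hello-app.default.example.com" "http://$GATEWAY_IP/"
```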

### Autoscaling

Knative uses the Knative Pod Autoscaler (KPA) by default, which supports scale-to-zero. The alternative is the Kubernetes Horizontal Pod Autoscaler (HPA), which does not scale to zero but supports CPU and memory-based scaling.

Key autoscaling annotations:

```yaml
metadata:
  annotations:
    # Autoscaler class: kpa.autoscaling.knative.dev (default) or hpa.autoscaling.knative.dev
    autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"

    # Target concurrent requests per pod (default: 100)
    autoscaling.knative.dev/target: "50"

    # Metric: concurrency (default) or rps (requests per second)
    autoscaling.knative.dev/metric: "concurrency"

    # Scale bounds
    autoscaling.knative.dev/minScale: "0"
    autoscaling.knative.dev/maxScale: "20"

    # Minimum time the last pod is kept after the autoscaler decides to scale to zero
    autoscaling.knative.dev/scale-to-zero-pod-retention-period: "60s"

    # Scale-down delay (prevents flapping)
    autoscaling.knative.dev/scale-down-delay: "30s"

    # Initial number of pods for a newly created revision (default: 1)
    autoscaling.knative.dev/initial-scale: "1"
```

**Scale-to-zero behavior:** When no requests arrive for the configured grace period, Knative scales the deployment to zero replicas. The next request hits the activator (a Knative component that buffers requests), which triggers pod creation. The first request after scale-to-zero experiences a cold start -- container pull, startup, and readiness probe passing.

To reduce cold start impact, set `minScale: "1"` for latency-critical services, or keep container images small and startup fast so new pods pass their readiness probes sooner.

### Global Autoscaling Configuration

Cluster-wide defaults are set in the `config-autoscaler` ConfigMap:

```bash
kubectl edit configmap config-autoscaler -n knative-serving
```

```yaml
data:
  container-concurrency-target-default: "100"
  enable-scale-to-zero: "true"
  scale-to-zero-grace-period: "30s"
  scale-to-zero-pod-retention-period: "0s"
  stable-window: "60s"
  panic-window-percentage: "10"
  panic-threshold-percentage: "200"
```

The **panic mode** kicks in when traffic suddenly spikes. If observed concurrency exceeds the panic threshold (2x the target by default) within the panic window, Knative scales aggressively using a shorter observation window to react faster.
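The sizing rule behind both modes is simple. As a rough illustration (not Knative source code), the autoscaler's desired pod count is the observed concurrency divided by the per-pod target, rounded up:

```bash
# ceil(observed concurrency / per-pod target), in integer arithmetic
observed=600   # concurrent requests measured across the revision
target=100     # autoscaling.knative.dev/target
desired=$(( (observed + target - 1) / target ))
echo "desired pods: $desired"   # desired pods: 6
```

Panic mode changes only the observation window this calculation runs over, not the formula itself.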

## Knative Eventing

Eventing provides infrastructure for routing events from producers to consumers. The core abstractions are Sources (where events come from), Brokers (event routing hubs), and Triggers (subscriptions that filter and deliver events).

### Brokers and Triggers

A Broker is a named event bus within a namespace. Triggers filter events from the broker and route them to subscribers.

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: my-app
  annotations:
    eventing.knative.dev/broker.class: MTChannelBasedBroker
```

Create triggers that subscribe to specific event types:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: order-processor
  namespace: my-app
spec:
  broker: default
  filter:
    attributes:
      type: com.myapp.order.created
      source: /orders/api
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: order-processor
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: notification-sender
  namespace: my-app
spec:
  broker: default
  filter:
    attributes:
      type: com.myapp.order.created
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: notification-sender
```

Both triggers fire on the same event type. The order-processor trigger additionally filters on the source attribute. Events are delivered as HTTP POST requests in CloudEvents format.
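To sanity-check delivery, you can POST a CloudEvent to the broker's ingress yourself. The URL below follows the MT-channel-based broker's usual in-cluster pattern and is an assumption; check the broker's `status.address.url` for the real one, and run the curl from a pod inside the cluster:

```bash
# Find the broker's advertised URL
kubectl get broker default -n my-app -o jsonpath='{.status.address.url}'

# Emit a CloudEvent in binary content mode (Ce-* headers carry the attributes)
curl -v "http://broker-ingress.knative-eventing.svc.cluster.local/my-app/default" \
  -H "Ce-Id: test-$(date +%s)" \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: com.myapp.order.created" \
  -H "Ce-Source: /orders/api" \
  -H "Content-Type: application/json" \
  -d '{"orderId": "1234"}'
```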

### Event Sources

Sources produce events and send them to a sink (typically a Broker or a Knative Service).

**PingSource (cron-based):**

```yaml
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: hourly-cleanup
spec:
  schedule: "0 * * * *"
  contentType: "application/json"
  data: '{"action": "cleanup", "target": "expired-sessions"}'
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
```

**ApiServerSource (Kubernetes events):**

```yaml
apiVersion: sources.knative.dev/v1
kind: ApiServerSource
metadata:
  name: pod-events
spec:
  serviceAccountName: pod-watcher
  mode: Resource
  resources:
  - apiVersion: v1
    kind: Pod
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: pod-event-handler
```
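The `pod-watcher` service account referenced above must exist and be allowed to read pods, or the source will fail to start. A minimal sketch of the required RBAC (names match the example):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-watcher
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-watcher
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-watcher
subjects:
- kind: ServiceAccount
  name: pod-watcher
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-watcher
```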

**KafkaSource (consume from Kafka topics; ships as a separate Knative Eventing extension that must be installed in addition to eventing-core):**

```yaml
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: order-events
spec:
  consumerGroup: knative-consumer
  bootstrapServers:
  - kafka-bootstrap.kafka:9092
  topics:
  - orders
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
```

## Custom Domains

By default, Knative generates URLs using `{service}.{namespace}.{domain}`. Configure your custom domain in the `config-domain` ConfigMap:

```bash
kubectl edit configmap config-domain -n knative-serving
```

```yaml
data:
  myapp.example.com: ""
```

This makes all services in the cluster available under `myapp.example.com`. For per-namespace or per-service overrides:

```yaml
data:
  myapp.example.com: |
    selector:
      app: production
```

Only services with the label `app: production` use this domain. Configure your DNS to point `*.myapp.example.com` to the Kourier or Istio ingress gateway's external IP.
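To find the address your wildcard DNS record should point at (the service name and namespace here assume the Kourier install from earlier; some cloud load balancers report a hostname rather than an IP):

```bash
# External IP of the Kourier ingress gateway
kubectl get svc kourier -n kourier-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# On providers that expose a hostname instead
kubectl get svc kourier -n kourier-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```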

For HTTPS, enable automatic TLS in the `config-network` ConfigMap (the key is `auto-tls` in older releases; newer releases rename it to `external-domain-tls`), install cert-manager with an appropriate ClusterIssuer, and install Knative's cert-manager integration. Knative will then provision TLS certificates for your services automatically.
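A minimal cert-manager ClusterIssuer sketch for Let's Encrypt over HTTP-01; the email and secret name are placeholders, and wildcard certificates would instead require a DNS-01 solver for your DNS provider:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com            # placeholder
    privateKeySecretRef:
      name: letsencrypt-account-key   # placeholder
    solvers:
    - http01:
        ingress: {}
```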

## Practical Considerations

**When Knative makes sense:** You have a Kubernetes cluster already. You want serverless scaling semantics (especially scale-to-zero) for some workloads. You need event-driven architecture with CloudEvents compatibility. You want to avoid vendor lock-in to a specific cloud's serverless platform.

**When Knative adds unnecessary complexity:** You are running a single cloud provider and Lambda or Cloud Run meets your needs. You do not need scale-to-zero. Your team does not have Kubernetes expertise. The operational overhead of running Knative (CRDs, controllers, networking layer) is not justified by the workload.

**Debugging tips:**

```bash
# Check the Knative service status
kubectl get ksvc hello-app -o yaml | grep -A 20 status:

# Check the underlying pods
kubectl get pods -l serving.knative.dev/service=hello-app

# View activator logs (useful for scale-from-zero issues)
kubectl logs -n knative-serving -l app=activator -c activator

# View autoscaler logs
kubectl logs -n knative-serving -l app=autoscaler
```

