---
title: "Pod Lifecycle and Probes: Init Containers, Hooks, and Health Checks"
description: "How Kubernetes manages pod startup, health checking, and graceful shutdown -- including init containers, probes, lifecycle hooks, and common misconfiguration pitfalls."
url: https://agent-zone.ai/knowledge/kubernetes/pod-lifecycle-and-probes/
section: knowledge
date: 2026-02-22
categories: ["kubernetes"]
tags: ["pods","probes","liveness","readiness","startup","init-containers","graceful-shutdown"]
skills: ["pod-health-configuration","graceful-shutdown-design","probe-debugging"]
tools: ["kubectl"]
levels: ["intermediate"]
word_count: 933
formats:
  json: https://agent-zone.ai/knowledge/kubernetes/pod-lifecycle-and-probes/index.json
  html: https://agent-zone.ai/knowledge/kubernetes/pod-lifecycle-and-probes/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Pod+Lifecycle+and+Probes%3A+Init+Containers%2C+Hooks%2C+and+Health+Checks
---


# Pod Lifecycle and Probes

Understanding how Kubernetes starts, monitors, and stops pods is essential for running reliable services. Misconfigurations here cause cascading failures, dropped requests, and restart loops that are difficult to diagnose.

## Pod Startup Sequence

When a pod is scheduled, the order of operations is:

1. **Init containers** run sequentially. Each must exit 0 before the next starts.
2. Regular containers then start **in parallel** -- the kubelet does not wait for one container to become ready before starting the next.
3. **postStart** hooks fire (in parallel with the container's main process).
4. **Startup probe** begins checking (if defined).
5. Once the startup probe passes, **liveness** and **readiness** probes begin.

## Init Containers

Init containers run before your application containers and are used for setup tasks: waiting for a dependency, running database migrations, cloning config from a remote source.

```yaml
spec:
  initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command: ['sh', '-c', 'until nc -z postgres-svc 5432; do echo "waiting for db"; sleep 2; done']
  - name: run-migrations
    image: web-api:2.1.0
    command: ['./migrate', '--up']
    env:
    - name: DATABASE_URL
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: url
  containers:
  - name: web-api
    image: web-api:2.1.0
```

Init containers share the pod's volumes but have their own image and resource requests. If an init container fails, the kubelet retries it according to the pod's `restartPolicy`; with `restartPolicy: Never`, the whole pod is marked failed. Init containers run to completion again every time the pod itself starts -- including pod restarts.

## The Three Probes

### Startup Probe

The startup probe protects slow-starting containers. While it is running, liveness and readiness probes are disabled. Once it succeeds, it does not run again unless the container restarts.

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 2
```

This gives the application up to 60 seconds (30 attempts x 2 seconds) to start; in general the budget is `failureThreshold` x `periodSeconds`. Use this for Java apps, apps that load large models, or anything with variable startup time. Without it, a liveness probe with a short initial delay and failure threshold will kill the container before it finishes starting.

### Liveness Probe

The liveness probe tells Kubernetes whether the container is alive. If it fails, Kubernetes **kills and restarts** the container. It answers: "Is this process deadlocked or broken beyond recovery?"

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
```

**Critical mistake: checking downstream dependencies in your liveness probe.** If your liveness probe checks whether the database is reachable, and the database goes down, Kubernetes will restart all your application pods -- making the outage worse. The liveness probe should only check whether your process is functioning, not whether its dependencies are up.

```go
// GOOD: liveness checks the process itself
func healthz(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK)
}

// BAD: liveness checks the database (db is a package-level *sql.DB) --
// causes cascading restarts when the database goes down
func healthzBad(w http.ResponseWriter, r *http.Request) {
    if err := db.Ping(); err != nil {
        w.WriteHeader(http.StatusServiceUnavailable)
        return
    }
    w.WriteHeader(http.StatusOK)
}
```

### Readiness Probe

The readiness probe controls whether the pod receives traffic from Services. If it fails, the pod is removed from Service endpoints but **not restarted**. It answers: "Can this pod handle requests right now?"

```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 2
```

This is the right place to check dependencies. If the database is down, the readiness probe should fail so traffic stops flowing to this pod. The pod stays running and can recover when the dependency comes back.

**Design pattern:** Use `/healthz` for liveness (process check only) and `/ready` for readiness (process + critical dependencies).
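
As a concrete sketch of that split in Go -- the Postgres driver and DSN here are placeholder assumptions; `db` stands in for however your application opens its database:

```go
package main

import (
	"database/sql"
	"net/http"

	_ "github.com/lib/pq" // hypothetical choice of Postgres driver
)

var db *sql.DB // opened in main before the server starts

func main() {
	var err error
	if db, err = sql.Open("postgres", "postgres://placeholder-dsn"); err != nil {
		panic(err)
	}

	// Liveness: only "can this process serve HTTP at all?"
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	// Readiness: process plus critical dependencies. A failure here pulls
	// the pod out of Service endpoints without restarting it.
	http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		if err := db.PingContext(r.Context()); err != nil {
			http.Error(w, "database unreachable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	http.ListenAndServe(":8080", nil)
}
```

Note that the `/ready` handler runs the exact check that was "BAD" in the liveness example -- the check is fine; it was only wired to the wrong probe.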

## Lifecycle Hooks

### postStart

Runs immediately after the container starts, in parallel with the main process. The container is not marked Running until postStart completes. If it fails, the container is killed.

```yaml
lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo 'started' > /tmp/started"]
```

Avoid using postStart for anything slow -- the container is not considered Running, and probes do not start, until it completes.

### preStop

Runs when Kubernetes decides to terminate the pod (scale-down, node drain, deployment rollout). This is where you implement graceful shutdown.

```yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]
```

## Graceful Shutdown

When Kubernetes terminates a pod, this happens:

1. The pod is marked Terminating, and its removal from Service endpoints begins (asynchronously).
2. The **preStop hook runs** (if defined); **SIGTERM is sent to PID 1** of each container once the hook completes.
3. Kubernetes waits up to `terminationGracePeriodSeconds` (default: 30s), counted from the start of termination -- preStop time counts against it.
4. If the process is still running, SIGKILL is sent.

The problem: endpoint removal is asynchronous. kube-proxy and ingress controllers may keep routing traffic to the pod for several seconds after termination begins. The fix is a preStop sleep:

```yaml
spec:
  terminationGracePeriodSeconds: 45
  containers:
  - name: web-api
    image: web-api:2.1.0
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 5"]
```

The 5-second sleep delays SIGTERM and keeps the application serving while endpoint removal propagates. Your application should also handle SIGTERM by refusing new connections and finishing in-flight requests. Set `terminationGracePeriodSeconds` high enough to cover the preStop delay plus your application's drain time.
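
A minimal sketch of that SIGTERM handling for a plain `net/http` server -- the 30-second drain bound is an assumption; size it to your workload and keep it below the grace period:

```go
package main

import (
	"context"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"} // handlers registered elsewhere

	// ctx is canceled when SIGTERM arrives (after the preStop hook completes).
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer stop()

	go func() {
		// ErrServerClosed is the normal result of Shutdown below.
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			panic(err)
		}
	}()

	<-ctx.Done() // SIGTERM received

	// Stop accepting new connections and wait for in-flight requests,
	// bounded below terminationGracePeriodSeconds to stay ahead of SIGKILL.
	drainCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	_ = srv.Shutdown(drainCtx)
}
```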

## Complete Example

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      terminationGracePeriodSeconds: 45
      initContainers:
      - name: wait-for-db
        image: busybox:1.36
        command: ['sh', '-c', 'until nc -z postgres-svc 5432; do sleep 2; done']
      containers:
      - name: web-api
        image: web-api:2.1.0
        ports:
        - containerPort: 8080
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 30
          periodSeconds: 2
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
          timeoutSeconds: 3
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
          timeoutSeconds: 2
          failureThreshold: 2
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 5"]
```

## Debugging Probes

```bash
# See probe failures in events
kubectl describe pod web-api-6d4f8b7c9-x2k4m

# Common event messages:
# "Liveness probe failed: HTTP probe failed with statuscode: 503"
# "Readiness probe failed: connection refused"

# Check if a pod is in a restart loop
kubectl get pods -w
# RESTARTS incrementing usually means the liveness probe is killing
# the container (or the process itself is crashing)

# Test the probe endpoint manually (requires curl in the image)
kubectl exec web-api-6d4f8b7c9-x2k4m -- curl -s localhost:8080/healthz
```

If RESTARTS keeps climbing, check whether your liveness probe is too aggressive (low timeout, low failure threshold) or is checking something it should not be (downstream dependencies).

