Security Contexts, Seccomp, and AppArmor: Container Runtime Security

Security Contexts, Seccomp, and AppArmor#

Security contexts control what a container can do at the Linux kernel level: which user it runs as, which syscalls it can make, which files it can access, and whether it can escalate privileges. These settings are your last line of defense when a container is compromised. A properly configured security context limits the blast radius of a breach by preventing an attacker from escaping the container, accessing the host, or escalating to root.
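A minimal sketch of these settings in a pod spec (the pod name and image are placeholders; every securityContext field shown is part of the standard Kubernetes API):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app            # illustrative name
spec:
  securityContext:
    runAsNonRoot: true          # refuse to start if the image would run as root
    runAsUser: 10001            # arbitrary unprivileged UID
    seccompProfile:
      type: RuntimeDefault      # block syscalls outside the runtime's default profile
  containers:
  - name: app
    image: registry.example.com/app:1.0   # placeholder image
    securityContext:
      allowPrivilegeEscalation: false     # block setuid-style escalation
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]           # drop every Linux capability
EOF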

Spot Instances and Preemptible Nodes: Running Kubernetes on Discounted Compute

Spot Instances and Preemptible Nodes#

Spot instances are unused cloud capacity sold at a steep discount – typically 60-90% off on-demand pricing. The trade-off: the cloud provider can reclaim them with minimal notice. AWS gives a 2-minute warning, GCP gives 30 seconds, and Azure varies. Running Kubernetes workloads on spot instances is one of the most effective cost reduction strategies available, but it requires architecture that tolerates sudden node loss.
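On AWS, the two-minute warning is delivered through the instance metadata service, so a node-local watcher can drain the node before it disappears. A minimal sketch of the mechanism (assumes IMDSv2, kubectl access from the node, and a NODE_NAME environment variable):

#!/usr/bin/env bash
# Poll for an AWS spot interruption notice; drain this node when one arrives.
while true; do
  TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
  # Endpoint returns 404 until a reclaim is scheduled, 200 with details afterwards.
  STATUS=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "X-aws-ec2-metadata-token: $TOKEN" \
    "http://169.254.169.254/latest/meta-data/spot/instance-action")
  if [ "$STATUS" = "200" ]; then
    kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-emptydir-data
    break
  fi
  sleep 5
done

In practice most teams run a purpose-built daemon such as aws-node-termination-handler rather than a hand-rolled loop; the sketch only shows what such a daemon does.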

SSH Hardening and Management: Key Management, Bastion Hosts, and SSH Certificates

SSH Key Management#

SSH keys replace password authentication with cryptographic key pairs. The choice of algorithm matters:

Ed25519 (recommended): Based on elliptic curve cryptography. Keys are small (256 bits), and signing and verification are faster than RSA at an equal or better security level. Supported by OpenSSH 6.5+ (2014) – virtually all modern systems.

ssh-keygen -t ed25519 -C "user@hostname"

RSA 4096 (legacy compatibility): Use only when connecting to systems that do not support Ed25519. Specify 4096 bits explicitly: the default of 3072 is adequate, but 4096 provides a safety margin for keys that may stay in service for years.
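The corresponding generation command, with the bit length stated explicitly:

ssh-keygen -t rsa -b 4096 -C "user@hostname"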

Structured Skill Definitions: Describing What Agents Can Do

Structured Skill Definitions#

When an agent has access to dozens of tools, it needs more than names and descriptions to use them well. It needs to know what inputs each tool expects, what outputs it produces, what other tools or infrastructure must be present, and how expensive or risky a call is. A structured skill definition captures all of this in a machine-readable format.

Why Not Just Use Function Signatures?#

Function signatures tell you the types of parameters. They do not tell you that a skill requires kubectl to be installed, takes 10-30 seconds to run, needs cluster-admin permissions, and might delete resources if called with the wrong flags. Agents making autonomous decisions need this information up front, not buried in documentation they may not read.
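There is no single standard schema for this. As an illustrative sketch, the kubectl skill above might be described in a YAML file whose field names are all hypothetical:

cat <<'EOF' > skills/restart-deployment.yaml
# Hypothetical schema: these field names are illustrative, not a published standard.
name: restart-deployment
description: Rolling-restart a Kubernetes Deployment.
inputs:
  - name: deployment
    type: string
    required: true
  - name: namespace
    type: string
    default: default
outputs:
  - name: status
    type: string              # "restarted" or an error message
requires:
  binaries: [kubectl]         # must be on PATH
  permissions: [cluster-admin]
cost:
  latency_seconds: [10, 30]   # expected duration range
risk: high                    # can disrupt running workloads
EOF

The point is not the particular field names but that cost, risk, and prerequisites are machine-readable, so the agent can weigh them before calling the skill.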

Synthetic Monitoring: Proactive Uptime Checks, Blackbox Exporter, and External Probing

What Synthetic Monitoring Is#

Synthetic monitoring means actively probing your services on a schedule rather than waiting for users to report problems. Instead of relying on internal health checks or real user traffic to detect issues, you send controlled requests and measure the results. The fundamental question it answers is: “Is my service reachable and responding correctly right now?”

This is distinct from real user monitoring (RUM), which observes actual user interactions. Synthetic probes run 24/7 regardless of traffic volume, so they catch outages at 3 AM when no users are active. They provide consistent, repeatable measurements that are easy to alert on. The tradeoff is that synthetic probes test a narrow, predefined path – they do not capture the full range of user experience.
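One common implementation is Prometheus's blackbox_exporter: Prometheus asks the exporter to probe a target, and the exporter performs the check and returns the result as metrics. Assuming the exporter is running locally on its default port 9115 with the stock http_2xx module, a probe can be triggered by hand:

curl 'http://localhost:9115/probe?target=https://example.com&module=http_2xx'

The output includes probe_success (1 or 0), probe_duration_seconds, and probe_ssl_earliest_cert_expiry, which Prometheus scrapes and alerts on.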

systemd Service Management: Units, Timers, Journal, and Socket Activation

Unit Types#

systemd manages the entire system through “units,” each representing a resource or service. The most common types:

  • service: Daemons and long-running processes (nginx, postgresql, your application).
  • timer: Scheduled execution, replacing cron. More flexible and better integrated with logging.
  • socket: Network sockets that trigger service activation on connection. Enables lazy startup and zero-downtime restarts.
  • target: Groups of units that represent system states (multi-user.target, graphical.target). Analogous to SysV runlevels.
  • mount: Filesystem mount points managed by systemd.
  • path: Watches filesystem paths and activates units when changes occur.

Unit files live in three locations, in order of precedence: /etc/systemd/system/ (local admin overrides), /run/systemd/system/ (runtime, non-persistent), and /usr/lib/systemd/system/ (package-installed defaults). Always put custom units in /etc/systemd/system/.
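As a sketch of the service-plus-timer pattern, here is an illustrative oneshot service with a timer that fires every 15 minutes (unit names and the script path are placeholders):

sudo tee /etc/systemd/system/backup.service > /dev/null <<'EOF'
[Unit]
Description=Periodic backup job

[Service]
Type=oneshot
# Illustrative script path
ExecStart=/usr/local/bin/backup.sh
EOF

sudo tee /etc/systemd/system/backup.timer > /dev/null <<'EOF'
[Unit]
Description=Run backup.service every 15 minutes

[Timer]
OnCalendar=*:0/15
# Run a missed execution after downtime
Persistent=true

[Install]
WantedBy=timers.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now backup.timer

systemctl list-timers then shows the next and last activation, and the job's output lands in the journal under the service's name (journalctl -u backup.service).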

Taints, Tolerations, and Node Affinity: Controlling Pod Placement

Taints, Tolerations, and Node Affinity#

Pod scheduling in Kubernetes defaults to “run anywhere there is room.” In production, that is rarely what you want. GPU workloads should land on GPU nodes. System components should not compete with application pods. Nodes being drained should stop accepting new work. Taints, tolerations, and node affinity give you control over where pods run and where they do not.

Taints: Repelling Pods from Nodes#

A taint is applied to a node and tells the scheduler “do not place pods here unless they explicitly tolerate this taint.” Taints have three parts: a key, a value, and an effect.
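A sketch of the pairing, using the GPU case from above (node, key, and image names are illustrative): the taint repels everything except pods that carry the matching toleration.

# Taint the node: only pods tolerating gpu=true:NoSchedule may land here.
kubectl taint nodes gpu-node-1 gpu=true:NoSchedule

# Matching toleration in the pod spec:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cuda-job
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: trainer
    image: registry.example.com/trainer:1.0   # placeholder image
EOF

Note that a toleration only permits scheduling onto the tainted node; it does not attract the pod there. Pairing the toleration with node affinity is what actually steers GPU workloads to GPU nodes.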

Terraform State Management Patterns

Why Remote State#

Terraform stores the mapping between your configuration and real infrastructure in a state file. By default this is a local terraform.tfstate file. That breaks the moment a second person or a CI pipeline needs to run terraform apply. Remote state solves three problems: team collaboration (everyone reads the same state), CI/CD access (pipelines need state without copying files), and disaster recovery (your laptop dying should not lose your infrastructure mapping).
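A common AWS arrangement is an S3 backend with a DynamoDB table for state locking. A sketch, with placeholder bucket and table names:

cat <<'EOF' > backend.tf
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"   # placeholder bucket name
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"           # placeholder lock table
    encrypt        = true
  }
}
EOF

# Re-initializing prompts Terraform to migrate existing local state to the backend
terraform init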

TLS Deep Dive: Certificate Chains, Handshake, Cipher Suites, and Debugging Connection Issues

The TLS Handshake#

Every HTTPS connection starts with a TLS handshake that establishes encryption parameters and verifies the server’s identity. The simplified flow for TLS 1.2:

Client                              Server
  |── ClientHello ──────────────────>|   (supported versions, cipher suites, random)
  |<────────────────── ServerHello ──|   (chosen version, cipher suite, random)
  |<────────────────── Certificate ──|   (server's certificate chain)
  |<──────────── ServerKeyExchange ──|   (key exchange parameters)
  |<────────────── ServerHelloDone ──|
  |── ClientKeyExchange ────────────>|   (client's key exchange contribution)
  |── ChangeCipherSpec ─────────────>|   (switching to encrypted communication)
  |── Finished ─────────────────────>|   (encrypted verification)
  |<───────────── ChangeCipherSpec ──|
  |<───────────────────── Finished ──|
  |<═══════ Encrypted traffic ══════>|

TLS 1.3 simplifies this significantly. The client sends its key share in the ClientHello, allowing the handshake to complete in a single round trip. TLS 1.3 also removed insecure cipher suites and compression, eliminating entire classes of vulnerabilities (BEAST, CRIME, POODLE).
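To observe a handshake and the certificate chain directly, openssl's built-in client is the standard debugging tool (example.com is a placeholder target):

# Show the full certificate chain and the negotiated protocol and cipher
openssl s_client -connect example.com:443 -servername example.com -showcerts </dev/null

# Force a specific version (here TLS 1.3; use -tls1_2 likewise) to test server support
openssl s_client -connect example.com:443 -tls1_3 </dev/null

The -servername flag sets SNI, which matters when one IP serves multiple certificates; redirecting stdin from /dev/null makes the command exit after the handshake instead of waiting for input.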

Vertical Pod Autoscaler (VPA): Right-Sizing Resource Requests Automatically

Vertical Pod Autoscaler (VPA)#

Horizontal scaling adds more pod replicas. Vertical scaling gives each pod more (or fewer) resources. VPA automates the vertical side by watching actual CPU and memory usage over time and adjusting resource requests to match reality. Without it, teams guess at resource requests during initial deployment and rarely revisit them, leading to either waste (over-provisioned) or instability (under-provisioned).

What VPA Does#

VPA monitors historical and current resource usage for pods in a target Deployment (or StatefulSet, DaemonSet, etc.) and produces recommendations for CPU and memory requests. Depending on the configured mode, it either reports these recommendations passively or actively applies them by evicting and recreating pods with updated requests.
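A sketch of a VPA object in recommendation-only mode (assumes the VPA components are installed, since they ship separately from core Kubernetes; the Deployment name is illustrative):

kubectl apply -f - <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server          # illustrative Deployment name
  updatePolicy:
    updateMode: "Off"         # report recommendations only; "Auto" evicts and resizes
EOF

# Inspect the recommendations it produces
kubectl describe vpa api-server-vpa

Starting with updateMode "Off" lets you review the recommendations against real traffic before allowing the VPA to evict pods.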