Kubernetes Service Types and DNS-Based Discovery

Kubernetes Service Types and DNS-Based Discovery#

Services are the stable networking abstraction in Kubernetes. Pods come and go, but a Service gives you a consistent DNS name and IP address that routes to the right set of pods. Choosing the wrong Service type or misunderstanding DNS discovery is behind a large percentage of connectivity failures.

Service Types#

ClusterIP (Default)#

ClusterIP creates an internal-only virtual IP. Only pods inside the cluster can reach it. This is what you want for internal communication between microservices.

Nginx Configuration Patterns for Production

Configuration File Structure#

Nginx configuration follows a hierarchical block structure. The main configuration file is typically /etc/nginx/nginx.conf, which includes files from /etc/nginx/conf.d/ or /etc/nginx/sites-enabled/.

# /etc/nginx/nginx.conf
user nginx;
worker_processes auto;           # Match CPU core count
error_log /var/log/nginx/error.log warn;
pid /run/nginx.pid;

events {
    worker_connections 1024;     # Per worker process
    multi_accept on;             # Accept multiple connections at once
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    '$request_time $upstream_response_time';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    client_max_body_size 16m;

    include /etc/nginx/conf.d/*.conf;
}

The worker_processes auto directive sets one worker per CPU core. Each worker handles worker_connections simultaneous connections, so total capacity is worker_processes * worker_connections. For most production deployments, auto with 1024 connections per worker is sufficient.

Observability Stack Troubleshooting: Diagnosing Prometheus, Alertmanager, Grafana, and Pipeline Failures

“I’m Not Seeing Metrics” – Systematic Diagnosis#

This is the most common observability complaint. Work through these steps in order to isolate where the pipeline breaks.

Step 1: Is the Target Being Scraped?#

Open the Prometheus UI at /targets. Search for the job name or target address. Look at three things: state (UP or DOWN), last scrape timestamp, and error message.

Status: UP    Last Scrape: 3s ago    Duration: 12ms    Error: (none)
Status: DOWN  Last Scrape: 15s ago   Duration: 0ms     Error: connection refused

If the target does not appear at all, Prometheus does not know about it. This means the scrape configuration (or ServiceMonitor) is not matching the target. Jump to the ServiceMonitor checklist at the end of this guide.

Scenario: Debugging Kubernetes Network Connectivity End-to-End

Scenario: Debugging Kubernetes Network Connectivity End-to-End#

The report comes in as it always does: “my application can’t reach another service.” This is one of the most common and most frustrating categories of Kubernetes issues because the networking stack has multiple layers, and the symptom (timeout, connection refused, 502) tells you almost nothing about which layer is broken.

This scenario walks through a systematic diagnostic process, starting from the symptom and narrowing down to the root cause. Follow these steps in order. Each step either identifies the problem or eliminates a layer from the investigation.

Elasticsearch and OpenSearch: Indexing, Queries, Cluster Management, and Performance

Elasticsearch and OpenSearch: Indexing, Queries, Cluster Management, and Performance#

Elasticsearch and OpenSearch are distributed search and analytics engines built on Apache Lucene. They excel at full-text search, log aggregation, metrics storage, and any workload that benefits from inverted indices. Understanding index design, mappings, query mechanics, and cluster management separates a working setup from one that collapses under production load.

Elasticsearch vs OpenSearch#

OpenSearch is the AWS-maintained fork of Elasticsearch, created after Elastic changed its license from Apache 2.0 to the Server Side Public License (SSPL) in early 2021. For the vast majority of use cases, the two are interchangeable. APIs are compatible, concepts are identical, and most tooling works with both. OpenSearch Dashboards replaces Kibana. This guide applies to both unless explicitly noted.

Load Balancer Patterns: L4 vs L7, Health Checks, Session Affinity, and Cloud LB Selection

L4 vs L7 Load Balancing#

The distinction between Layer 4 and Layer 7 load balancing determines what the load balancer can see and what routing decisions it can make.

Layer 4 (Transport) load balancers work at the TCP/UDP level. They see source/destination IPs and ports but not the content of the traffic. They forward raw TCP connections to backends. This makes them fast (no protocol parsing), protocol-agnostic (works for HTTP, gRPC, database connections, custom protocols), and transparent (the backend sees the original packets, mostly). Use L4 for database connections, raw TCP services, and when you need maximum throughput with minimum latency.

Minikube Networking: Services, Ingress, DNS, and LoadBalancer Emulation

Minikube Networking: Services, Ingress, DNS, and LoadBalancer Emulation#

Minikube networking behaves differently from cloud Kubernetes in ways that cause confusion. LoadBalancer services do not get external IPs by default, the minikube IP may or may not be directly reachable from your host depending on the driver, and ingress requires specific addon setup. Understanding these differences prevents hours of debugging connection timeouts to services that are actually running fine.

How Minikube Networking Works#

Minikube creates a single node (a VM or container depending on the driver) with its own IP address. Pods inside the cluster get IPs from an internal CIDR. Services get ClusterIPs from another internal range. The bridge between your host machine and the cluster depends entirely on which driver you use.