Kubernetes Basics
Learn Kubernetes container orchestration from the ground up. Understand pods, deployments, services, and how to manage containerized applications at scale.
What is Kubernetes?
Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform designed to automate the deployment, scaling, and management of containerized applications. Originally developed by Google based on their internal system called Borg, Kubernetes was released as open source in 2014 and is now maintained by the Cloud Native Computing Foundation (CNCF).
Kubernetes provides a framework to run distributed systems resiliently. It takes care of scaling and failover for your application, provides deployment patterns, and more. The name Kubernetes originates from Greek, meaning helmsman or pilot. K8s is an abbreviation derived from counting the eight letters between the "K" and the "s".
Why Use Kubernetes?
Containers are an excellent way to bundle and run applications, but managing containers at scale presents significant challenges. Kubernetes solves these challenges by providing a robust platform for container orchestration.
Key Benefits
- Service Discovery and Load Balancing - Kubernetes can expose containers using DNS names or IP addresses. When traffic to a container is high, Kubernetes can load balance and distribute network traffic to stabilize the deployment.
- Storage Orchestration - Kubernetes allows you to automatically mount storage systems of your choice, including local storage, public cloud providers, and network storage systems.
- Automated Rollouts and Rollbacks - You can describe the desired state for your deployed containers, and Kubernetes changes the actual state to the desired state at a controlled rate. It can automatically roll back changes if something goes wrong.
- Automatic Bin Packing - Kubernetes automatically places containers based on resource requirements and constraints, maximizing utilization without sacrificing availability.
- Self-Healing - Kubernetes restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that do not respond to health checks, and does not advertise them to clients until they are ready.
- Secret and Configuration Management - Kubernetes lets you store and manage sensitive information like passwords, OAuth tokens, and SSH keys. You can deploy and update secrets and application configuration without rebuilding your container images.
Kubernetes Architecture
A Kubernetes cluster consists of a control plane and one or more worker nodes. The control plane manages the cluster, while worker nodes run the containerized applications.
Control Plane Components
kube-apiserver
The API server is the front end of the Kubernetes control plane. It exposes the Kubernetes API, which is used by all components to communicate. The API server processes REST requests, validates them, and updates the state in etcd. It is designed for horizontal scaling and can run multiple instances for high availability.
etcd
etcd is a consistent and highly-available key-value store used as Kubernetes' backing store for all cluster data. It stores the entire state of the cluster, including nodes, pods, configs, secrets, and more. Always have a backup plan for etcd data in production environments.
kube-scheduler
The scheduler watches for newly created Pods that have no node assigned, and selects a node for them to run on. Factors taken into account for scheduling include resource requirements, hardware and software constraints, affinity and anti-affinity specifications, and data locality.
kube-controller-manager
The controller manager runs controller processes. Each controller is a separate process, but to reduce complexity, they are compiled into a single binary. Common controllers include:
- Node Controller - Monitors node health and responds when nodes go down
- Replication Controller - Maintains the correct number of pods
- Endpoints Controller - Populates endpoint objects
- Service Account Controller - Creates default accounts for new namespaces
cloud-controller-manager
The cloud controller manager runs controllers specific to cloud providers. It links your cluster into your cloud provider's API and lets you interact with cloud-specific resources like load balancers and storage volumes.
Node Components
kubelet
The kubelet is an agent that runs on each node in the cluster. It ensures that containers are running in a Pod. The kubelet receives PodSpecs and ensures that the containers described in those PodSpecs are running and healthy. It does not manage containers not created by Kubernetes.
kube-proxy
kube-proxy is a network proxy that runs on each node. It maintains network rules that allow network communication to your Pods from inside or outside of your cluster. kube-proxy uses the operating system packet filtering layer if available; otherwise, it forwards the traffic itself.
Container Runtime
The container runtime is the software responsible for running containers. Kubernetes supports several container runtimes including containerd, CRI-O, and any implementation of the Kubernetes Container Runtime Interface (CRI). Docker support was deprecated in v1.20 and removed in v1.24, but Docker-built images still work with containerd.
Installing Kubernetes
Local Development Options
Minikube
Minikube runs a single-node Kubernetes cluster on your local machine, perfect for learning and development.
# Install minikube (macOS)
brew install minikube
# Start a cluster
minikube start
# Check status
minikube status
# Access the dashboard
minikube dashboard
# Stop the cluster
minikube stop
kind (Kubernetes in Docker)
kind runs Kubernetes clusters using Docker containers as nodes. It is primarily designed for testing Kubernetes itself but works well for local development.
# Install kind
brew install kind
# Create a cluster
kind create cluster --name my-cluster
# Get cluster info
kubectl cluster-info --context kind-my-cluster
# Delete cluster
kind delete cluster --name my-cluster
Docker Desktop
Docker Desktop includes a single-node Kubernetes cluster that can be enabled in settings. This provides an easy way to run Kubernetes alongside Docker on macOS and Windows.
Cloud Managed Kubernetes
- Amazon EKS - Elastic Kubernetes Service on AWS
- Google GKE - Google Kubernetes Engine on Google Cloud
- Azure AKS - Azure Kubernetes Service on Microsoft Azure
- DigitalOcean Kubernetes - Managed Kubernetes on DigitalOcean
- Linode Kubernetes Engine - Managed Kubernetes on Linode
Kubernetes Objects
Kubernetes objects are persistent entities in the Kubernetes system. They represent the state of your cluster. Understanding these objects is fundamental to working with Kubernetes.
Pods
A Pod is the smallest deployable unit in Kubernetes. It represents a single instance of a running process in your cluster. Pods can contain one or more containers that share storage and network resources.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: web
spec:
  containers:
  - name: web
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Deployments
A Deployment provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. Deployments manage the lifecycle of pods and ensure the specified number of replicas are running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
Services
A Service is an abstraction that defines a logical set of Pods and a policy to access them. Services enable network access to a set of Pods, providing a stable endpoint regardless of individual Pod lifecycles.
Service Types
- ClusterIP - Default type. Exposes the Service on a cluster-internal IP. Only reachable within the cluster.
- NodePort - Exposes the Service on each Node's IP at a static port. Accessible from outside the cluster.
- LoadBalancer - Exposes the Service externally using a cloud provider's load balancer.
- ExternalName - Maps the Service to an external DNS name by returning a CNAME record; no proxying is set up.
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: ClusterIP
---
# NodePort example
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080
  type: NodePort
ConfigMaps
ConfigMaps allow you to decouple configuration artifacts from image content to keep containerized applications portable. ConfigMaps store non-confidential data in key-value pairs.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "db.example.com"
  DATABASE_PORT: "5432"
  LOG_LEVEL: "info"
  app.properties: |
    server.port=8080
    server.timeout=30
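A ConfigMap like the one above can be consumed as environment variables or mounted as files. A minimal sketch, assuming the app-config ConfigMap exists in the same namespace (the pod and volume names here are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-config
spec:
  containers:
  - name: app
    image: nginx:1.25
    # Inject every key in the ConfigMap as an environment variable
    envFrom:
    - configMapRef:
        name: app-config
    # Also mount the ConfigMap as files under /etc/config (one file per key)
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
```

With this spec, the container sees DATABASE_HOST and friends in its environment, and app.properties appears as /etc/config/app.properties.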
Secrets
Secrets are similar to ConfigMaps but designed to hold confidential data such as passwords, tokens, and keys. Secret values are base64-encoded, not encrypted; base64 is trivially reversible, so for production you should enable encryption at rest in etcd and restrict access with RBAC.
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4= # base64 encoded
  password: cGFzc3dvcmQxMjM= # base64 encoded
---
# Create secret from literal
# kubectl create secret generic db-secret --from-literal=username=admin --from-literal=password=secret
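The encoded values in a Secret's data field can be produced and checked locally with base64. Note the use of printf rather than echo, so no trailing newline is encoded:

```shell
# Encode values for a Secret's data field
username_b64=$(printf 'admin' | base64)
password_b64=$(printf 'password123' | base64)
echo "$username_b64"   # YWRtaW4=
echo "$password_b64"   # cGFzc3dvcmQxMjM=

# Decode to verify
printf '%s' "$username_b64" | base64 --decode   # admin
```

Alternatively, the stringData field accepts plain-text values and Kubernetes encodes them for you.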
Namespaces
Namespaces provide a mechanism for isolating groups of resources within a single cluster. Resource names must be unique within a namespace but not across namespaces. Common use cases include separating environments (dev, staging, production) or teams.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: production
Working with kubectl
kubectl is the command-line tool for interacting with Kubernetes clusters. It communicates with the Kubernetes API server to manage resources.
Basic Commands
# Get cluster information
kubectl cluster-info
# View all resources
kubectl get all
# Get specific resources
kubectl get nodes
kubectl get pods
kubectl get deployments
kubectl get services
# Get resources in all namespaces
kubectl get pods --all-namespaces
kubectl get pods -A
# Get detailed information
kubectl describe pod pod-name
kubectl describe node node-name
# Apply configuration
kubectl apply -f deployment.yaml
# Delete resources
kubectl delete -f deployment.yaml
kubectl delete pod pod-name
kubectl delete deployment deployment-name
Debugging Commands
# View pod logs
kubectl logs pod-name
kubectl logs pod-name -c container-name # specific container
kubectl logs -f pod-name # follow logs
kubectl logs --previous pod-name # previous instance
# Execute command in pod
kubectl exec -it pod-name -- /bin/bash
kubectl exec pod-name -- ls /app
# Port forwarding
kubectl port-forward pod-name 8080:80
kubectl port-forward svc/service-name 8080:80
# Copy files
kubectl cp pod-name:/path/to/file ./local-file
kubectl cp ./local-file pod-name:/path/to/file
Scaling and Updates
# Scale deployment
kubectl scale deployment nginx-deployment --replicas=5
# Update image
kubectl set image deployment/nginx-deployment nginx=nginx:1.26
# Rollout history
kubectl rollout history deployment/nginx-deployment
# Rollback
kubectl rollout undo deployment/nginx-deployment
kubectl rollout undo deployment/nginx-deployment --to-revision=2
# Restart deployment
kubectl rollout restart deployment/nginx-deployment
Ingress
An Ingress is an API object that manages external access to services in a cluster, typically HTTP/HTTPS. Ingress can provide load balancing, SSL termination, and name-based virtual hosting.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - example.com
    secretName: tls-secret
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80
Persistent Storage
Persistent Volumes (PV)
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
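For example, an administrator might statically provision a PV like the following. This sketch uses hostPath, which ties the data to one node and is suitable only for single-node test clusters; the name and path are illustrative:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  # Retain keeps the data when the claim is deleted
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: /mnt/data   # node-local path; for testing only
```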
Persistent Volume Claims (PVC)
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod in that Pods consume node resources and PVCs consume PV resources.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
---
# Using PVC in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-pvc
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: "/data"
      name: my-storage
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-pvc
Helm Package Manager
Helm is the package manager for Kubernetes. It helps you manage Kubernetes applications through Helm Charts, which are packages of pre-configured Kubernetes resources.
# Add a repository
helm repo add bitnami https://charts.bitnami.com/bitnami
# Update repositories
helm repo update
# Search for charts
helm search repo nginx
# Install a chart
helm install my-nginx bitnami/nginx
# List releases
helm list
# Upgrade a release
helm upgrade my-nginx bitnami/nginx --set replicaCount=3
# Uninstall a release
helm uninstall my-nginx
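Instead of repeating --set flags, chart configuration is usually kept in a values file. A sketch, assuming the bitnami/nginx chart exposes replicaCount and service.type (check the chart's documented values before relying on them):

```yaml
# values.yaml
replicaCount: 3
service:
  type: ClusterIP
```

The file is then passed on install or upgrade: helm upgrade my-nginx bitnami/nginx -f values.yaml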
Monitoring and Logging
Prometheus and Grafana
Prometheus is the de facto standard monitoring solution for Kubernetes, scraping metrics from pods and services. Grafana provides visualization dashboards for Prometheus metrics.
Key Metrics to Monitor
- Pod CPU and memory usage
- Pod restart count
- Node resource utilization
- API server latency
- etcd health
- Application-specific metrics
Kubernetes Best Practices
Resource Management
- Always set resource requests and limits for containers
- Use horizontal pod autoscaling (HPA) for automatic scaling
- Implement pod disruption budgets (PDB) for high availability
- Use resource quotas to prevent resource exhaustion
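As an example of the autoscaling point above, a HorizontalPodAutoscaler targeting the earlier nginx-deployment might look like this sketch (autoscaling/v2 API; the replica bounds and 70% CPU target are assumptions, and the cluster needs metrics-server for CPU metrics):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```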
Security
- Follow the principle of least privilege with RBAC
- Use network policies to control pod-to-pod traffic
- Run containers as non-root users
- Scan container images for vulnerabilities
- Use secrets for sensitive data, never hardcode in manifests
- Enable audit logging
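A NetworkPolicy implementing the pod-to-pod traffic control mentioned above might look like this sketch, which allows only pods labeled app: frontend to reach pods labeled app: nginx on port 80 (the labels are illustrative; enforcement requires a CNI plugin that supports network policies, such as Calico or Cilium):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
```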
Configuration
- Use namespaces to organize resources
- Apply labels consistently for filtering and selection
- Use ConfigMaps and Secrets for configuration
- Store manifests in version control
- Use GitOps for deployment automation
High Availability
- Run multiple replicas of critical applications
- Use pod anti-affinity to spread pods across nodes
- Implement proper health checks (liveness and readiness probes)
- Use multiple availability zones for node distribution
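The anti-affinity point above can be sketched as a fragment of a Deployment's pod template; this version prefers (rather than requires) spreading pods with label app: nginx across nodes:

```yaml
# Goes under spec.template.spec in a Deployment
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: nginx
        topologyKey: kubernetes.io/hostname
```

Using requiredDuringSchedulingIgnoredDuringExecution instead makes the spread a hard constraint, at the cost of pods staying Pending when no eligible node exists.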
Troubleshooting
Common Issues
Pod in Pending State
# Check events
kubectl describe pod pod-name
# Common causes:
# - Insufficient resources
# - Node selector/affinity not matching
# - Volume not available
Pod in CrashLoopBackOff
# Check logs
kubectl logs pod-name --previous
# Check events
kubectl describe pod pod-name
# Common causes:
# - Application error
# - Missing configuration
# - Health check failure
Service Not Accessible
# Check endpoints
kubectl get endpoints service-name
# Check service selector matches pod labels
kubectl get pods --show-labels
kubectl describe service service-name
Conclusion
Kubernetes is the industry standard for container orchestration, providing a robust platform for deploying, scaling, and managing containerized applications. While the learning curve can be steep, understanding the core concepts of pods, deployments, services, and configuration management provides a solid foundation for working with Kubernetes at any scale.