Operations

Scope

Production container operations, image management, networking, storage, security hardening, and monitoring.

Image Management

Build Optimization

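# Multi-stage build (typically a 50-90% smaller image)
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied into the runtime stage
RUN npm prune --omit=dev

FROM node:22-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
# Run as the unprivileged user that the official Node image provides
USER node
CMD ["node", "dist/main.js"]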

| Strategy | Impact | Notes |
| --- | --- | --- |
| Multi-stage builds | 50-90% size reduction | Separate build and runtime stages |
| Alpine base images | 70% smaller than Debian | May have musl libc issues |
| .dockerignore | Faster builds | Exclude node_modules, .git, etc. |
| Layer caching | 10x faster rebuilds | Order COPY commands by change frequency |
| --mount=type=cache | Persistent caches | Package manager caches across builds |
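
The .dockerignore row can be made concrete with a small helper. This is only a sketch: `write_dockerignore` is a hypothetical function name, and every entry other than node_modules and .git is an assumption about a typical Node project.

```shell
# Hypothetical helper: write a starter .dockerignore into a project directory.
# Only node_modules and .git come from the table above; the other entries
# are assumptions and should be adjusted per project.
write_dockerignore() {
  target_dir="${1:-.}"
  # Never clobber a file the project already maintains.
  if [ -e "$target_dir/.dockerignore" ]; then
    echo "$target_dir/.dockerignore already exists; leaving it untouched" >&2
    return 1
  fi
  cat > "$target_dir/.dockerignore" <<'EOF'
node_modules
.git
dist
*.log
.env
EOF
}
```

Calling `write_dockerignore .` in the project root before the first build keeps the listed paths out of the build context, so they are neither sent to the builder nor able to invalidate the `COPY . .` layer.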

Image Security

# Scan for vulnerabilities
docker scout cves myimage:latest
trivy image myimage:latest

# Sign images (tags are mutable; prefer signing the digest, myregistry.io/myimage@sha256:<digest>)
cosign sign --key cosign.key myregistry.io/myimage:latest
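
In a pipeline, the scan is usually more useful as a gate than as a report. The sketch below wraps the `trivy image` command shown above. `scan_gate` is a hypothetical name, and the severity threshold and the choice to ignore unfixed findings are assumed policy, not something this document prescribes.

```shell
# Hypothetical CI gate: fail when HIGH or CRITICAL vulnerabilities are found.
scan_gate() {
  image="$1"
  # --exit-code 1 makes trivy return non-zero when matching findings exist.
  # --ignore-unfixed skips vulnerabilities that have no released fix yet.
  if trivy image --severity HIGH,CRITICAL --ignore-unfixed --exit-code 1 "$image"; then
    echo "scan passed: $image"
  else
    echo "scan failed: $image" >&2
    return 1
  fi
}
```

A build script could call `scan_gate myimage:latest || exit 1` before pushing or signing, so an image with known serious findings never reaches the registry.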

Container Runtime

Resource Limits

# Run with resource constraints
docker run -d \
  --memory=512m --memory-swap=1g \
  --cpus=2.0 \
  --pids-limit=100 \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  myapp:latest
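
To confirm that the limits were actually applied, they can be read back from the container's HostConfig. This is a minimal sketch: `show_limits` is a hypothetical helper, and it expects a container name or ID as its argument.

```shell
# Hypothetical helper: print the limits Docker recorded for a container.
show_limits() {
  container="$1"
  # HostConfig stores memory in bytes and CPUs in billionths of a CPU.
  docker inspect \
    --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}} {{.HostConfig.PidsLimit}}' \
    "$container" |
    awk '{ printf "memory=%dMiB cpus=%.1f pids=%d\n", $1 / 1048576, $2 / 1000000000, $3 }'
}
```

For a container started with the flags above, `show_limits <container>` should report memory=512MiB, cpus=2.0, and pids=100. A value of 0 means that no limit is set.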

Health Checks

# curl must exist in the image; Alpine-based images ship BusyBox wget instead
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
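
Deployment scripts often need to block until a health check like the one above reports success. This sketch polls the status that Docker records. `wait_healthy` is a hypothetical helper, and the 60-second default timeout and 2-second poll interval are assumptions.

```shell
# Hypothetical helper: poll a container's health status until it is healthy,
# reports unhealthy, or the timeout passes.
wait_healthy() {
  container="$1"
  timeout="${2:-60}"   # seconds; the default is an assumption
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Status is one of: starting, healthy, unhealthy.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    case "$status" in
      healthy)   return 0 ;;
      unhealthy) echo "$container is unhealthy" >&2; return 1 ;;
    esac
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "timed out waiting for $container" >&2
  return 1
}
```

Usage: `wait_healthy myapp 120 || docker logs --tail 50 myapp`. A container without a HEALTHCHECK has no health status, so the helper times out rather than reporting success.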

Compose in Production

# docker-compose.prod.yml
services:
  web:
    image: myapp:${TAG}
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2.0'
          memory: 512M
      restart_policy:
        condition: on-failure
        max_attempts: 3
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
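
Because the image reference depends on `${TAG}`, a thin wrapper can guard against deploying with the variable unset. This is a sketch under assumptions: `deploy_prod` is a hypothetical name, the example tag in the error message is made up, and `--wait` requires a Compose v2 release that supports it.

```shell
# Hypothetical deploy wrapper for docker-compose.prod.yml.
deploy_prod() {
  # Refuse to deploy without an explicit tag; an unset TAG renders "myapp:".
  if [ -z "${TAG:-}" ]; then
    echo "TAG must be set, e.g. TAG=1.4.2" >&2
    return 1
  fi
  export TAG
  # Validate the file and its variable interpolation before touching containers.
  docker compose -f docker-compose.prod.yml config --quiet || return 1
  # --wait blocks until the services are running and, where defined, healthy.
  docker compose -f docker-compose.prod.yml up -d --wait
}
```

Usage: `TAG=<version> deploy_prod`. Validating with `config --quiet` first surfaces YAML and interpolation errors without changing any running service.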

Common Issues

| Issue | Diagnosis | Fix |
| --- | --- | --- |
| Container OOMKilled | docker inspect --format='{{.State.OOMKilled}}' <container> | Increase memory limit or fix leak |
| Disk space exhausted | docker system df | docker system prune -a --volumes (also deletes unused volumes and their data) |
| DNS resolution fails | docker exec -it app nslookup host | Check Docker's embedded DNS (127.0.0.11, user-defined networks only) |
| Slow builds | Layer cache invalidation | Reorder Dockerfile, use BuildKit |
| Port conflict | docker port <container> | Change host port mapping |

Monitoring

# Real-time resource usage
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"

# Container logs
docker logs --since 1h --tail 100 -f <container>

# Events stream
docker events --since '2026-04-12T00:00:00' --filter type=container
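
For a quick alerting check, the `docker stats` output can be filtered instead of watched. The sketch below lists containers above a memory threshold. `mem_over` is a hypothetical helper and the 80% default is an assumption.

```shell
# Hypothetical helper: list containers whose memory usage exceeds a
# percentage of their limit.
mem_over() {
  threshold="${1:-80}"   # percent; the default is an assumption
  # --no-stream takes one sample instead of refreshing continuously.
  docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' |
    awk -v limit="$threshold" '{
      pct = $2
      sub(/%$/, "", pct)                  # strip the trailing percent sign
      if (pct + 0 > limit + 0) print $1, $2
    }'
}
```

Running `mem_over 90` from a cron job or monitoring agent prints one line per offending container and nothing when all containers are below the threshold. For containers without a memory limit, the percentage is relative to host memory.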

Commands & Recipes

Essential CLI commands, Dockerfile patterns, and operational recipes.

Container Lifecycle

# Run a container
docker run -d --name myapp -p 8080:80 nginx:1.27

# Run with resource limits
docker run -d --name myapp \
  --cpus="2.0" --memory="512m" \
  --restart=unless-stopped \
  nginx:1.27

# Execute command inside running container
docker exec -it myapp /bin/sh

# View logs (follow + tail)
docker logs -f --tail 100 myapp

# View resource usage
docker stats myapp

# Stop gracefully: SIGTERM first, then SIGKILL if still running after 30s
docker stop -t 30 myapp
# Remove the stopped container
docker rm myapp

# Inspect container details (JSON)
docker inspect myapp | jq '.[0].NetworkSettings.IPAddress'

Image Management

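# Build with BuildKit (multi-stage, cache). BuildKit is the default builder
# since Docker Engine 23.0; DOCKER_BUILDKIT=1 is only needed on older engines.
# --target must name a stage defined in the Dockerfile.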
DOCKER_BUILDKIT=1 docker build \
  --target production \
  --cache-from myregistry/myapp:cache \
  -t myapp:latest .

# Multi-platform build
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --push \
  -t myregistry/myapp:latest .

# Scan image for vulnerabilities
docker scout cves myapp:latest

# Prune unused images
docker image prune -a --filter "until=24h"

# Export and import images (air-gapped)
docker save myapp:latest | gzip > myapp.tar.gz
docker load < myapp.tar.gz
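
When an archive crosses an air gap, a checksum helps detect corruption or tampering before `docker load` runs. This is a minimal sketch that assumes `sha256sum` (GNU coreutils or BusyBox) is available on both hosts. The helper names are hypothetical.

```shell
# Hypothetical helper: record a checksum next to the archive before transfer.
checksum_write() {
  archive="$1"
  # Work inside the archive's directory so the checksum file stores a
  # relative name and still verifies after being copied elsewhere.
  ( cd "$(dirname "$archive")" &&
    sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Hypothetical helper: verify the archive on the receiving host.
checksum_verify() {
  archive="$1"
  ( cd "$(dirname "$archive")" &&
    sha256sum -c "$(basename "$archive").sha256" > /dev/null )
}
```

Usage: run `checksum_write myapp.tar.gz` on the source host and transfer both files. On the target host, run `checksum_verify myapp.tar.gz && docker load < myapp.tar.gz`.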

Docker Compose

# compose.yaml
services:
  web:
    image: nginx:1.27
    ports:
      - "8080:80"
    volumes:
      - ./html:/usr/share/nginx/html:ro
    depends_on:
      db:
        condition: service_healthy
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 256M

  db:
    image: postgres:17
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    secrets:
      - db_password

volumes:
  pgdata:

secrets:
  db_password:
    file: ./secrets/db_password.txt

# Lifecycle
docker compose up -d
docker compose ps
docker compose logs -f web
docker compose down -v  # remove volumes too

Networking

# Create custom network
docker network create --driver bridge --subnet 10.0.0.0/24 mynet

# Connect container to network
docker network connect mynet myapp

# Inspect network
docker network inspect mynet | jq '.[0].Containers'

# DNS resolution (between containers on same network)
docker exec myapp getent hosts db  # prints the IP that db resolves to; works in images that lack ping

Cleanup

# Nuclear cleanup (removes everything unused)
docker system prune -a --volumes

# Selective cleanup
docker container prune   # stopped containers
docker image prune -a    # unused images
docker volume prune      # unused volumes
docker network prune     # unused networks

# Check disk usage
docker system df -v

Dockerfile Best Practices

# syntax=docker/dockerfile:1

# ---- Build Stage ----
FROM golang:1.25-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
RUN --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -ldflags="-s -w" -o /app/server .

# ---- Runtime Stage ----
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/server /server
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/server"]

Troubleshooting

# Debug container that won't start
docker run --rm -it --entrypoint /bin/sh myapp:latest

# Check OOM kills
docker inspect myapp | jq '.[0].State.OOMKilled'

# View filesystem changes
docker diff myapp

# Copy files out of container
docker cp myapp:/var/log/app.log ./app.log

# Monitor events
docker events --filter 'event=die' --filter 'event=oom'
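
The exit code of a dead container narrows down the cause quickly. The sketch below maps the usual codes to their conventional meanings. `explain_exit_code` is a hypothetical helper, and the mapping follows shell and Docker conventions that an application can override by choosing its own codes.

```shell
# Hypothetical helper: map a container exit code to its usual meaning.
explain_exit_code() {
  code="$1"
  case "$code" in
    0)   echo "exited normally" ;;
    125) echo "docker run itself failed" ;;
    126) echo "command found but could not be invoked" ;;
    127) echo "command not found in the image" ;;
    137) echo "killed by SIGKILL (128+9), often the OOM killer or a stop timeout" ;;
    143) echo "terminated by SIGTERM (128+15)" ;;
    *)
      # By convention, codes above 128 mean "terminated by signal (code - 128)".
      if [ "$code" -gt 128 ]; then
        echo "terminated by signal $((code - 128))"
      else
        echo "application-defined exit code $code"
      fi
      ;;
  esac
}
```

Usage: `explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' myapp)"`. A result of 137 together with `OOMKilled: true` points at the memory limit rather than a manual kill.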
