Notes & Tools

Why I avoid Docker Swarm in 2025

This is unpopular advice and that’s fine.

Docker Swarm looks easy until it isn’t. docker stack deploy reads a compose file and spreads the services across nodes, and you get rolling updates and a built-in overlay network for free. For a homelab or a three-node setup, that’s enough. Past that, the cracks show:

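  • The overlay network is VXLAN and unencrypted by default. Turning encryption on (--opt encrypted when creating the network) wraps node-to-node traffic in IPsec, which costs ~30–40% throughput on most NICs.
  • Service discovery is a virtual IP in front of IPVS by default, with DNS round-robin (endpoint_mode: dnsrr) as the alternative. When a task dies, clients that cache resolved addresses or hold long-lived connections keep talking to the dead task, whatever the DNS TTL says. Hilarity ensues.
  • Rolling updates don’t drain gracefully unless the container handles SIGTERM and exits cleanly within stop_grace_period (10 seconds by default) before Swarm follows up with SIGKILL. A lot of third-party images don’t.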
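
To make the SIGTERM point concrete, here is a minimal sketch of what “handles SIGTERM” means for a containerized worker. The names (StopFlag, install_handlers, serve) are hypothetical and not taken from any particular image; the assumption is a single-process service that can finish its current unit of work and then exit with code 0.

```python
import signal
import time


class StopFlag:
    """Plain boolean flag; setting an attribute is safe inside a signal handler."""

    def __init__(self):
        self.requested = False

    def request(self, signum=None, frame=None):
        # Only record the request; the main loop decides when it is safe to stop.
        self.requested = True


def install_handlers(flag):
    # The orchestrator sends SIGTERM first and SIGKILL after the grace period.
    # SIGINT is handled the same way so Ctrl-C behaves identically in local runs.
    signal.signal(signal.SIGTERM, flag.request)
    signal.signal(signal.SIGINT, flag.request)


def serve(flag, process_one, idle_sleep=0.1):
    """Call process_one() until a stop is requested, then return exit code 0."""
    while not flag.requested:
        process_one()           # finish the current unit of work before checking again
        time.sleep(idle_sleep)  # short sleep keeps shutdown latency low
    return 0
```

In an entrypoint this would be wired up roughly as flag = StopFlag(); install_handlers(flag); sys.exit(serve(flag, handle_request)), where handle_request is whatever the service actually does. None of this helps if the process never receives the signal: an image whose command runs in shell form sits behind /bin/sh -c, which typically does not forward SIGTERM to the child.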

The “I’ll just write a couple of compose files” plan inevitably becomes “I’m debugging IPVS in a production outage.”

Kubernetes is heavier on day one, but the day-10 picture is better: liveness, readiness, and startup probes are first-class, you get a real service mesh option when you outgrow kube-proxy, and the operator ecosystem solves most of the boring stuff (Postgres, Redis, cert issuance) for you.
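
Probes only pay off if the application exposes something worth probing. As a rough sketch of the application side, the snippet below serves separate liveness and readiness endpoints from a background thread. The paths /healthz and /readyz are a common convention rather than a requirement, and HealthState, make_handler, and start_health_server are hypothetical names for illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthState:
    """Shared state; the application flips ready once its dependencies are up."""

    def __init__(self):
        self.ready = False


def make_handler(state):
    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                # Liveness: the process is up and able to answer at all.
                self._reply(200, b"ok")
            elif self.path == "/readyz":
                # Readiness: 503 until the application says it can take traffic.
                if state.ready:
                    self._reply(200, b"ready")
                else:
                    self._reply(503, b"not ready")
            else:
                self._reply(404, b"not found")

        def _reply(self, status, body):
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the logs

    return HealthHandler


def start_health_server(state, host="127.0.0.1", port=0):
    """Start the health server on a daemon thread; port=0 picks a free port."""
    server = ThreadingHTTPServer((host, port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Inside a pod the server would need to bind 0.0.0.0 on a fixed port, since the kubelet probes the pod IP rather than loopback; the 127.0.0.1 default here is only for local testing. Keeping liveness and readiness separate is the point: a failing readiness probe takes the pod out of rotation, while a failing liveness probe gets the container restarted.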

If your scale is one host, use systemd with Restart=always. If it’s two hosts, use systemd with a load balancer in front. Past that, go straight to k3s or k0s. Don’t do the in-between thing.