Why I avoid Docker Swarm in 2025
This is unpopular advice and that’s fine.
Docker Swarm looks easy right up until it isn’t. docker stack
deploy reads a compose file and spreads the services across
nodes, and you get rolling updates and a built-in overlay network
for free. For a homelab or a three-node setup, that’s enough.
Past that, the cracks show:
- The overlay network is VXLAN. Encryption is opt-in (docker network create --opt encrypted) and runs as kernel IPsec between nodes; turning it on costs ~30–40% throughput on most NICs.
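- Service discovery is either a virtual IP per service, balanced by IPVS (the default), or DNS round-robin (--endpoint-mode dnsrr). With round-robin, when a task dies, any client or runtime that caches resolved addresses keeps sending traffic to it, whatever the TTL says. Hilarity ensues.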
- Rolling updates don’t drain gracefully unless the container honors SIGTERM with a clean exit code, which a lot of third-party images don’t.
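
To make the SIGTERM point concrete: Swarm stops a task by sending SIGTERM, waiting out the stop grace period (10 seconds by default), then sending SIGKILL. Here is a minimal Python sketch of a worker that cooperates with that. The names GracefulWorker, handle_one and main, and the job list, are made up for illustration and don’t come from any particular image.

```python
import signal
import threading


class GracefulWorker:
    """Processes jobs until asked to stop, then finishes the current job and returns."""

    def __init__(self, jobs):
        self.jobs = list(jobs)              # hypothetical work queue
        self.done = []                      # jobs that completed
        self.stopping = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Signal handlers should do as little as possible: just set a flag.
        self.stopping.set()

    def install(self):
        # signal.signal must be called from the main thread.
        signal.signal(signal.SIGTERM, self.request_stop)

    def handle_one(self, job):
        # Placeholder for real work (a request, a queue message, ...).
        self.done.append(job)

    def run(self):
        # The stop flag is only checked between jobs, so a job that has
        # started is always allowed to finish.
        while self.jobs and not self.stopping.is_set():
            self.handle_one(self.jobs.pop(0))
        return 0  # exit code for a clean shutdown


def main():
    worker = GracefulWorker(range(1000))
    worker.install()
    return worker.run()
```

An entrypoint would call sys.exit(main()). None of this helps unless the process actually receives the signal: it has to be PID 1 or sit behind an init that forwards signals, and a shell-form entrypoint usually doesn’t.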
The “I’ll just write a couple of compose files” plan inevitably becomes “I’m debugging IPVS in a production outage.”
Kubernetes is heavier on day one, but the day-10 picture is better: liveness, readiness, and startup probes are first-class, you get a real service mesh option when you outgrow kube-proxy, and the operator ecosystem solves most of the boring stuff (Postgres, Redis, cert issuance) for you.
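
For a sense of what the application side of those probes looks like, here is a rough sketch using only the Python standard library. The /healthz and /readyz paths, port 8080, and the AppState class are my assumptions; Kubernetes doesn’t mandate any of them. An httpGet probe just needs a status code of at least 200 and below 400 to count as success.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading


class AppState:
    """Hypothetical application state that the probes report on."""

    def __init__(self):
        self.ready = threading.Event()  # set once warm-up is finished


def make_handler(state):
    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                # Liveness: the process is up and able to answer at all.
                self._reply(200, b"ok")
            elif self.path == "/readyz":
                # Readiness: accept traffic only once dependencies are up.
                if state.ready.is_set():
                    self._reply(200, b"ready")
                else:
                    self._reply(503, b"not ready")
            else:
                self._reply(404, b"not found")

        def _reply(self, code, body):
            self.send_response(code)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep probe traffic out of the logs

    return ProbeHandler


def serve(state, port=8080):
    # Blocks forever; run it in a thread next to the real application.
    server = HTTPServer(("0.0.0.0", port), make_handler(state))
    server.serve_forever()
```

The split matters: a failing readiness probe takes the pod out of the Service endpoints without restarting it, while a failing liveness probe gets the container restarted.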
If your scale is one host, use systemd + Restart=always. If it’s
two hosts, use systemd + a load-balancer in front. Past that, go
straight to k3s or k0s. Don’t do the in-between thing.