Why I avoid Docker Swarm in 2025

March 4, 2025

This is unpopular advice and that’s fine.

Everything in Docker Swarm looks easy until it isn’t. Run docker stack deploy and it reads a compose file and spreads the services across nodes, and you get rolling updates and a built-in overlay network for free. For a homelab or a three-node setup, that’s enough. Past that, the cracks show:

The overlay network is VXLAN, and the opt-in encryption (--opt encrypted) wraps it in IPsec tunnels between nodes; turning it on costs ~30–40% throughput on most NICs.
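Service discovery resolves a service name to a virtual IP by default, with IPVS balancing behind it; DNS round-robin across task IPs is the opt-in alternative (--endpoint-mode dnsrr). In dnsrr mode, when a task dies, any client that caches lookups (a language runtime, a connection pool) keeps hitting the dead address no matter what TTL the cluster DNS returns. Hilarity ensues.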
Rolling updates don’t drain gracefully unless the container handles SIGTERM and exits cleanly before the stop grace period (10 seconds by default) runs out and SIGKILL arrives, which a lot of third-party images don’t.
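
For the SIGTERM point, here is a minimal sketch of what “honoring SIGTERM” looks like from inside the container, assuming a Python service with a simple polling loop. The names shutdown_requested, install_handlers, and serve are made up for illustration, and the cleanup step is only a placeholder comment.

```python
import signal
import threading

# Set when the orchestrator asks the process to stop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the stop request; the main loop does the actual cleanup."""
    shutdown_requested.set()


def install_handlers():
    # SIGTERM is sent first when a task is stopped; SIGKILL follows
    # once the stop grace period (10 seconds by default) has elapsed.
    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(poll_interval=0.1, do_work=None):
    """Run until SIGTERM arrives, then return 0 so the exit code is clean."""
    install_handlers()
    while not shutdown_requested.is_set():
        if do_work is not None:
            do_work()  # hypothetical unit of work, supplied by the caller
        # Sleep, but wake up immediately if the event gets set.
        shutdown_requested.wait(poll_interval)
    # Placeholder: finish in-flight requests and close connections here.
    return 0
```

An entrypoint would call raise SystemExit(serve()) so the process exits 0 after cleanup. None of this helps if the signal never reaches the process: a shell-form CMD runs the service under /bin/sh -c, which does not forward SIGTERM, so the exec form is the safer choice.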

The “I’ll just write a couple of compose files” plan inevitably becomes “I’m debugging IPVS in a production outage.”

Kubernetes is heavier on day one, but the day-10 picture is better: liveness/readiness/startup probes are first-class, you get a real service mesh option when you outgrow kube-proxy, and the operator ecosystem solves most of the boring stuff (postgres, redis, cert issuance) for you.
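
Probes only help if the application exposes something to probe. A rough sketch of the application side, assuming plain HTTP probes: the /healthz and /readyz paths are a common convention rather than anything Kubernetes requires, and ready, ProbeHandler, and start_probe_server are hypothetical names.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once startup work (migrations, cache warmup, ...) has finished.
ready = threading.Event()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            # Liveness: the process is up and able to answer at all.
            self._reply(200, b"ok")
        elif self.path == "/readyz":
            # Readiness: report 503 until startup has completed.
            if ready.is_set():
                self._reply(200, b"ready")
            else:
                self._reply(503, b"starting")
        else:
            self._reply(404, b"not found")

    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the logs


def start_probe_server(host="127.0.0.1", port=0):
    """Serve probes on a background thread; port 0 picks a free port.

    Inside a pod the host would need to be "0.0.0.0" so the kubelet
    can reach it; loopback is the default here only for local testing.
    """
    server = HTTPServer((host, port), ProbeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The pod spec would then point its livenessProbe at /healthz and its readinessProbe at /readyz, and the application calls ready.set() when it can take traffic. Keeping the two separate is the point: a slow start gets the pod held out of rotation rather than restarted.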

If your scale is one host, use systemd + Restart=always. If it’s two hosts, use systemd + a load-balancer in front. Past that, go straight to k3s or k0s. Don’t do the in-between thing.