# Traefik + Docker Swarm DNS Resolution Issue

## To Reproduce

- Install Dokploy on a single-node Docker Swarm setup.
- Deploy an application as a Docker Swarm service (e.g. a frontend app listening on port 3000).
- Expose the application through Traefik using the service name as the backend (the default Dokploy behavior): `http://<service-name>:3000`
- Access the application via a domain routed through Traefik.
- Observe intermittent timeouts / 404 errors.
- Inside the Traefik container, resolve the service name: `getent ahostsv4 <service-name>`
- Notice that Docker DNS returns multiple IPs, including a stale VIP.
- Traefik randomly selects one of the returned IPs and may hit the non-routable VIP, causing timeouts.
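For context, the upstream wiring described above corresponds to a Traefik dynamic (file provider) configuration along these lines. This is a hedged sketch, not Dokploy's actual generated file; the router/service names, entry point, and domain are illustrative:

```yaml
# Sketch of a service-name upstream in Traefik's file provider format.
# Names and domain are placeholders, not Dokploy's real output.
http:
  routers:
    myapp-router:
      rule: Host(`app.example.com`)
      entryPoints:
        - websecure
      service: myapp-service
  services:
    myapp-service:
      loadBalancer:
        servers:
          # Default behavior: the Swarm service name as the upstream host.
          # Docker DNS may resolve this to the service VIP as well as task IPs.
          - url: http://<service-name>:3000
```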
## Current vs. Expected Behavior

### Expected Behavior

Traefik should consistently route traffic to a healthy backend container when using a Docker Swarm service as an upstream.

### Current Behavior

Traefik intermittently times out because Docker DNS resolves the service name to:

- a valid task IP (working)
- a stale service VIP (not routable in this setup)

Traefik may select the VIP, resulting in connection timeouts.
## Environment Information

- **Operating System:** Debian GNU/Linux 13 (trixie)
- **Kernel:** 6.17.2-2-pve
- **Architecture:** x86_64 / amd64
- **Docker Engine:** Docker Engine – Community 28.5.0 (API version 1.51, containerd v2.2.1, runc 1.3.4)
- **Docker mode:** Docker Swarm active; single-node cluster (managers: 1, nodes: 1, node address: 10.202.20.128)
- **Dokploy:** image `dokploy/dokploy:latest`, running as a Docker Swarm service (single replica on the same node)
- **Traefik:** image `traefik:v3.6.1` (version 3.6.1, codename ramequin, linux/amd64), deployed and managed by Dokploy
- **Deployment type:** applications are deployed on the same server where Dokploy is installed
- **Application type:** frontend application built with Nixpacks, served by the Caddy web server, listening internally on port 3000, deployed as a Docker Swarm service
## Affected Area(s)

### Deployment Location

Applications are deployed on the same server where Dokploy is installed.
## Technical Investigation (Commands & Findings)

- The application is healthy inside the container:

  ```shell
  docker exec -it <task-container> curl http://127.0.0.1:3000
  # HTTP/1.1 200 OK
  ```

- The application is reachable via the task IP:

  ```shell
  docker exec -it <task-container> curl http://<task-ip>:3000
  # HTTP/1.1 200 OK
  ```

- Traefik can reach the task IP directly:

  ```shell
  docker exec -it dokploy-traefik wget http://<task-ip>:3000
  # HTTP/1.1 200 OK
  ```

- The service name resolves to multiple IPs. Example output:

  ```shell
  docker exec -it dokploy-traefik getent ahostsv4 <service-name>
  # 10.0.1.43    service VIP (stale / not routable)
  # 10.0.1.62    active task IP
  ```

- The VIP does NOT belong to any container:

  ```shell
  docker network inspect dokploy-network | grep 10.0.1.43
  # no output
  ```

- Enabling DNSRR does not remove the VIP from DNS; it still returns both IPs:

  ```shell
  docker service update --endpoint-mode dnsrr <service-name>
  ```

- Using `tasks.<service-name>` resolves only real task IPs:

  ```shell
  docker exec -it dokploy-traefik nslookup tasks.<service-name> 127.0.0.11
  ```

- Traefik successfully connects using the tasks DNS name:

  ```shell
  docker exec -it dokploy-traefik wget http://tasks.<service-name>:3000
  # HTTP/1.1 200 OK
  ```
## Root Cause

Docker Swarm DNS resolves a service name to both:

- the service VIP
- the task IPs

In single-node Swarm setups (especially with published ports), the VIP may be non-functional. Traefik may randomly select this VIP, leading to intermittent routing failures.
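The failure mode can be illustrated with a toy shell simulation. This models the behavior (a client picking one of two returned A records per request) rather than reproducing Traefik's actual dialing code; the addresses are the ones observed above:

```shell
# Toy model: Docker DNS returns two A records for the service name,
# and a client that picks one per request intermittently hits the
# stale VIP. Not Traefik's real code; a sketch of the failure mode.
STALE_VIP="10.0.1.43"   # service VIP (not routable here)
TASK_IP="10.0.1.62"     # live task IP

seed=42
pick_one() {
  # Tiny linear congruential generator, so the sketch stays POSIX-sh
  # portable; sets $picked to one of the two resolved addresses.
  seed=$(( (seed * 1103515245 + 12345) % 2147483648 ))
  if [ $(( (seed / 65536) % 2 )) -eq 0 ]; then
    picked="$STALE_VIP"
  else
    picked="$TASK_IP"
  fi
}

hits=0
i=0
while [ "$i" -lt 1000 ]; do
  pick_one
  if [ "$picked" = "$STALE_VIP" ]; then
    hits=$((hits + 1))
  fi
  i=$((i + 1))
done
echo "$hits of 1000 simulated requests hit the stale VIP"
```

Roughly half of the simulated requests land on the non-routable VIP, which matches the intermittent (rather than total) outage seen in practice.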
## Workaround / Solution

Configure Traefik backends to use the task-specific DNS name instead of the service name: `http://tasks.<service-name>:<port>`

This guarantees that Traefik routes traffic only to active task containers and avoids stale VIPs entirely.
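Applied to a Traefik dynamic (file provider) configuration, the workaround amounts to changing the upstream URL. This is a sketch under the same caveats as above; the service name is a placeholder:

```yaml
# Sketch of the workaround in Traefik's file provider format.
http:
  services:
    myapp-service:
      loadBalancer:
        servers:
          # tasks.<service-name> resolves only to live task IPs,
          # never to the (possibly stale) service VIP.
          - url: http://tasks.<service-name>:3000
```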
## Affected Components

- Traefik
- Docker Swarm service discovery
- Dokploy auto-generated Traefik configuration

## Additional Context

This issue is reproducible on a clean single-node Swarm installation and disappears immediately when switching Traefik upstreams from `<service-name>` to `tasks.<service-name>`.

## Will You Send a PR to Fix It?

No