Docker Swarm Network – Down the Rabbit Hole
Estimated reading time: 5 mins
Last week we tracked down a recurring problem in our Docker Swarm, more precisely in the Docker overlay network. To anticipate the outcome: there is a pull request which might fix it, but not for Docker-CE 18.03. The pull request is not included in Docker-CE 18.06.1 either, but it has already been merged into Moby and is part of Docker-CE 18.09.0-ce-tp5, which means that the fix should be available with Docker-CE 18.09.
Description of the problem
If you try to start a container, or if you have a Docker Swarm which starts containers for you, you might see that containers cannot start on specific hosts. If you take a look into the log files, you will find error lines reporting that a file already exists.
This means that a VXLAN network interface for a new container which would like to join an overlay network already exists.
The next sentences are not deeply scientific; they are more a summary of several pieces of information and our own experience. As I understand it, the startup sequence of a container (driven by dockerd) which uses an overlay network is as follows:
1) Create a VXLAN interface which uses the VXLAN id of the associated Docker network (docker network create --driver=overlay ...) - at this point the VXLAN interface is visible on the host (ip -d link show)
2) Then dockerd moves the VXLAN interface into the namespace of the container - at this point the VXLAN interface is no longer visible on the host
3) When the container stops, the device is given back to the host
4) The device is deleted by dockerd
Between 3) and 4) a race condition can occur, in which case the network device is not deleted.
The important hint to find out more was given by the user gitbensons on GitHub - kudos to him! He pointed out that it is possible to find the already existing VXLAN device by running strace against the dockerd process. Run the strace command just before starting an affected container.
THIS COMMAND IS DANGEROUS! IF YOU RUN IT FOR TOO LONG, YOU WILL PROBABLY KILL THE DOCKERD!!!
In the output of the previous command you can see that the affected device has the name vx-00106c-clblt. The last five characters of the device name, in this example clblt, specify the (short) id of the affected overlay network. Log in to a Docker manager, run docker network ls | grep clblt, and you will find the name of the affected overlay network.
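The lookup key can also be pulled out of the device name mechanically. The following is a minimal sketch, not part of the original investigation: parse_vx_name is a hypothetical helper, and it assumes that the middle field of the name is the VXLAN id in hexadecimal (only the meaning of the last five characters is stated above).

```shell
# Hypothetical helper: split a device name such as vx-00106c-clblt.
# Assumption: the middle field is the VXLAN id in hex; the last field is
# the short id of the overlay network.
parse_vx_name() {
    name=$1
    rest=${name#vx-}     # strip the "vx-" prefix  -> 00106c-clblt
    hex=${rest%%-*}      # everything before "-"   -> 00106c
    net=${rest#*-}       # everything after "-"    -> clblt
    # printf converts the hex constant to decimal.
    printf '%d %s\n' "0x$hex" "$net"
}

parse_vx_name vx-00106c-clblt   # prints: 4204 clblt
```

The second field of the output is what you would pass to grep in docker network ls | grep clblt on a manager.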
At this point we know which VXLAN device is still there but shouldn’t be. In the next step, list all vx-* devices on the affected host, for example by filtering the output of ip -d link show for names starting with vx-.
Oops. Now we have a problem. All of these devices are dead (state DOWN) but were not deleted! This means that on this Docker host it will not be possible to start containers which would like to join one of the affected overlay networks (look at the ids).
After finding the problematic device, you can delete it, for example with ip link delete vx-00100f-drzik. It is probably good practice to delete all such devices and to monitor your hosts for them, as their presence is an indicator that something has happened which will prevent further containers from starting on the affected networks.
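For the monitoring part, a small filter over the one-line output of ip could be enough. This is only a sketch under assumptions: stale_vx_devices is a hypothetical name, and the filter relies on ip -o link show printing the device name as the second field and the text "state DOWN" on the same line.

```shell
# Hypothetical helper: print the names of vx-* devices in state DOWN.
# Reads `ip -o link show` output on stdin, so it needs no root itself.
stale_vx_devices() {
    awk '$2 ~ /^vx-/ && / state DOWN / {
        sub(/:$/, "", $2)    # drop the trailing colon after the name
        sub(/@.*/, "", $2)   # drop a possible "@parent" suffix
        print $2
    }'
}
```

On a Docker host it could be used as ip -o link show | stale_vx_devices, for instance from a cron job or a monitoring check. A hit is a hint rather than proof: according to the sequence described earlier, a VXLAN device is briefly visible on the host before it is moved into the container namespace, so it seems safer to alert on devices that stay around than to delete them automatically.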
From the Urban Dictionary: Rabbit Hole: Metaphor for the conceptual path which is thought to lead to the true nature of reality. Infinitesimally deep and complex, venturing too far down is probably not that great of an idea.
It is hard to accept that the error message does not say which file already exists. I know the cause lies in the Go code: if you only print err, you will not get any information about which file already exists. Naming the interface that already exists would be nice, and deleting it automatically on container start would be even nicer :-) But I won’t dig deeper, as there is already a merge … don’t forget, Rabbit Holes are dangerous ;-)