Learned today: Why DNS failed in Docker swarm containers

Our hosts are on a 10.x.x.x network, and the DNS server's IP is 10.0.0.2. When we added an overlay network in Docker swarm without setting the subnet explicitly, Docker assigned the overlapping 10.0.0.0/24 subnet to the overlay network. When a container tried to resolve a name, it could not reach the DNS server at 10.0.0.2, because packets for that address were routed to the overlay network.

We fixed this by explicitly defining the overlay networks' subnets in a different address range.
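
To make the conflict concrete, here is a minimal Go sketch that checks whether an address falls inside a subnet. It assumes the overlay network ended up with 10.0.0.0/24; the replacement range 172.28.0.0/24 and the helper name shadows are made up for illustration and are not what we actually used.

```go
package main

import (
	"fmt"
	"net/netip"
)

// shadows reports whether addr falls inside subnet, i.e. whether a
// network with that subnet would capture traffic meant for addr.
// (Hypothetical helper name, for illustration only.)
func shadows(subnet, addr string) (bool, error) {
	p, err := netip.ParsePrefix(subnet)
	if err != nil {
		return false, err
	}
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	return p.Contains(a), nil
}

func main() {
	const dns = "10.0.0.2" // DNS server address from the post

	// First entry: assumed auto-assigned overlay subnet.
	// Second entry: a hypothetical explicitly chosen replacement.
	for _, subnet := range []string{"10.0.0.0/24", "172.28.0.0/24"} {
		hit, err := shadows(subnet, dns)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s contains %s: %v\n", subnet, dns, hit)
	}
}
```

A check like this could run before creating an overlay network, to catch a subnet that would swallow the address of a host-side service such as DNS.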

It was this GitHub issue that shed light on the problem.

Learned today: Go templates

I built a library for loading configuration files that are actually Go templates. It adds some custom template functions for including values from other configuration files and for loading secrets from files mounted by Docker secrets.

If it turns out to be usable, maybe I can open-source it some day, but for now it is proprietary.

Workaround for gitlab-runner issue #2408 “Cannot connect to the Docker daemon”

TL;DR: Change your pull policy to “if-not-present” or “never”.

Our Continuous Deployment pipeline at Wysiwyg worked fine, until it stopped working. It started giving errors like this:

Pulling docker image registry.gitlab.com/wysiwygoy/dev/cd-deploy ...
ERROR: Preparation failed: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

A bit of googling led me to gitlab-runner issue #2408. I tried some of the suggestions in the thread but couldn’t get any of them to work.

Finally I found a workaround. Because the job fails while pulling my custom image from the gitlab.com registry, I changed the pull policy of the runner (in the runner's config.toml) to “if-not-present”. The executor then skips pulling the image when it is already present locally and executes its actual job just fine.
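
For reference, the relevant part of config.toml looks roughly like this. It is a minimal fragment assuming the Docker executor; a real runner section has more keys (name, url, token, image and so on), which are omitted here.

```toml
[[runners]]
  executor = "docker"
  [runners.docker]
    # Only pull the image if it is not already present on the host.
    pull_policy = "if-not-present"
```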

I suspect that the gitlab.com registry responds too slowly and the Docker client library gives up with the error mentioned above.

The downside, of course, is that if I update my executor image, I need to pull the new image manually. In practice that doesn’t happen very often, so I can live with it until the fix to gitlab-runner is in.

I posted my finding as a comment to the issue.