On-Premise Gitlab with autoscale Docker machine Microsoft Azure runners

Sojourner

This post describes the experience we gained last week while expanding our on-premise Gitlab installation with autoscaling Docker machine runners in the Microsoft Azure cloud.

Preface

We have been running a fairly large Gitlab installation for our development colleagues for about five years. Since March of this year we have also been using it for CI/CD/CD (continuous integration/continuous delivery/continuous deployment) with Docker. Since our developers started to love the flexibility and power of Gitlab in combination with Docker, the number of build jobs keeps rising. After approximately six months we have run nearly 7000 pipelines and nearly 15000 jobs.

Some of the pipelines run quite long, for example Maven builds or multi-image Docker builds (microservices). As a result we are running out of local on-premise Gitlab runners. To be fair, this would not be a huge problem because we have a really large VMware environment on site, but we wanted to test the Gitlab autoscaling feature in a real-world, real-life environment.

Our company is a Microsoft enterprise customer, so we are able to test these things in an environment that is a little different from the usual setup.

Cloud differences

As mentioned before, we have a more sophisticated on-premise cloud integration. Currently we have a site-to-site connection to Microsoft, so we can use the Microsoft Azure cloud as if it were an offsite office reachable over the WAN (wide area network).

Gitlab autoscale runner configuration

At first we simply followed the instructions from the Gitlab documentation. The documentation is good enough, especially in combination with the corresponding docker+machine documentation for the Azure driver.

Important: Persist the docker+machine data! Otherwise, every time the Docker Gitlab runner container restarts, the information about the created cloud virtual machines will be lost!

Gitlab runner configuration
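The original configuration file is not reproduced here, but a minimal sketch of the relevant part of config.toml, assuming the docker+machine executor with the Azure driver, could look like this (URL, token, subscription ID, network names and VM size are placeholders, not our production values):

    concurrent = 10

    [[runners]]
      name = "azure-autoscale-runner"
      url = "https://gitlab.example.com/"
      token = "RUNNER_REGISTRATION_TOKEN"
      executor = "docker+machine"
      limit = 10
      [runners.docker]
        image = "docker:latest"
        privileged = true
      [runners.machine]
        IdleCount = 1
        IdleTime = 1800
        MachineDriver = "azure"
        MachineName = "gitlab-runner-%s"
        MachineOptions = [
          "azure-subscription-id=00000000-0000-0000-0000-000000000000",
          "azure-location=westeurope",
          "azure-resource-group=gitlab-runners",
          "azure-vnet=our-vnet",
          "azure-subnet=our-subnet",
          "azure-size=Standard_D2_v2"
        ]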

Forward, we have to go back!

At this point, the troubles were starting.

Gitlab autoscale runner Microsoft Azure authentication

We decided to run the Gitlab autoscaling runner with the official Gitlab runner Docker image, as we do with all our other runners. After the basic Gitlab runner configuration (as provided in the documentation), the docker+machine driver will try to connect to the Microsoft Azure API and start authenticating the given subscription. Because we run the Gitlab runner as a Docker container, you have to check the container logs to see the Microsoft login instructions.

Important: The Microsoft Azure cloud login information is stored inside a hidden folder in the /root directory of the Gitlab runner container! You should really persist the /root folder, or you will have to authenticate every time you start the runner!

Important: Use the MACHINE_STORAGE_PATH environment variable to control where docker-machine stores the virtual machine inventory! Otherwise you will lose it every time the container restarts!
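As an illustration, assuming the runner is started with a plain docker run (host paths and container name are just examples), both hints above boil down to something like this:

    # Persist the runner config, the Azure login data in /root and the
    # docker-machine inventory across container restarts.
    docker run -d --name gitlab-runner-azure --restart always \
      -e MACHINE_STORAGE_PATH=/root/.docker/machine \
      -v /srv/gitlab-runner/config:/etc/gitlab-runner \
      -v /srv/gitlab-runner/root:/root \
      gitlab/gitlab-runner:latest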

If you miss this step, you will have to log in again. You will notice the problem when you look at the log messages of the container: every five seconds you will see a new login attempt. Each login attempt is only valid for a short period, and every time you would have to enter a new login code online. You can work around this deadlock by opening an interactive shell into the Docker Gitlab autoscale runner and running the docker-machine ls command. This command waits until you have entered the provided code online.
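The workaround looks roughly like this (the container name is just an example):

    # Open an interactive shell in the autoscale runner container
    docker exec -it gitlab-runner-azure bash
    # Trigger the interactive Azure device login; the command blocks until
    # the code printed on the console has been entered online
    docker-machine ls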

Docker Gitlab autoscale swarm configuration

Important: This is a swarm compose file!
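The file itself is not included in this text, but a stripped-down sketch of such a stack file, assuming the persistence hints from above, could look like this (image version, host paths and placement constraint are placeholders):

    version: "3.3"

    services:
      runner-azure:
        image: gitlab/gitlab-runner:latest
        environment:
          - MACHINE_STORAGE_PATH=/root/.docker/machine
        volumes:
          - /srv/gitlab-runner/config:/etc/gitlab-runner
          - /srv/gitlab-runner/root:/root
        deploy:
          replicas: 1
          placement:
            constraints:
              - node.hostname == runner-node-01

Such a file would be deployed with docker stack deploy rather than docker-compose up, which is why the deploy section is honored.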

Microsoft Azure cloud routing table is deleted

Because we connect to the Microsoft Azure cloud from within our WAN, special routing tables are configured in the Microsoft Azure cloud so that the network traffic finds its way to the cloud and back to our site. After we started the first virtual machine we noticed that we could not connect to it; there was no chance to establish an SSH connection. After some research we found out that the routing table had been deleted. After some more tests and drilling down into the docker+machine source code we were able to pin the problem down to one Microsoft Azure API function. To get rid of the problem we wrote a small patch.

The pull request is available on Github, where you can also find all the details. If you like, please vote the patch up.

Microsoft Azure cloud storage accounts are not deleted

After we sailed around the first problem, we immediately hit the next one. The Gitlab autoscale runner does what it is meant to do and creates and deletes virtual machines, but the storage accounts are left over. As a result, your Microsoft Azure resource group gets cluttered with orphaned storage accounts.

Yes, you guessed it: I wrote the next patch and submitted a pull request on Github. You can read all the details there. If you like, please vote the patch up there :). Thank you.

Conclusion

After a lot of hacking, we are proud to have a fully working Gitlab autoscale runner configuration with Microsoft Azure on-premise integration up and running. Regarding the issues we found, we have also contacted our Microsoft enterprise cloud consultant to let Microsoft know that there might be a bug in the Microsoft Azure API.

Blog picture

The blog picture of this post shows the Sojourner rover from the NASA Pathfinder mission. In my opinion an appropriate picture, because you never know what you will face in uncharted territory.

Docker Endeavor – Episode 3 – Orbit

Challenger Orbit

General

It’s been two months since the last Docker Endeavor post, but we weren’t lazy! On the contrary, we built a lot of new stuff, changed a lot of things and of course learned a lot too! In between I passed my master exam, so the last two months were really busy. Besides this, Bernhard and I met Niclas Mietz, a fellow of our old colleague Peter Rossbach from Bee42. We met Niclas because we booked a Gitlab CI/CD workshop in Munich (in June) – and, funnily enough, Bernhard and I were the only ones attending this workshop! So we had a really good time with Niclas, because we had the chance to ask him everything we wanted to know specifically for our needs!
Thanks to Bee42 and the DevOps Gathering for mentioning us on Twitter – what a motivation to go on with this blog!
Also, one of the fathers of containers, Kir Kolyshkin, whom we met back in 2009, is now working as a developer at Docker. We are very proud to know him!

Review from the last episode

In the last episode we talked about our ingress-controller, the border-controller and the docker-controller. For now we have dropped the docker-controller and the ingress-controller because they added too much complexity, and we managed to get everything up and running with a border-controller in conjunction with externally created swarm networks and Docker Swarm internal DNS lookups.

Gitlab CI/CD/CD

Yes, we are going further! Our current production environment is still powered by our workhorse OpenVZ, but we are now also providing a handful of Docker Swarm services in production, development and staging. To get both CI (continuous integration) and CD/CD (continuous delivery / continuous deployment) up and running, we decided to support three basic strategies.

  • First, we use Gitlab to create deployment setups for our own department, DevOps. We have just transitioned our Apache Tomcat setup to an automatic Docker image build powered by Gitlab. Based on this we created a transition repository where a developer can place his or her .war package. This file then gets bundled with our Tomcat Docker image, which is built beforehand, pushed to our private Docker registry and finally deployed to the Docker Swarm (a simplified pipeline sketch follows after this list). KISS – keep it simple & stupid.

  • Second, the developers of our development department use Gitlab, including the Gitlab runners, to build full CI pipelines with Sonar, QF-GUI tests, Maven and so on.

  • Third, we have projects which combine both the CI and the CD/CD mechanisms, for production as well as testing/staging environments.
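To give an idea of the first strategy, a heavily simplified .gitlab-ci.yml for such a transition repository could look like the following sketch (registry, image and service names are placeholders, and it assumes the runner can talk to a Swarm manager):

    stages:
      - build
      - deploy

    # Bundle the .war with the pre-built Tomcat image and push it
    build-image:
      stage: build
      image: docker:latest
      script:
        - docker build -t registry.example.com/devops/app-tomcat:$CI_COMMIT_SHA .
        - docker push registry.example.com/devops/app-tomcat:$CI_COMMIT_SHA

    # Roll the new image out to the Docker Swarm service
    deploy-swarm:
      stage: deploy
      image: docker:latest
      script:
        - docker service update --image registry.example.com/devops/app-tomcat:$CI_COMMIT_SHA app_tomcat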

Gitlab-CI-CD-Overview

Update of the border-controller

Our border-controller now only uses the Docker Swarm internal DNS service to configure the backends. We do not use the docker-controller anymore, so that project of ours is deprecated. Furthermore, in the latest development version of the border-controller I have included the possibility to send the border-controller's IP address to a PowerDNS server (via its API). Thanks to our colleague Ilia Bakulin from Russia, who is now part of my team! He did a lot of research and helped us get this service up and running. We will need it in the future for dynamic DNS changes. If you are interested in this project, just have a look at our Github project site or directly use our border-controller Docker image from DockerHub. Please be patient, we are DevOps, not developers. 🙂
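For reference, registering an A record via the PowerDNS API is essentially a single HTTP call; a rough sketch (zone, record name, address, port and API key are placeholders, not our actual setup) might look like this:

    # Replace (or create) the A record pointing at the border-controller
    curl -X PATCH \
      -H "X-API-Key: changeme" \
      -H "Content-Type: application/json" \
      -d '{"rrsets": [{"name": "lb.example.com.", "type": "A", "ttl": 60,
            "changetype": "REPLACE",
            "records": [{"content": "10.1.2.3", "disabled": false}]}]}' \
      http://powerdns.example.com:8081/api/v1/servers/localhost/zones/example.com.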

Currently we are not using Traefik for the border-controller, for two main reasons.

  • First, our Nginx-based border-controller does not need to run on a Docker Swarm manager node, because we are not using the Docker socket interface with it. Instead we use the built-in Docker Swarm DNS service discovery to get the IP addresses for the backend configuration (see the Nginx sketch after this list). This also means that we don't need to mount the Docker socket into the border-controller.

  • Second, in the latest version the border-controller is able to use the PowerDNS API to automatically register the load balancer's IP address and DNS name in the PowerDNS system. That is important from the users' point of view, because they normally access services with a domain name in the browser.
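The Swarm DNS approach mentioned in the first point can be illustrated with a minimal Nginx snippet. This is only a sketch under the assumption that the border-controller and the backend service ("myapp", a placeholder name) share an overlay network; 127.0.0.11 is Docker's embedded DNS server, and tasks.<service> resolves to the individual task IPs:

    server {
        listen 80;

        # Re-resolve the service name regularly instead of caching it forever
        resolver 127.0.0.11 valid=10s;
        set $backend "tasks.myapp";

        location / {
            proxy_pass http://$backend:8080;
        }
    }

Using a variable in proxy_pass forces Nginx to resolve the name at request time, so newly scaled tasks are picked up without a reload.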

Border-Controller-Overview

Current Docker Swarm state

Currently we run approximately 155 containers.

Summary

In this post we talked about CI/CD/CD pipelines and strategies with Gitlab and our own border-controller based on Nginx. In addition, we gave you some information about what we did over the last two months.

Orbit

The blog headline picture shows the Space Shuttle Challenger in orbit during the STS-7 mission (22 June 1983, NASA photo STS07-32-1702).
