Build a Docker Swarm on AWS with Ansible in 1 minute and 47 seconds
Estimated reading time: 8 mins
Is it possible to build a five-node (3 manager nodes, 2 worker nodes) Docker Swarm in under 2 minutes? Yes, it is! Some weeks ago, Henning Jacobs, who works at Zalando Technology, posted a tweet referencing an article he wrote called “Why Kubernetes?”. This article covers another post, “Maybe You Don’t Need Kubernetes”, written by Matthias Endler, who works at Trivago. There are always pros and cons for every solution, but I missed Docker Swarm in his article. And there was another thing that triggered my brain. He wrote that “[…] creating a cluster on DigitalOcean takes less than 4 minutes and is reasonably cheap ($30/month for 3 small nodes with 2 GiB and 1 CPU each)”. In addition, he wrote that at Zalando they “[…] run 100+ Kubernetes clusters […]”.
Therefore I asked myself how long it would take to set up a Docker Swarm cluster with 3 managers and 2 workers on AWS by myself, and furthermore, whether it would be possible to start 101 (100+) Docker Swarm clusters too. Short answer: yes, it is 😎! But let's start with the idea. And as a side note, “3 small nodes” are not a production setup for Kubernetes, whereas 3 manager nodes and 2 worker nodes are a production setup for Docker Swarm.
After some testing it was clear to me that I would use a script to run multiple Ansible playbooks in parallel. I ended up with the following Bash script:
This script uses a variable $1, which is provided by a wrapper script (to create n Docker Swarms) that is a simple counter loop. First of all, I need the mm-node, the master manager node. The master manager node is the node where the docker swarm init command is issued after the EC2 instances are created (see line 20). To make things easier, the script waits on line 17 until all of the preceding commands have finished. The & at the end of a line indicates that the line runs in the background, so these lines run in parallel. In line 20 the join tokens for manager and worker joins are created; they are then used, from line 23 to line 28, to register the nodes as manager or worker, again in parallel.
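The structure described above (start the instance playbooks in the background, wait, initialise the swarm, then join the remaining nodes in the background) can be sketched roughly as follows. This is not the original script: the playbook file names, the node labels other than mm, the extra variables and the SWARM_DRY_RUN switch are all assumptions made for illustration. With the dry run enabled (the default here), the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: playbook names, node labels and SWARM_DRY_RUN are hypothetical.

swarm_id="${1:-1}"   # cluster number, passed in by the wrapper loop as $1

# Run one playbook for one node of this swarm.
# In dry-run mode (the default) the command is printed instead of executed.
run_playbook() {
  local playbook="$1" node="$2"
  if [ "${SWARM_DRY_RUN:-1}" = "1" ]; then
    echo "ansible-playbook ${playbook} -e swarm_id=${swarm_id} -e node=${node}"
  else
    ansible-playbook "${playbook}" -e "swarm_id=${swarm_id}" -e "node=${node}"
  fi
}

build_swarm() {
  local node

  # Step 1: create all five EC2 instances in parallel (note the trailing &).
  for node in mm m2 m3 w1 w2; do
    run_playbook create-instance.yml "${node}" &
  done
  wait   # block until every background job has finished

  # Step 2: run "docker swarm init" on the master manager and store the join tokens.
  run_playbook swarm-init.yml mm

  # Step 3: join the remaining managers and the workers, again in parallel.
  for node in m2 m3; do
    run_playbook join-manager.yml "${node}" &
  done
  for node in w1 w2; do
    run_playbook join-worker.yml "${node}" &
  done
  wait
}

build_swarm
```

The wait between step 1 and step 2 matters: the swarm can only be initialised once the instance for the master manager node exists, so the sequential step sits between two parallel phases.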
To get this up and running smoothly and quickly, I had to use some tricks 🤩😁 - they might be useful to others out there!
Trick #1: “dynamic” inventory
The EC2 instances use dynamic IP addresses. Therefore the script and the Ansible playbooks cannot rely on static Ansible inventories! There are dynamic inventory scripts for Ansible and AWS (and many other providers) out there, and they are officially supported, but they are often not that fast. Thankfully, there is ec2_instance_facts for AWS to filter (find) instances that meet certain requirements. If instances are found, we add them to an in-memory Ansible inventory. Look at the Ansible playbook below, lines 10-23.
Trick #2: Name your instances
The second trick is to tag the instances you create with names that are dynamic but predictable. With this Ansible playbook we can create not just one Docker Swarm cluster, but hundreds if we like. Look at the Ansible playbook below, line 14.
Trick #3: Save Ansible data (variables) to a local file
This is huge! You can save Ansible output to a local file and afterwards load the data from this file to use it in another playbook. Look at the Ansible playbook below, lines 40-44. If you are clever, you can create very smart playbooks. In this case, I save the Docker join tokens to files that are named after the Docker Swarm cluster that is currently being created. Therefore, you can use this information during the parallel creation of Docker Swarms!
Trick #4: Load Ansible data (variables) from a local file
It is easy (once you have found out how) to load Ansible data from previously saved local files. Look at the Ansible playbook below, lines 30-38.
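The playbook does the saving and loading with Ansible itself. To illustrate only the underlying idea, one token file per swarm and role so that parallel runs never overwrite each other, here is a minimal shell sketch. The directory, the file naming scheme and the token values are hypothetical. In a real run the token would come from something like docker swarm join-token -q worker on the master manager node.

```shell
#!/usr/bin/env bash
# Sketch only: directory layout, file names and token values are hypothetical.

token_dir="$(mktemp -d)"   # scratch directory for the token files

# Build the file name for a given swarm number ($1) and role ($2).
token_file() {
  echo "${token_dir}/swarm-$1-$2.token"
}

# Save a join token: $1 = swarm number, $2 = role, $3 = token.
save_token() {
  printf '%s\n' "$3" > "$(token_file "$1" "$2")"
}

# Load the join token back: $1 = swarm number, $2 = role.
load_token() {
  cat "$(token_file "$1" "$2")"
}

# Two swarms being built at the same time keep their tokens in separate files.
save_token 7 worker "SWMTKN-example-worker-7"
save_token 8 worker "SWMTKN-example-worker-8"
load_token 7 worker
```

Because the swarm number is part of the file name, a join step for swarm 7 can never pick up the token of swarm 8, no matter in which order the parallel runs finish.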
Trick #5: Use Ansible built-in functions
Ansible comes with a lot of handy functions. In this example, I use split to extract the number of the Docker Swarm this Ansible playbook is running for, in order to load the correct Docker Swarm join command. See line 32 below.
Here is the video of this run, which took 1 minute and 47 seconds.
Create 100+ Docker Swarms
AWS raised the limit of EC2 nano instances from 28 (default) to 550 - the only thing you have to do for this is to open a support ticket. The next problem is that the Bash fork mechanism really exhausts the resources of our Ansible host - which is to be expected, as copying the Python processes is expensive. Just as a test, I gave the host 32 cores and 64 GB of memory (VMware). With this configuration I was able to start the creation of 101 Docker Swarms, but then I got locked out by the AWS API - “Too many requests” 😂 Maybe in the future I will try to create 10 or 20 Docker Swarms at a time so as not to reach this limit.
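Creating the swarms in groups of 10 or 20 could look roughly like the wrapper loop below. This is an untested idea, not what I ran: build_one_swarm is a hypothetical placeholder for the per-cluster script, and the batch size that stays below the AWS API rate limit is a guess.

```shell
#!/usr/bin/env bash
# Sketch only: build_one_swarm is a placeholder, the batch size is a guess.

# Stand-in for the per-cluster script; a real run would call it with "$1".
build_one_swarm() {
  echo "building swarm $1"
}

# Build $1 swarms, but never more than $2 of them at the same time.
build_in_batches() {
  local total="$1" batch_size="$2" i
  for ((i = 1; i <= total; i++)); do
    build_one_swarm "$i" &
    # After every full batch, wait for it to finish before forking more.
    if (( i % batch_size == 0 )); then
      wait
      echo "batch done after swarm $i"
    fi
  done
  wait   # catch the last, possibly partial batch
}

build_in_batches 5 2
```

Besides keeping the number of simultaneous AWS API calls lower, this also limits how many processes are forked on the Ansible host at once.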
Ansible in combination with AWS and Docker Swarm is pretty awesome! It was a lot of fun to optimize the playbook run to get this up and running in parallel. I will upload the playbooks to a GitLab repository in the coming weeks. If you need them earlier, let me know!