Writing a Docker Volume Plugin for CephFS
Estimated reading time: 7 mins
We are currently evaluating Ceph as persistent volume storage for our on-premise Docker/Kubernetes cluster. Kubernetes officially supports Ceph RBD and CephFS as storage volume drivers. Docker currently does not offer a Docker Volume plugin for CephFS.
But there are some plugins available online. A Google search turns up a handful of plugins that support the CephFS protocol, but the results are quite old (> 2 years) and outdated, or they pull in too many dependencies, such as direct Ceph cluster communication.
This blog post will be a little longer, as it is necessary to provide some basic facts about Ceph and because there are some odd pitfalls during the plugin creation. Without the great Docker Volume Plugin for SSHFS written by Victor Vieux, it would not have been possible for me to understand the structure of a Docker Volume Plugin! Thank you for your work!
Source code of the Docker Volume Plugin for CephFS can be found here.
Basically, Ceph is a storage platform that provides three types of storage:
RBD (RADOS Block Device),
CephFS (shared file system) and
object storage (S3-compatible protocol).
Besides this, Ceph offers several APIs to operate the Ceph storage remotely. Usually RBD and CephFS are mounted by installing the Ceph client on your Linux machine via APT, YUM or whatever is available. This client-side software installs a Linux kernel module which can be used with a classic mount command like mount -t ceph .... Alternatively, FUSE can be used. Using the client-side bindings can be tricky when the Ceph cluster (e.g. the Mimic release) and the Ceph client (e.g. Luminous) run different versions. Someone may then create an RBD device with a newer feature set than the client supports, which can result in a file system that cannot be mounted.
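To make the mount -t ceph call a little more concrete, here is a minimal Go sketch of how a plugin might assemble the arguments for a kernel-client CephFS mount. The function name, the monitor addresses, the user name and the secret file path are assumptions for illustration only, and the sketch prints the command instead of executing it.

```go
package main

import (
	"fmt"
	"strings"
)

// cephMountArgs builds the argument list for a kernel-client CephFS mount:
//   mount -t ceph <mon1>,<mon2>:<path> <target> -o name=<user>,secretfile=<file>
// The function and its parameters are hypothetical, not taken from the plugin.
func cephMountArgs(monitors []string, path, target, user, secretFile string) []string {
	source := strings.Join(monitors, ",") + ":" + path
	options := "name=" + user + ",secretfile=" + secretFile
	return []string{"-t", "ceph", source, target, "-o", options}
}

func main() {
	args := cephMountArgs(
		[]string{"192.0.2.10:6789", "192.0.2.11:6789"}, // example monitor addresses
		"/volumes/web", "/mnt/volumes/web", "docker", "/etc/ceph/docker.secret",
	)
	// Only print the command; a real plugin would hand args to os/exec.
	fmt.Println("mount " + strings.Join(args, " "))
}
```

Keeping the argument construction in a small pure function makes it easy to test without a Ceph cluster at hand.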
RBD devices are meant to be mounted exclusively by exactly one end system, such as a container. That makes sense, as you would never share a physical device between two end systems either.
RBD block devices therefore cannot be shared between multiple containers. Most of the RBD volume plugins are able to create such a device when a volume is created, if it does not exist yet. This means that the plugin must be able to communicate with the Ceph cluster, either via the Ceph client software installed on the server or by using one of the Ceph API libraries.
CephFS is a shared filesystem which is backed by the Ceph cluster and which can be shared between multiple end systems like any other shared file system you may know. It has some nice features like file system paths which can be authorised separately.
The Kubernetes Persistent Volume documentation contains a matrix about the different file systems and which modes (ReadWriteOnce, ReadOnlyMany, ReadWriteMany) they support.
Docker Volume Plugin Anatomy
Thanks to the great work of Victor Vieux I was able to get used to the anatomy of a Docker Volume plugin, as the official Docker documentation is a little bit, uhm, short. I am not a programmer. The docker go-plugins-helpers repository on GitHub in particular contains a lot of useful material, and all in all I was able to copy, paste and adapt the plugin within a day.
The api.go file of the plugin helper contains the interface description that a plugin needs to implement.
Some words about the interface:
Get and List are used to retrieve the information about a volume and to list the volumes powered by the volume plugin when someone executes docker volume ls.
Create creates the volume with the volume plugin, but it does not call the mount command at this time. The volume is only created and nothing more.
Mount is called when a container that uses the created volume starts.
Path is used to track the mount paths for the container.
Unmount is called when the container stops.
Remove is called when the deletion of the volume is requested.
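Capabilities reports what the volume driver supports, for example whether its volumes have local or global scope. Requirements of the plugin itself, such as host networking (net=host) if the plugin needs network communication, are declared in the plugin configuration.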
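The lifecycle described above can be sketched with a small in-memory driver. This is not the helper's real interface: the types and signatures below are simplified local stand-ins (the real methods take request structs and return response structs), Get, Path and Capabilities are left out for brevity, and nothing is actually mounted.

```go
package main

import (
	"errors"
	"fmt"
	"path"
)

// volume is a simplified stand-in for the helper's volume type.
type volume struct {
	Mountpoint string
	refs       int // containers currently using the volume
}

// memDriver keeps all state in memory and never touches the file system.
type memDriver struct {
	root    string
	volumes map[string]*volume
}

func newMemDriver(root string) *memDriver {
	return &memDriver{root: root, volumes: map[string]*volume{}}
}

func (d *memDriver) lookup(name string) (*volume, error) {
	v, ok := d.volumes[name]
	if !ok {
		return nil, errors.New("no such volume: " + name)
	}
	return v, nil
}

// Create only registers the volume; nothing is mounted yet.
func (d *memDriver) Create(name string) error {
	if _, ok := d.volumes[name]; ok {
		return errors.New("volume already exists: " + name)
	}
	d.volumes[name] = &volume{Mountpoint: path.Join(d.root, name)}
	return nil
}

// List returns the names of all known volumes (in no particular order).
func (d *memDriver) List() []string {
	names := make([]string, 0, len(d.volumes))
	for name := range d.volumes {
		names = append(names, name)
	}
	return names
}

// Mount is called when a container that uses the volume starts.
// A real plugin would run the actual mount on the first reference.
func (d *memDriver) Mount(name string) (string, error) {
	v, err := d.lookup(name)
	if err != nil {
		return "", err
	}
	v.refs++
	return v.Mountpoint, nil
}

// Unmount is called when the container stops.
// A real plugin would unmount once the last reference is gone.
func (d *memDriver) Unmount(name string) error {
	v, err := d.lookup(name)
	if err == nil && v.refs > 0 {
		v.refs--
	}
	return err
}

// Remove deletes the volume unless a container still uses it.
func (d *memDriver) Remove(name string) error {
	v, err := d.lookup(name)
	if err != nil {
		return err
	}
	if v.refs > 0 {
		return errors.New("volume in use: " + name)
	}
	delete(d.volumes, name)
	return nil
}

func main() {
	d := newMemDriver("/mnt/volumes")
	_ = d.Create("data")
	mountpoint, _ := d.Mount("data")
	fmt.Println(d.List(), mountpoint)
	_ = d.Unmount("data")
	_ = d.Remove("data")
	fmt.Println(d.List())
}
```

The reference counter is a design choice of this sketch: it lets several containers share one volume and keeps Remove from deleting a volume that is still in use.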
Besides this, every plugin contains a config.json file which describes the configuration (and capabilities) of the plugin.
The plugin itself must use a special file structure, called
How to write the plugin
OK, I admit, I just copied the Docker Volume SSHFS plugin :-) and after that I did the following (besides learning the structure):
1) I changed the config.json of the plugin and removed all the things that my plugin does not need.
2) I changed the functions mentioned above to reflect the needs of my plugin.
3) I packed everything together, tested it and uploaded it.
Points 1) and 2) are just programming and configuring. Point 3) is more interesting, because that is where the pitfalls are, and these pitfalls are described in the following sections.
Pitfall 1 Vendors
The first thing I did during the development was to refresh the vendors. This was also my first problem, as it was then not possible to get the plugin up and running. There is a little bug in the api.go of the helper: CreatedAt cannot be JSON encoded if it is empty. There is already a GitHub PR for it, which simply adds the needed annotations to the struct. You can use the PR, or you can add the needed annotations to the struct yourself.
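As an illustration of such an annotation, the following sketch uses a local copy of a volume struct and assumes that the annotation in question is the omitempty JSON tag. The field set is modelled on the helper's struct but is an assumption here, and encodeVolume is a hypothetical helper that exists only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Volume is a local copy for illustration; the field set is an assumption.
// With omitempty, empty fields are left out of the JSON document entirely.
type Volume struct {
	Name       string
	Mountpoint string                 `json:",omitempty"`
	CreatedAt  string                 `json:",omitempty"`
	Status     map[string]interface{} `json:",omitempty"`
}

// encodeVolume returns the JSON form of v (hypothetical helper).
func encodeVolume(v Volume) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// CreatedAt is empty here, so the key does not appear in the output.
	fmt.Println(encodeVolume(Volume{Name: "data"}))
	// A non-empty CreatedAt is encoded as usual.
	fmt.Println(encodeVolume(Volume{Name: "data", CreatedAt: "2019-01-01T00:00:00Z"}))
}
```

The first call prints {"Name":"data"}, so the receiver never sees an empty CreatedAt value that it would have to parse.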
Pitfall 2 Make
The SSHFS Docker Volume plugin is great! Make your life easier and use the provided Makefile! You can create the plugin rootfs with it (make rootfs) and you can easily create the plugin with it.
Pitfall 3 Push
After I had done all the work, I uploaded the source code to GitLab and created a pipeline to push the resulting Docker container image to Docker Hub so everyone can use it. But this won’t work. After fiddling around for an hour, I had the eye-opener: the docker plugin command has a separate push function. So you have to use docker plugin push to push a Docker plugin to Docker Hub!
Be aware: the Docker Hub repository must not exist before your first push! If you create the repository manually or push a container image into it, it will be flagged as a container repository and you can never ever push a plugin to it! The error message will be denied: requested access to the resource is denied.
To be able to push the plugin, it must be installed (at least created) in your local Docker engine. Otherwise you cannot push it!
Pitfall 4 Wrong Docker image
Make sure you use the correct Docker image when you are writing a plugin. If you build your binary on Ubuntu, you might not be able to run it inside your final Docker Volume Plugin container because the image you use is based on Alpine (or the other way around). Ubuntu links against glibc, while Alpine ships musl.
Pitfall 5 Unresolved dependencies
Be sure to include all your dependencies in your Docker image build process. For example: if you need the gluster-client, you will have to install it in your Dockerfile so that the dependencies are in place when the Docker Volume Plugin image is loaded by the container engine.
Pitfall 6 Linux capabilities
Inside the Docker Plugin configuration, you have to specify all Linux capabilities your plugin needs. If you miss a capability, the plugin will not behave the way you expect.
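One way to guard against a forgotten capability is a small check that reads the plugin's config.json and compares the declared capabilities with the ones you know you need. The sketch below is only an illustration: sampleConfig is a hypothetical excerpt, missingCapabilities is a made-up helper, and the list of required capabilities is an assumption that depends on your plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pluginConfig models only the part of config.json that is inspected here.
type pluginConfig struct {
	Linux struct {
		Capabilities []string `json:"capabilities"`
	} `json:"linux"`
}

// missingCapabilities returns the entries of required that the config
// does not declare (hypothetical helper).
func missingCapabilities(configJSON []byte, required []string) ([]string, error) {
	var cfg pluginConfig
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return nil, err
	}
	declared := make(map[string]bool, len(cfg.Linux.Capabilities))
	for _, c := range cfg.Linux.Capabilities {
		declared[c] = true
	}
	var missing []string
	for _, c := range required {
		if !declared[c] {
			missing = append(missing, c)
		}
	}
	return missing, nil
}

// sampleConfig is a hypothetical excerpt of a plugin config.json.
const sampleConfig = `{
  "network": {"type": "host"},
  "linux": {
    "capabilities": ["CAP_SYS_ADMIN"],
    "devices": [{"path": "/dev/fuse"}]
  }
}`

func main() {
	required := []string{"CAP_SYS_ADMIN", "CAP_NET_ADMIN"} // assumed requirements
	missing, err := missingCapabilities([]byte(sampleConfig), required)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("missing capabilities:", missing)
}
```

Such a check could run in the build pipeline before the plugin is created, so a missing entry shows up early rather than as odd behaviour at mount time.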
A word about debugging a Docker Volume Plugin. Besides the information you get from the Docker site (debug via the Docker socket), I found it helpful to simply run the resulting plugin image as a normal container via docker run. This lets you test whether the image includes everything the plugin will need later. If you go this way, you have to use the correct docker run options with all the capabilities, devices and the privileged flag. Yes, Docker Volume Plugins run privileged! Here is an example command:
docker run -ti --rm --privileged --net=host --cap-add SYS_ADMIN --device /dev/fuse myrootfsimage bash
After this, test whether all features are working.
That’s all! If you have questions, just contact me via the various channels. -M