Bare Metal Kubernetes Series - Part 4: Setting up Storage

For our cluster to work at its full capability, we need some form of distributed storage. Distributed storage simply means that multiple machines and drives take care of storing the data.

Implementing "no single point of failure" in this context means that even if a node and all its drives go down, the cluster stays fully functional.


On the surface, Ceph seems pretty complex, and the documentation throws a lot of jargon at you without providing the fundamental context first.

At its core, Ceph offers three interfaces through which you can send and receive data: block devices, object storage, and a shared file system.

Object storage daemon (OSD)

For every device / partition allocated to the cluster, Ceph will spawn an OSD.

ceph-osd is the object storage daemon for the Ceph distributed file system. It is responsible for storing objects on a local file system and providing access to them over the network.
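As a rough sketch, assuming a running Ceph cluster and a spare disk at /dev/sdb (both placeholders for your own setup), turning that device into an OSD and inspecting the result could look like this:

```shell
# Turn the raw device /dev/sdb into an OSD (run on the storage node).
sudo ceph-volume lvm create --data /dev/sdb

# List all OSDs and the hosts they live on.
ceph osd tree

# Show overall capacity and per-OSD usage.
ceph osd df
```

After this, `ceph osd tree` should show one OSD per device you have handed over to the cluster.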

Block devices

When Ceph talks about a block they mean a "block of bytes". You can define the number of bytes per block when creating a block device. Leave it at default until you encounter performance issues.
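A minimal sketch of creating such a block device (RBD image), with the pool and image names being placeholders: the `--object-size` flag sets the size of the chunks Ceph stripes the image into (4 MiB is the default).

```shell
# Create a pool for block images and initialize it for RBD.
ceph osd pool create rbd-pool
rbd pool init rbd-pool

# Create a 10 GiB image; leave --object-size at its default
# unless you hit performance issues.
rbd create rbd-pool/my-image --size 10G --object-size 4M
```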

With "device" they mean device as in a hard disk, something you can find under /dev/ on Linux. On Windows it would be something like your hard disk mounted at C:\ or your USB drive at D:\.

From the viewpoint of the OS, a "block device" is like any other storage device that can be mounted. Under the hood, Ceph's drivers assemble the block device out of its OSDs.

Like any other physical device, a block device can also be formatted with a filesystem (like ext4 or NTFS). Since these filesystems are not cluster-aware by default, they can only be mounted by a single pod at a time.
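Outside of Kubernetes, the same flow can be sketched on a plain Linux host (image and mount point names are placeholders):

```shell
# Expose the image as a kernel block device (e.g. /dev/rbd0).
sudo rbd map rbd-pool/my-image

# Format it with ext4 and mount it like any other disk.
sudo mkfs.ext4 /dev/rbd0
sudo mkdir -p /mnt/my-image
sudo mount /dev/rbd0 /mnt/my-image
```

In Kubernetes, the CSI driver does the map/format/mount dance for you, but it is the same mechanism.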

Object Storage

An S3-compatible object store, speaking the same API as AWS S3, only backed by your personal Ceph cluster.

It's basically binary object storage, with helper libraries for almost every programming language out there.
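Because the API is S3-compatible, any stock S3 client works; you just point it at Ceph's RADOS Gateway instead of AWS. A sketch with the AWS CLI, where the endpoint, user, and bucket names are placeholders for your own setup:

```shell
# Create a gateway user; this prints the access/secret key pair.
radosgw-admin user create --uid=demo --display-name="Demo User"

# Point the standard AWS CLI at the gateway instead of AWS.
aws --endpoint-url http://rgw.example.local:7480 s3 mb s3://backups
aws --endpoint-url http://rgw.example.local:7480 s3 cp ./dump.sql s3://backups/
aws --endpoint-url http://rgw.example.local:7480 s3 ls s3://backups
```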

Shared file system

There is Ceph and then there is CephFS. CephFS is a POSIX filesystem layered on top of the same object store, coordinated by dedicated metadata servers. This extra layer allows access by multiple clients at the same time (just like classic NFS).
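Mounting it looks much like mounting NFS; several hosts (or pods) can mount the same tree simultaneously. Monitor address and keyring path below are placeholders:

```shell
# Mount CephFS with the kernel client.
sudo mkdir -p /mnt/cephfs
sudo mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs \
  -o name=admin,secretfile=/etc/ceph/admin.secret
```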

Ceph native vs Rook

Although there is the public Rook operator that deploys Ceph directly to Kubernetes with just a few commands, I'd advise against using it. From personal experience, Rook is not production ready, meaning you will encounter problems that are either incredibly hard or impossible to fix (due to the extra layer of complexity added by the immature operator).

An "operator" is the standardized way to extend the Kubernetes API. If at any point you want to automate a process in a way that exceeds core Kubernetes functionality, you can write an operator. However, that is a topic for another time.

Ceph itself, on the other hand, is production ready. Installing it manually gives you lots of control, which is exactly what you will need.

Bootstrapping the cluster

Series Overview

Tobias Hübner