
How to Build A Private Storage Cluster (with Ceph)

Since my NAS (a QNAP TS-419P II) got more and more buggy, especially with non-working Windows shares and the painfully low processing power of its integrated single-core ARM, I wished for something like a SAN of my own. But a SAN is quite expensive, peripheral hardware (switches, UPS, …) not included. So I decided to skip a few levels and build a NAS 2.0 storage cluster based on the open-source Ceph, using the low-budget ODROID HC2 (octa-core: 4 x Cortex-A15 + 4 x Cortex-A7) from Hardkernel as the workhorse for the storage nodes. To make it even more dense, you can use the ODROID HC1, which is just the same but for 2.5″ disks (be aware of the power supply: HC2 = 12V, HC1 = 5V !!!).

If you don’t need a SATA drive (e.g. for the controlling nodes of the cluster: mgr, metadata, nfs, cifs,…), you can use the MC1, MC1 solo, XU4 or XU4Q.

If you want to go with x86 instead of ARM, the ODROID H2 looks like a great alternative, but it will also be a bit more expensive (e.g. RAM is not included).

In fact, installing Ceph is much less painful on 64-bit x86 than on 32-bit ARM. I decided to go with 32-bit ARM anyway, because I want to build the most energy-efficient cluster possible, maximizing the scale-out capabilities also in terms of my private budget.

To build up the cluster, I currently use 4 x ODROID HC2 with WD Red 4 TB drives (WD40EFRX), also installing the Ceph non-OSD services distributed across this little cluster. The BOM for my test cluster is as follows:

4 x ODROID HC2
4 x WD Red 4 TB (WD40EFRX)
4 x 12V/2A power supply

If you power up the cluster in sequence (not all at once), you can reduce the power requirements of the supply a lot (currently 12V/2A per node, 5V/4A for the HC1). I will dive into this topic a bit deeper in the future. I think it can be done in software by delaying the spin-up through some bootarg. Nevertheless, an optimal solution would be a power distribution unit that switches and measures the supply current and also provides some UPS capabilities on the low-voltage path. Additionally, current measurements would give you the ability to regulate the power through e.g. cpufreq to optimize the efficiency of the cluster and the power supply.

To generate the Debian packages for installing Ceph on the nodes, follow the instructions here. When you have built the Debian packages, move them over to some HTTP(S) server so they are easily accessible by your nodes.

Your Own Debian Package Repository

To be accessible, a Debian package repository needs to be placed in a webserver's directory reachable at least within your own network. It is best practice to secure this repository with SSL, since Debian APT more or less expects this… So first we start with creating a self-signed CA. Later, if needed, you can easily replace the certificate with an official one or let an authority also sign your server certificate.

Generating a CA for SSL

This part is based on the tutorial here. First, we will simply use self-signed certificates, since it is much easier and faster than using officially signed ones. We will then place the CA in the certificate store of our Linux OS, to make it trust ourselves. 😉

mkdir ~/CA
cd ~/CA
# Generate the CA key
openssl genrsa -out ca.key 4096
# Generate the CA certificate, here, you can leave the CN empty
openssl req -new -x509 -key ca.key -days 366 -out ca.crt
# Make it inaccessible to other users
chmod 700 ca.key
# Generate a certificate configuration
wget https://raw.githubusercontent.com/the78mole/scripts/master/templates/configs/ssl/cert.conf -O example.org.conf
# Edit the configuration
vi example.org.conf
# Create a server certificate key and the signing request (not yet the cert)
openssl req -new -out example.org.csr -config example.org.conf
# Create the public key
openssl rsa -in example.org.key -pubout -out example.org.pubkey
# Sign the CSR with your CA and create the certificate
openssl x509 -req -in example.org.csr -CA ca.crt -CAkey ca.key -CAcreateserial -extensions my_extensions -extfile example.org.conf -days 366 -out example.org.crt
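
If you want to double-check what ended up in the certificate (especially the alternative names discussed next), you can inspect it with openssl; a quick sanity check, nothing more:

openssl x509 -in example.org.crt -noout -text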

To get the alternative DNS names and IPs added to the certificate, you need to pass the config file via -extfile and point -extensions at the config section where the extensions are located. This is necessary because openssl ignores the extensions in the CSR when signing, so you have to specify them explicitly.
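
For reference, the relevant part of example.org.conf could look like the following sketch (the section name my_extensions matches the -extensions flag above; the host names and the IP are placeholders to adapt):

[ my_extensions ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = example.org
DNS.2 = www.example.org
IP.1 = 192.168.1.10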

After generating the certificate, you need to import it wherever you need it to be accepted (browser, APT). For testing, it is best to try with a browser. Some tutorials can be found here (it's German, so use Google Translate to read it in English) and here. Use the shell of your desktop Debian system.

scp <CA_HOST>:/<PATH_TO_CA>/ca.crt example_ca.pem
# update-ca-certificates only picks up files with a .crt extension
sudo cp example_ca.pem /usr/local/share/ca-certificates/example_ca.crt
sudo update-ca-certificates

To add the certificate to your browser, e.g. Chromium:

sudo apt install libnss3-tools
certutil -A -n "Example Company CA" -t "TCu,Cu,Tu" -i example_ca.pem -d ~/.pki/nssdb
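
You can verify the import by listing the NSS database (just a convenience check):

certutil -L -d ~/.pki/nssdb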

Note: Maybe this does not work correctly… In that case, in Chromium, use Settings –> Privacy and Security –> Manage Certificates –> Import –> select the CA –> check all boxes.

Now we need the CA’s and the server’s certificate along with the server key for securing webserver traffic.

Install and Configure the Webserver

Welp, we will use nginx as our webserver. Feel free to use any other, it does not really matter. In fact, every further step (e.g. the Let's Encrypt tutorial) will be based on nginx.

sudo apt install nginx
cd /etc/nginx
sudo cp snippets/snakeoil.conf snippets/ssl_example.org.conf
sudo mkdir -p ssl/pub
sudo mkdir -p ssl/priv
sudo chown -R root:www-data ssl
sudo chmod -R 0755 ssl/pub
sudo chmod -R 0750 ssl/priv
sudo cp ~/CA/ca.crt ~/CA/example.org.crt ssl/pub/
sudo cp ~/CA/example.org.key ssl/priv/
# Edit the ssl config file to your needs
sudo vi snippets/ssl_example.org.conf
# Now adjust the nginx configuration to use SSL
sudo vi sites-enabled/default
# Ensure the following lines are present and not commented out
# listen 443 ssl default_server;
# listen [::]:443 ssl default_server;
# include snippets/ssl_example.org.conf;
sudo service nginx restart
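
The SSL snippet itself mainly has to point nginx at the key material we just copied; a minimal sketch of snippets/ssl_example.org.conf, assuming the paths created above:

ssl_certificate /etc/nginx/ssl/pub/example.org.crt;
ssl_certificate_key /etc/nginx/ssl/priv/example.org.key;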

When everything is OK, use your desktop web browser and point it to https://example.org. You should get the page without an error. This means you have set up a CA you can use to sign server certificates, and they get trusted.
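
You can also test from the shell; curl uses the system certificate store we just updated, so it should connect without complaints:

curl -v https://example.org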

If you plan to use the Debian package repository on many of your Linux hosts, then you should add your CA certificate to the certificate store on all the machines.
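
A small loop does the distribution, assuming SSH access and hypothetical host names:

for h in node1 node2 node3 node4; do
    scp example_ca.pem $h:/tmp/
    ssh $h 'sudo cp /tmp/example_ca.pem /usr/local/share/ca-certificates/example_ca.crt && sudo update-ca-certificates'
done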

Generating a GnuPG Key Pair

To sign a file, email, hash, Debian package, repository, … you often need GnuPG. To be able to sign something, you first need to generate your own key that is trusted by at least the receiving party. All this again works with asymmetric encryption, like the signing of certificates does. An in-depth tutorial with links to even deeper knowledge can be found here.

First we should install a tool to gather some entropy, otherwise gnupg may not be able to generate a key on a headless system (no real user input, … –> very few entropy sources).

apt install rng-tools

If it cannot find a hardware RNG, you can still try to get randomsound working (if you have a soundcard…):

apt install randomsound

Run this in a separate window when gpg is collecting entropy for too long. gpg will abort after some time if it cannot generate the key.

arecord -l # Do you have any soundcard?
randomsound -v

If all else fails, you can still pipe some data into /dev/random to feed the entropy pool, e.g. with the following (also in a separate window while gpg --gen-key is running):

sudo dd if=/dev/sda of=/dev/random status=progress

You can watch the entropy pool with:

watch -n 0.5 cat /proc/sys/kernel/random/entropy_avail

To finally generate a GPG key, simply follow the instructions below:

apt install gnupg
# Create the .gnupg directory easily and add a secure configuration
gpg --list-keys --fingerprint
wget https://raw.githubusercontent.com/the78mole/scripts/master/templates/configs/gnupg/gpg.conf -O ~/.gnupg/gpg.conf
gpg --full-gen-key
# Select:
# Key type : RSA and RSA
# Keysize : 4096
# Expiration: 1y
# Then enter your name and email, but don't include a comment
# Skipping the password makes CI much easier, but less secure...
# It will take some time (maybe minutes) to generate the key
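
The nodes will later need the public part of this key to verify the repository signature; exporting it is a one-liner (the email address and file name are placeholders):

gpg --armor --export you@example.org > example_repo.pub.asc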

Creating the Debian Repository (reprepro)

make-debs already created a Debian repository, but we will create one that is more general and also serves well for other software packages.
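
Since this part is still unfinished, here is only a rough sketch of where it is heading: reprepro keeps its settings in a conf/distributions file, and packages get pulled in with includedeb. The repository path, the Debian codename and the key ID are assumptions:

mkdir -p /var/www/repo/conf
cat > /var/www/repo/conf/distributions <<EOF
Origin: example.org
Label: example.org
Codename: stretch
Architectures: armhf amd64
Components: main
Description: Private package repository
SignWith: <YOUR_GPG_KEY_ID>
EOF
# Add the freshly built ceph packages
reprepro -b /var/www/repo includedeb stretch *.deb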

…. to be continued …

Coming soon: To add some real NAS features, we could use just another embedded board with e.g. FreeNAS or Nextcloud installed, mounting the cluster file system and using the cluster as the storage backend. We already have nginx SSL configured, so we can easily add reverse proxy targets… (for HTTPS-HTTPS proxying, see here)
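
As a teaser, a reverse proxy target in nginx is only a few lines in the site config; the location and the upstream host here are made up:

location /nextcloud/ {
    proxy_pass https://nextcloud.internal.example.org/;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}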

Compile Ceph (master) on ARM (32-Bit)

TODO: Test this all on a virgin armhf system (Raspberry Pi, ODROID HC1/HC2/XU4, …) and complete the TODOs for OpenSSL and PhantomJS (and the sass dependency). Maybe with the new master tree it is no longer necessary to build them outside the ceph repo.

First install prerequisites:

sudo apt install python-pip build-essential libgmp-dev \
libmpfr-dev libmpc-dev reprepro

Install Node.js from nodejs.org:

curl -sL https://deb.nodesource.com/setup_11.x | sudo -E bash -
sudo apt-get install -y nodejs
sudo npm install -g npm

Then prepare a swap file (you will need it 😉 ):

dd if=/dev/zero of=/<some-hdd-path>/swapfile \
bs=1M count=8192 status=progress
chmod 600 /<some-hdd-path>/swapfile
mkswap /<some-hdd-path>/swapfile
swapon /<some-hdd-path>/swapfile
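
If the swap file should survive a reboot, an fstab entry does the trick (same placeholder path as above):

echo '/<some-hdd-path>/swapfile none swap sw 0 0' >> /etc/fstab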

Then we should install some dependencies:

sudo apt install libgmp-dev libmpfr-dev libmpc-dev ruby

Now install a new GCC that supports C++17.

wget https://ftp.gnu.org/gnu/gcc/gcc-8.2.0/gcc-8.2.0.tar.xz
tar xfJ gcc-8.2.0.tar.xz
cd gcc-8.2.0
./configure --prefix=/usr/local/gcc-8.2 --program-suffix=-8.2 # for armhf
# ./configure --prefix=/usr/local/gcc-8.2 --program-suffix=-8.2 --disable-multilib # for x86_64/arm64
make
sudo make install

Building Ceph with do_cmake, building a Debian package with make-debs.sh, or simply building packages with a compiler other than the Debian default (6.3.0) requires you to change the default compiler, e.g. to gcc-8.2.0, for the whole system:

sudo update-alternatives --install /usr/bin/cc cc /usr/local/gcc-8.2/bin/gcc-8.2 50
sudo update-alternatives --install /usr/bin/c++ c++ /usr/local/gcc-8.2/bin/g++-8.2 50
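
A quick sanity check that the switch worked (cc and c++ are what the build scripts will pick up):

cc --version
c++ --version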

Check out OpenSSL-1_0_2-stable (this seems to be necessary on armhf, too) and PhantomJS, then compile and install them:

cd /opt/GIT
git clone git@github.com:openssl/openssl.git
cd openssl
git checkout OpenSSL-1_0_2-stable
...TODO...
# Following seems only necessary on arm
# (or all platforms without a precompiled binary)
cd /opt/GIT
git clone git@github.com:ariya/phantomjs.git
cd phantomjs
...TODO...
sudo LD_LIBRARY_PATH=/opt/openssl_build_stable/lib/ \
deploy/package.sh --bundle-libs

Add the following to build.py (at L:244, just after platformOptions.extend):

phantom_openssl = os.getenv("PHANTOM_OPENSSL_PATH", "")
if phantom_openssl != "":
    # os.putenv alone would not update os.environ, so set it via os.environ
    os.environ["OPENSSL_LIBS"] = "-L" + phantom_openssl + "/lib -lssl -lcrypto"
    openssl_include = "-I" + phantom_openssl + "/include"
    openssl_lib = "-L" + phantom_openssl + "/lib"
    platformOptions.extend([openssl_include, openssl_lib])
    print("Using OpenSSL at %s" % phantom_openssl)
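
With that patch in place, the locally built OpenSSL can be selected through the new environment variable (path taken from the package.sh call above):

export PHANTOM_OPENSSL_PATH=/opt/openssl_build_stable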

Then install it to /opt

Build and compile Ceph

git clone git@github.com:the78mole/ceph.git
cd ceph
git checkout wip-32-bit-arm-fixes
./install-deps.sh
./do_cmake_arm32.sh # for armhf
# ./do_cmake.sh # for x86_64/amd64 or arm64
cd build
make -j4
# if it gets really slow due to swapping, break and do make -j1
# or use the scheduler-script from link below

Here you can find a rudimentary (but working) script that suspends compiler processes based on the compilers' total memory consumption. Running it through the 'watch' tool, you can start e.g. 8 tasks, and when the memory limit is reached, it will suspend (kill -TSTP) the youngest tasks in terms of user-space runtime.
https://github.com/the78mole/scripts/blob/master/linux/bash/schedule_compile.sh

Now do…

cd ..   # Back to ceph base dir
./make-debs-arm32.sh # for armhf
# ./make-debs.sh # for x86_64/amd64 or arm64

If you encounter problems with setuptools (Exception –> TypeError: unsupported operand type(s) for -=: 'Retry' and 'int'), try to get a more recent version of Python pip with the following commands and rerun make-debs-arm32.sh.

apt-get remove python-pip python3-pip
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
python3 get-pip.py

If I forgot anything needed to make it work, feel free to write a comment…

Compiling Software on RAM-limited Multi-Core Systems

Since I often compile stuff on embedded ARM targets that are well equipped with processing power (Exynos octa-core) but neglected regarding RAM (2G), I often face the trade-off between running multiple or only a single/few compilation jobs (make -j8 vs. make -j1). If you start too many jobs and have large compilation units (e.g. the Ceph project sources), the system will feel jammed as soon as it begins swapping.

I feel that deciding the job count at the very beginning is (was) a trade-off I was not willing to accept. Therefore, I decided to write a little script that suspends compile processes once they cross a certain memory limit. This way, the suspended processes get moved to swap and the still-running processes get a comfortable amount of RAM. The kernel is then not forced to move pages around with every scheduling round. Instead, it moves them to swap once for the suspended processes when it needs RAM for the running ones, and as soon as the processes with a large memory footprint finish, the suspended ones come back to life.

Welp, I decided to base the priority on the user-space processing time the processes have eaten up, so the older ones (often the most memory-hungry ones) get processed first. This scheduling scheme proved to be an optimal decision that is also not hard to implement as a bash script.
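
To illustrate the idea, here is a minimal sketch of one scheduling round (the real script linked below handles more corner cases; the process names and the memory limit are assumptions):

#!/bin/bash
# Suspend the youngest compiler process when the compilers' total RSS
# exceeds an assumed limit, resume a stopped one when there is room again.
LIMIT_KB=$((1200 * 1024))

# Sum the resident memory of all cc1/cc1plus compiler processes
TOTAL_KB=$(ps -C cc1,cc1plus -o rss= | awk '{s+=$1} END {print s+0}')

if [ "$TOTAL_KB" -gt "$LIMIT_KB" ]; then
    # Over the limit: suspend the compiler with the least CPU time so far
    YOUNGEST=$(ps -C cc1,cc1plus -o pid= --sort=time | head -n 1)
    [ -n "$YOUNGEST" ] && kill -TSTP $YOUNGEST
else
    # Below the limit: resume one stopped compiler (process state T)
    STOPPED=$(ps -C cc1,cc1plus -o pid=,stat= | awk '$2 ~ /^T/ {print $1; exit}')
    [ -n "$STOPPED" ] && kill -CONT $STOPPED
fi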

Here you can find the little script; it needs to be run within a loop or simply with the watch tool (maybe with sudo):

watch scripts/linux/bash/schedule_compile.sh

Happy compiling!