Installing Slurm on CentOS using Ansible

Helping the UWC Student Cluster Challenge team prepare for the final round (at the CHPC National Meeting) has given me an excuse to play with some new toys: I’ve got a mini-cluster of three VMs running on my laptop (using KVM and libvirt), and I’ve been looking into Slurm as a cluster scheduler. At SANBI we run SGE and lots of other people use Torque, but I’ve been interested in Slurm for a while, because it’s a fully open source scheduler with some big-name users and, seemingly, a bright future. I’m also big into systems administration task automation: at SANBI we use puppet (and I personally use fabric), but Bruce Becker introduced me to Ansible, so I took the opportunity to build an Ansible playbook to install Slurm on my mini-cluster.

Ansible playbooks are written in YAML and describe a set of tasks that need to be applied to a set of servers. These tasks are defined in terms of a set of modules and executed over ssh (optionally using ZeroMQ to speed up data transfer), so you need ssh access to the machines you want to administer. I did this by adding my ssh key to the authorized_keys file of the root user on each node of my mini-cluster. Like puppet recipes, ansible playbooks are (largely) declarative: you specify what you want, not how to achieve it. Unlike puppet recipes, ansible playbooks run tasks in order, first to last.

So my cluster has three nodes: head (the head node) and two workers, worker1 and worker2. These are on a private (virtual) LAN with DNS provided by the head node (so the DNS names are head.cluster, etc.). My laptop is the VM host, and the IP addresses of all the nodes are in /etc/hosts on the laptop. The laptop has Ansible 1.4 installed from Rodney Quillo’s PPA; this is crucial, because I use a bunch of features that are only available in 1.4.
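
For reference, the /etc/hosts entries on the laptop look something like the following; the addresses are purely illustrative (libvirt’s default NAT network is 192.168.122.0/24), not necessarily the ones I actually use:

192.168.122.10 head.cluster head
192.168.122.11 worker1.cluster worker1
192.168.122.12 worker2.cluster worker2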

The Slurm Quick Start Administrator Guide outlines the steps needed to install Slurm in a general way. First, I downloaded Slurm 2.6.4 and then installed the dependencies I needed to compile it:

openmpi-devel pam-devel hwloc-devel rrdtool-devel ncurses-devel munge-devel
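
On CentOS these can be pulled in with yum (depending on your release, some of them, e.g. munge-devel, live in the EPEL repository), roughly:

yum install openmpi-devel pam-devel hwloc-devel rrdtool-devel ncurses-devel munge-devel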

This is not an exhaustive list: I had previously installed software on the nodes, so there might be stuff I left off this list. To identify missing dependencies, look at the config.log after the configure stage and search for WARNING messages. I unpacked Slurm and did:

./configure --prefix=/opt/slurm
make
sudo make install

This installed Slurm in /opt/slurm. I then created an archive of the slurm install:

cd /opt
tar jcf /var/tmp/slurm-bin.tar.bz2 slurm/*

I deployed this with Ansible to the nodes in the cluster. My Ansible setup uses a hostfile (/etc/ansible/hosts) that defines the hosts and host groups:

[workers]
worker1.cluster
worker2.cluster
[head]
head.cluster
[cluster:children]
head
workers

(You don’t need to use a system-wide hosts file like I did; you can specify an alternative hostfile with the -i flag on the ansible-playbook command line.) This was my initial ansible playbook (slurm.yml):

---
- hosts: cluster
  remote_user: root
  tasks:
  - name: create slurm user
    user: name=slurm createhome=no home=/opt/slurm shell=/sbin/nologin state=present

  - name: install slurm dependencies
    yum: name={{ item }} state=present
    with_items:
    - pam
    - hwloc
    - rrdtool
    - ncurses
    - munge

  - name: create slurm directories
    file: path=/var/spool/slurmd owner=slurm mode=0755 state=directory

  - name: copy slurm binaries to /tmp
    copy: src=slurm-bin.tar.bz2 dest=/tmp/slurm-bin.tar.bz2

  - name: unpack slurm binary distribution
    command: /bin/tar jxf /tmp/slurm-bin.tar.bz2 chdir=/opt creates=/opt/slurm

  - name: install slurm configuration file
    copy: src=slurm.conf dest=/opt/slurm/etc/slurm.conf
    notify: restart slurm

  - name: install slurm path file in /etc/profile.d
    copy: src=slurm.sh dest=/etc/profile.d/slurm.sh mode=0755 owner=root

  - name: install slurm startup script in /etc/init.d
    copy: src=init.d.slurm dest=/etc/init.d/slurm mode=0755 owner=root

  - name: enable munge service
    service: name=munge state=started enabled=yes

  - name: enable slurm service startup
    service: name=slurm state=started enabled=yes

  handlers:
  - name: restart slurm
    service: name=slurm state=restarted

As mentioned previously, ansible playbooks are read top down. So the steps taken are:

  1. Create slurm user.
  2. Install slurm dependencies (the non-devel versions of the previously mentioned packages).
  3. Create slurm spool directory (/var/spool/slurmd) and make it owned by the slurm user.
  4. Upload and unpack the slurm-bin.tar.bz2 that was previously created.
  5. Install the slurm configuration file to /opt/slurm/etc/slurm.conf. The first draft of this was created with the Slurm 2.6 configuration tool and the final version is here (a minimal sketch of what such a file contains follows this list).
  6. Install the slurm.sh script to /etc/profile.d that sets the PATH to include slurm binaries. This file contains:

    PATH=$PATH:/opt/slurm/bin:/opt/slurm/sbin

    if [ -z "$MANPATH" ] ; then
        MANPATH=/opt/slurm/share/man
    else
        MANPATH=$MANPATH:/opt/slurm/share/man
    fi
    export PATH MANPATH

  7. Install init.d.slurm from the slurm distribution’s etc/ directory to /etc/init.d/slurm. This handles start/stop of both slurmctld (on head) and slurmd (on the worker nodes).
  8. Ensure that the munge daemon is started. I had previously generated a munge key using the instructions on the munge website and distributed it to /etc/munge/munge.key on each of the nodes in the cluster with another ansible playbook (not shown); a sketch of the key-generation commands follows this list.
  9. Once the init script is installed, the slurm service is enabled (the equivalent of chkconfig slurm on) and the slurm daemons are started (service slurm start).
  10. The playbook was then split up into Ansible roles, and roles were added to create an NFS server on the head node, share /home from it, and mount it over /home on the worker nodes.
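
Regarding step 5, I won’t reproduce the full configuration file, but a minimal slurm.conf for a head node plus two workers looks roughly like this (the node and partition details below are illustrative, not my actual settings):

# minimal illustrative slurm.conf, not the actual config used on this cluster
ClusterName=cluster
ControlMachine=head
AuthType=auth/munge
SlurmUser=slurm
SlurmdSpoolDir=/var/spool/slurmd
StateSaveLocation=/var/spool/slurmd/state
NodeName=worker[1-2] CPUs=1 State=UNKNOWN
PartitionName=debug Nodes=worker[1-2] Default=YES MaxTime=INFINITE State=UP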
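
Regarding step 8, the munge key was generated following the munge documentation; one of the methods it suggests is roughly the following (run once, then the same key file is copied to every node):

# generate a random 1024-byte key, readable only by the munge user
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown munge:munge /etc/munge/munge.key
chmod 0400 /etc/munge/munge.key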

Once this was all up and running, the system was tested by using sbatch to run a simple script. Here’s the script, hello.sh:

#!/bin/sh

echo Hello World

This was submitted with the command:

sbatch hello.sh
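
Slurm writes a job’s stdout to a file named slurm-<jobid>.out in the submission directory by default, so checking that the job actually ran looks something like this (not part of the original test, just the standard commands):

squeue            # shows the job while it is pending or running
cat slurm-*.out   # should contain "Hello World" once the job finishes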

After that worked, the AMG benchmark was run using MPI. Here’s the test_amg.sh script:

#!/bin/sh
echo NTASKS $SLURM_NTASKS
LD_LIBRARY_PATH=/usr/lib64/openmpi/lib
export LD_LIBRARY_PATH
mpirun src/test/amg2013 -laplace -P 1 1 $SLURM_NTASKS -n 64 64 64 -solver 2

and run using:

sbatch -n 2 test_amg.sh

Compared to my experience with SGE, Slurm seems to run jobs really fast, and compared to Torque+Maui it seems pretty easy to set up.

As mentioned above, I switched my playbook over to using Ansible roles. Roles allow you to split the components of your configuration out into a particular directory structure and then mix these into your final playbook. So the role structure I currently have is:

roles
├── munge
│   ├── files
│   │   └── munge.key
│   ├── handlers
│   │   └── main.yml
│   └── tasks
│       └── main.yml
├── nfs-client
│   └── tasks
│       └── main.yml
├── nfs-common
│   └── tasks
│       └── main.yml
├── nfs-server
│   ├── files
│   │   └── exports
│   ├── handlers
│   │   └── main.yml
│   └── tasks
│       └── main.yml
└── slurm
    ├── files
    │   ├── init.d.slurm
    │   ├── slurm-bin.tar.bz2
    │   ├── slurm.conf
    │   └── slurm.sh
    ├── handlers
    │   └── main.yml
    └── tasks
        └── main.yml
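
The exports file shipped by the nfs-server role is just an ordinary /etc/exports; the options below are a guess at a sensible default rather than a copy of the real file:

/home worker1.cluster(rw,sync,no_root_squash) worker2.cluster(rw,sync,no_root_squash)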

Effectively, what Ansible roles do is split the sections of your playbook out into a directory structure (a sketch of one role’s tasks file is shown after the playbook below). The roles are then used in the final playbook (slurm.yml):

---
- hosts: cluster
  remote_user: root
  roles:
  - munge
  - slurm
  - nfs-common
  tasks:
  - name: disable firewall
    service: name=iptables enabled=no state=stopped

- hosts: head
  remote_user: root
  roles:
  - nfs-server

- hosts: workers
  remote_user: root
  roles:
  - nfs-client
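
Inside a role, the tasks live in tasks/main.yml as a bare task list, without the hosts: or tasks: headers of a full play. As an illustration (a sketch of how the munge role could look, not a verbatim copy of mine), roles/munge/tasks/main.yml would contain something like:

---
# roles/munge/tasks/main.yml (sketch): munge.key is picked up automatically
# from the role's files/ directory by the copy module
- name: install munge
  yum: name=munge state=present

- name: install munge key
  copy: src=munge.key dest=/etc/munge/munge.key owner=munge mode=0400
  notify: restart munge

- name: start and enable munge
  service: name=munge state=started enabled=yes

with the matching restart munge handler living in roles/munge/handlers/main.yml.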

And finally I’m at a stage where I can run:

ansible-playbook slurm.yml

And have the complete infrastructure required for a Slurm install set up on my virtual cluster.
[edited to add whitespace to Ansible playbooks as per suggestion from Michael de Haan @laserllama]