Faster Galaxy with uWSGI

I recently switched our local Galaxy server to run under uWSGI and supervisord instead of the standard setup (which uses Paste under the hood). I followed the Galaxy scaling guide and it was pretty accurate, except for a few details. I won’t be showing the changes to the Galaxy config files; they are exactly as described on that page.

I installed supervisord by doing pip install supervisor in the virtualenv that Galaxy uses. Then I put a supervisord.conf in the config/ directory of our Galaxy install and it starts like this:
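Roughly like this (the port matches the description below; the [rpcinterface:supervisor] lines are the stock boilerplate from echo_supervisord_conf, which supervisorctl needs in order to talk to supervisord):

[inet_http_server]
port = 127.0.0.1:9001

[supervisord]

[supervisorctl]

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface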




The [inet_http_server] section directs supervisord to listen on localhost port 9001. The following two sections, [supervisord] and [supervisorctl], need to be present but can be empty. The rest of the configuration is as given on the Scaling page, with a few changes I’ll explain below:

[program:galaxy_uwsgi]
command         = /opt/galaxy/.venv/bin/uwsgi --plugin python --ini-paste /opt/galaxy/config/galaxy.ini --die-on-term
directory       = /opt/galaxy
umask           = 022
autostart       = true
autorestart     = true
startsecs       = 10
user            = galaxy
environment     = PATH=/opt/galaxy/.venv:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin,PYTHON_EGG_CACHE=/opt/galaxy/.python-eggs,PYTHONPATH=/opt/galaxy/eggs/PasteDeploy-1.5.0-py2.7.egg,SGE_ROOT=/var/lib/gridengine
numprocs        = 1
stopsignal      = TERM

[program:handler]
command         = /opt/galaxy/.venv/bin/python ./scripts/paster.py serve config/galaxy.ini --server-name=handler%(process_num)s --pid-file=/opt/galaxy/handler%(process_num)s.pid --log-file=/opt/galaxy/handler%(process_num)s.log
directory       = /opt/galaxy
process_name    = handler%(process_num)s
numprocs        = 2
umask           = 022
autostart       = true
autorestart     = true
startsecs       = 15
user            = galaxy
environment     = PYTHON_EGG_CACHE=/opt/galaxy/.python-eggs,SGE_ROOT=/var/lib/gridengine

The SGE_ROOT variable is necessary because our cluster uses Sun Grid Engine and the SGE DRMAA library requires it. Otherwise, this config uses uWSGI installed (using pip) in the virtualenv that Galaxy uses.
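For the record, getting uWSGI into that virtualenv is just a pip install (the virtualenv path is the one used throughout this post):

/opt/galaxy/.venv/bin/pip install uwsgi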

This snippet of nginx configuration shows what was commented out and what was added to link nginx to uWSGI:

#proxy_set_header REMOTE_USER $remote_user;
#proxy_set_header X-Forwarded-Host $host;
#proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
#proxy_set_header X-URL-SCHEME https;
#proxy_pass http://galaxy_app;
uwsgi_param UWSGI_SCHEME $scheme;
include uwsgi_params;
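Not shown above is the directive that actually hands requests over to uWSGI: somewhere in the same location block there needs to be a uwsgi_pass pointing at whatever socket the [uwsgi] section of galaxy.ini listens on (the address below is just a placeholder):

uwsgi_pass 127.0.0.1:4001;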

Then, how to start and stop it all? Firstly, the supervisord startup script. The basis for this was the debian-norrgard script from the supervisord initscripts repository. The final script is in this gist. Note these lines:
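In spirit, these are variables that point the script at the supervisord installed in Galaxy’s virtualenv and at the config file described above (the exact variable names are whatever the gist uses; the values here are the paths from this post):

SUPERVISORD=/opt/galaxy/.venv/bin/supervisord
SUPERVISORCTL=/opt/galaxy/.venv/bin/supervisorctl
SUPERVISORD_CONF=/opt/galaxy/config/supervisord.conf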


They link supervisord to Galaxy settings. Then /etc/init.d/galaxy is in this gist. It depends on the supervisord startup script and starts and stops Galaxy using supervisorctl.
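Doing the same thing by hand, with supervisorctl talking to supervisord over the HTTP port configured earlier, looks like this:

/opt/galaxy/.venv/bin/supervisorctl -s http://localhost:9001 status
/opt/galaxy/.venv/bin/supervisorctl -s http://localhost:9001 stop all
/opt/galaxy/.venv/bin/supervisorctl -s http://localhost:9001 start all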

Two things remain unsatisfactory:

  1. The shutdown of Galaxy doesn’t work reliably. The use of uWSGI’s --die-on-term and stopsignal = TERM in the supervisord.conf is an attempt to remedy this.

  2. The uWSGI config relies on the PasteDeploy egg. This exists on our Galaxy server because it was downloaded by the historical Galaxy startup script. With the switch towards wheel-based (instead of egg-based) packages, that script is no longer part of a Galaxy install. The uWSGI settings might need to change because of this; however, since the PasteDeploy package is also installed in the virtualenv that Galaxy uses (see the check below), perhaps no change is necessary. I haven’t tested this.
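A quick way to do that check is to ask pip what is installed in the virtualenv:

/opt/galaxy/.venv/bin/pip show PasteDeploy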

Despite these limitations, our Galaxy server is working and is much more responsive than before.

How to submit a job to the SANBI computing cluster

I keep trying to finish the documentation about our computing cluster at SANBI and SGE (Sun Grid Engine)[1] and how to run jobs. In the meantime, however, here’s how to run a job on the cluster at SANBI.

Firstly, the structure of the cluster. Our storage, for now, is provided by a storage server and shared across the whole cluster, which means that your home directory and the /cip0 storage area are visible on every node. We still need to implement better research data management practices, but for now you should do your work in a scratch directory and store your results in your research directory. I’m not going to talk more about that here because the system is in flux.

Secondly, the cluster has a number of compute nodes and a single submit node. The submit node is where you log in to submit your jobs. It is a smallish virtual machine, so don’t run anything substantial on it!

So let’s imagine that you want to run a tool like fastqc on the cluster. First, is the tool available? We use a system called environment modules to manage the available software. This allows us to install software in a central place and just add the relevant environment variables when you need to run a particular tool. The module avail command lists the available modules, so module avail 2>&1 |grep fastqc will show us that fastqc is indeed available.

Next we need a working directory (e.g. /cip0/scratch/pvh) and a script that will run the command. Here is a little script (let’s imagine it is called run_fastqc.sh):


#!/bin/bash
# make the module command available, then load the fastqc module
. /etc/profile.d/
module add fastqc

# fastqc expects its output directory to exist already
if [ ! -d out ] ; then
    mkdir out
fi

fastqc -t 1 -out `pwd`/out -f fastq data.fastq

The . /etc/profile.d/ ensures that the module command is available. The module add fastqc adds the path to fastqc so that it is available to our script. You can use module add outside a script if you want to examine how to run a command. It also sets a variable, FASTQC_HOME, pointing to where the command is installed, so you can ls $FASTQC_HOME to see if there is perhaps a README or other useful data in that directory.
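Put together, that interactive exploration looks like this:

module avail 2>&1 | grep fastqc   # is fastqc available as a module?
module add fastqc                 # make fastqc available in this shell
ls $FASTQC_HOME                   # look for a README or other useful files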

Then, fastqc needs you to create the output directory (it won’t do that itself), so the script creates a directory named out under the current directory. Now you need to send the script to the cluster’s scheduler:

qsub -wd $(pwd) -q all.q -N myfastqc run_fastqc.sh

This will send the script to the scheduler and tell it to run on all.q (which happens to be the default queue), with the job named myfastqc. This queue has a time limit of 8 hours, so if you need to run for longer than that you need to use -q long.q instead. The long.q queue has no time limit but fewer CPUs available. The -wd flag sets the job’s working directory, in this example the directory you are in when you submit the job.

You can check the status of your job with qstat. Job output is, by default, written into the working directory in two files, one for stderr and the other for stdout.
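For example (the job ID is whatever number qsub printed when you submitted):

qstat            # list your pending and running jobs
qstat -j 12345   # detailed information on a single job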

There’s much more to say about the cluster and the use of the qsub, qstat, qacct and qhost commands, but this should be enough to get you started with your first cluster job. The rest will have to wait till I’ve got time to write more extensive documentation.


[1] Sun Grid Engine is part of the Grid Engine family of job schedulers, which has undergone a complex evolution over the last few years due to Sun’s takeover by Oracle and the subsequent forking of the codebase. See the Grid Engine Wiki for details.