Compute servers in the department

Currently, there are two compute servers. This page gives some pointers on how to use them. The computational-load status of the servers is available here.

Setup and SSH keys

You should have a .bash_profile and .bashrc file in your home directory for setting environment variables every time you log on (or similar for other shell types). If you need inspiration, you can look at /home/bovy/.bash_profile and /home/bovy/.bashrc. To avoid backend issues with matplotlib, you should have a file .matplotlib/matplotlibrc in your home directory that sets the backend to Agg. You can download a sample file here and change the backend line to use Agg.
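
As a minimal sketch of the matplotlib step, the shell function below writes a one-line matplotlibrc that selects the Agg backend. The function name write_agg_matplotlibrc is made up for this example and is not part of matplotlib. The default directory is the one named above, and the function refuses to overwrite an existing file.

```shell
# Hypothetical helper (not part of any package): create a minimal matplotlibrc
# that selects the non-interactive Agg backend.
write_agg_matplotlibrc() {
    # Default to the location named on this page; pass another directory as $1.
    rc_dir="${1:-$HOME/.matplotlib}"
    rc_file="$rc_dir/matplotlibrc"
    if [ -e "$rc_file" ]; then
        echo "$rc_file already exists; edit its backend line by hand" >&2
        return 1
    fi
    mkdir -p "$rc_dir" && printf 'backend: Agg\n' > "$rc_file"
}
```

Calling write_agg_matplotlibrc without an argument writes to $HOME/.matplotlib/matplotlibrc. If a file is already there, change its backend line yourself instead.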

To log on and use some of the other functionality, you should set up an SSH key, such that you can log on without having to type in your password. On your home computer (for example, your laptop which you use to log onto the compute servers or your office computer) do

$ ssh-keygen -t rsa

to generate a key. Use a long passphrase that you will remember. On the compute server you should create a .ssh directory in your home directory

$ mkdir .ssh
$ chmod 700 .ssh
$ cd .ssh

and create an authorized_keys file

$ touch authorized_keys
$ chmod 600 authorized_keys

Log out and copy your public key to the compute server

$ cat ~/.ssh/id_rsa.pub | ssh -l USERNAME SERVER.astro.utoronto.ca 'sh -c "cat - >> ~/.ssh/authorized_keys"'
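
The server-side steps above can also be collected in one function that is safe to run more than once. This is only a sketch: install_pubkey is a hypothetical name, and it assumes you have already copied your public key file to the server by some other means.

```shell
# Hypothetical helper: append a public key to authorized_keys with the
# permissions used above (700 on the directory, 600 on the file).
install_pubkey() {
    pubkey_file="$1"                 # e.g. a copy of your id_rsa.pub
    ssh_dir="${2:-$HOME/.ssh}"       # override only for experiments
    auth="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth" && chmod 600 "$auth" || return 1
    # Skip the append when this exact key line is already present.
    if ! grep -qxF -f "$pubkey_file" "$auth"; then
        cat "$pubkey_file" >> "$auth"
    fi
}
```

Because the key is only appended when it is missing, running the function twice leaves a single copy of the key in authorized_keys.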

You will also want to use a .ssh/config file on your home computer to make logging in easier. This file has entries such as

Host NICKNAME
     Hostname SERVER.astro.utoronto.ca
     User USERNAME
     ForwardAgent yes

Then you can log in simply as

$ ssh NICKNAME
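
If you manage several servers, a small function can add such entries for you. The sketch below uses a hypothetical helper, add_ssh_host, which appends an entry in the format shown above unless the nickname is already defined. It only looks for a line that reads exactly "Host NICKNAME", so differently formatted entries are not detected.

```shell
# Hypothetical helper: append a Host entry to an SSH client config file
# unless an entry with the same nickname already exists.
add_ssh_host() {
    nickname="$1"; hostname="$2"; user="$3"
    config="${4:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")" || return 1
    touch "$config" && chmod 600 "$config" || return 1
    if grep -qxF "Host $nickname" "$config"; then
        echo "Host $nickname is already defined in $config" >&2
        return 0
    fi
    # The leading blank line separates the new entry from earlier ones.
    cat >> "$config" <<EOF

Host $nickname
     Hostname $hostname
     User $user
     ForwardAgent yes
EOF
}
```

For example, add_ssh_host NICKNAME SERVER.astro.utoronto.ca USERNAME creates the entry shown above.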

On a Mac, Keychain will remember your key’s passphrase, so you won’t have to type it constantly. On Linux there are similar programs (for example, ssh-agent).

Installing software

You should install any software locally, including packages such as galpy and apogee. All of the users of this server might be using different (development) versions of this and other software, so it’s easiest if everybody installs locally as much as is practical. Create a local/ directory in your home directory

$ cd
$ mkdir local

and then install software to that directory. Make sure to add your local/bin directory to your path (export PATH=$HOME/local/bin:$PATH in your .bash_profile). For compiled programs this means that you typically do

$ ./configure --prefix=$HOME/local
...
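
One way to handle the PATH step is sketched below. The function path_prepend is a hypothetical helper for your .bash_profile that adds a directory only when it is not already listed, so that sourcing the file repeatedly does not keep growing PATH. The commented build commands show the usual autotools sequence, which is an assumption about the package you are building.

```shell
# Hypothetical helper for .bash_profile: put a directory at the front of PATH
# unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

path_prepend "$HOME/local/bin"

# A typical from-source build then looks like this (commented out because it
# must be run inside an unpacked source tree):
#   ./configure --prefix="$HOME/local"
#   make
#   make install
```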

For Python it is recommended that you install Miniconda and manage your own Python installation, rather than using the Python version provided by the OS (which is difficult to keep up-to-date).

Installing FERRE with the apogee package locally doesn’t work (the installer can’t figure out how to copy the necessary binaries to $HOME/local/bin). To install it, run the installation with --install-ferre, let it fail, and then copy the ferre and ascii2bin binaries to $HOME/local/bin yourself (and make them executable with chmod u+x ~/local/bin/ferre and chmod u+x ~/local/bin/ascii2bin).
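
The manual copy step might be scripted as follows. This is a sketch: install_ferre_bins is a hypothetical helper, and the directory in which the failed installation leaves the binaries is not stated here, so you have to pass it as the first argument.

```shell
# Hypothetical helper: copy the FERRE binaries into a local bin directory
# and make them executable for the owner.
install_ferre_bins() {
    src_dir="$1"                       # wherever the failed install left them
    bin_dir="${2:-$HOME/local/bin}"
    mkdir -p "$bin_dir" || return 1
    for exe in ferre ascii2bin; do
        if [ ! -f "$src_dir/$exe" ]; then
            echo "missing $src_dir/$exe" >&2
            return 1
        fi
        cp "$src_dir/$exe" "$bin_dir/$exe" && chmod u+x "$bin_dir/$exe" || return 1
    done
}
```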

Running NEMO

NEMO is installed on the server. To be able to use it, you can add the following line to your .bash_profile or .bashrc

if [ -z "$NEMO" ]; then source ~bovy/Repos/nemo/nemo_start.sh; fi

You can also install NEMO locally yourself, in which case you probably want to start from this GitHub version.

Running IPython/Jupyter notebooks/lab

You can run an IPython/Jupyter notebook on the server while manipulating and displaying it on your home computer using the following steps. First, log in to the compute server and start a notebook server on some port PORT (PORT should be something like 8889, but choose something different to avoid overlap with other users)

$ jupyter notebook --no-browser --port=PORT

or for Jupyter lab

$ jupyter lab --no-browser --port=PORT

Check that you don’t get an error message saying that PORT is already in use.

On your local (home) computer, open an SSH tunnel as follows

$ ssh -N -L localhost:8888:localhost:PORT NICKNAME

where NICKNAME is the same ssh shortcut as above. If you have an IPython notebook running locally you need to use a different local port (not 8888, because that will be used by the local notebook). Then open a tab in your browser and navigate to

localhost:8888

which displays the notebook server that is running remotely. To close the SSH tunnel, just terminate the ssh process that is running it.

You can start the remote notebook in a UNIX screen session. This way you can detach the screen and log out while the notebook server keeps running remotely, and you are also protected against losing the connection to the remote server while you are working in the notebook.

Note that computations in the notebook only keep running while you are connected to the notebook through your browser. The kernel itself does keep running when you are disconnected, so when you reconnect, the notebook will still have all of the variables etc. that you defined. Therefore, do not use this to run long computations.
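
Putting the pieces of this section together, a session might look like the sketch below. The function tunnel_command is a hypothetical helper that only prints the ssh command for a given remote port, nickname, and local port. The port numbers 8897 and 8890 and the screen session name jupyter are arbitrary examples.

```shell
# Hypothetical helper: print the tunnel command for a given remote port,
# SSH nickname, and (optionally) local port.
tunnel_command() {
    remote_port="$1"
    nickname="$2"
    local_port="${3:-8888}"   # choose another value if 8888 is taken locally
    printf 'ssh -N -L localhost:%s:localhost:%s %s\n' \
        "$local_port" "$remote_port" "$nickname"
}

# On the server, inside a named screen session (detach with Ctrl-a d,
# reattach later with "screen -r jupyter"):
#   screen -S jupyter
#   jupyter lab --no-browser --port=8897
# On the home computer, print the matching tunnel command:
#   tunnel_command 8897 NICKNAME 8890
```

With these example values you would then point your browser at localhost:8890 rather than localhost:8888.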

Adding users

Adding users is easily done by

$ sudo adduser NEW_USERNAME

which will ask for a new password; you can generate one using a service like RANDOM.ORG. The new user should change this password immediately. Make a scratch directory named after the new user and change its ownership to the new user with

$ sudo chown -R NEW_USERNAME NEW_USERNAME

The above only creates the account on one server. To make the same account on all servers, add the new user using useradd on each of the other servers after they have set up their password (don’t use adduser, just useradd). Then edit /etc/passwd and /etc/group such that the user’s entries are the same as on the first server, and copy the entry between the first two colons of the user’s line in /etc/shadow (note that you might have to make /etc/shadow writeable first).
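
To avoid copying the /etc/shadow entry by eye, the field can be extracted with awk. The sketch below uses a hypothetical helper, shadow_hash, which prints the text between the first and second colon of a user's line. It reads /etc/shadow by default, which requires root, and takes an optional second argument so that you can try it on a copy first.

```shell
# Hypothetical helper: print the password-hash field (the text between the
# first and second colon) of a user's line in a shadow-format file.
shadow_hash() {
    user="$1"
    shadow_file="${2:-/etc/shadow}"   # reading the real file requires root
    # Exit with status 1 when no line for the user is found.
    awk -F: -v u="$user" '$1 == u { print $2; found = 1 } END { exit !found }' \
        "$shadow_file"
}
```

The printed value is what has to end up in the same position on the other servers. One option that may work instead of editing the file by hand is usermod -p, which takes an already-hashed password; check the man page on your system before relying on it.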

To allow a user to use Docker, add them to the docker group

$ sudo usermod -aG docker NEW_USERNAME

To delete a user do

$ sudo deluser --remove-home NEW_USERNAME

on all servers. Also remove the scratch directory.

The server status page

The server status page is automatically generated, but might need a bit of care to keep going or restart. Essentially, a script exists in $HOME/monitor/monitor.sh that needs to be run on each server. This script uses data collected by sysstat, creates a graph, and copies it to the web server, where it is then automatically sorted by reverse date and displayed.

To make sure sysstat is running do

$ sudo systemctl start sysstat

To make sure sysstat is started whenever the server restarts, do

$ sudo systemctl enable sysstat

To check the status, do

$ systemctl status sysstat

and to specifically check that the periodic data collection is scheduled, do

$ systemctl list-timers | grep sysstat

If this is all working, then the data is being collected. All that is left is to make sure that the monitor.sh script is run automatically, so add to the crontab

*/10 * * * * /home/bovy/monitor/monitor.sh > /home/bovy/monitor/monitor_SERVER.log 2>&1

where SERVER is the server’s name. Note that everything in this section needs to be run on each server!
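
Because the crontab line differs per server only in the log file name, it can be generated rather than typed. In the sketch below, monitor_cron_line is a hypothetical helper, and using the short hostname as SERVER is an assumption; pass the name explicitly if your servers are named differently.

```shell
# Hypothetical helper: print the crontab line for this server, using the
# short hostname as SERVER unless a name is given as $1.
monitor_cron_line() {
    server="${1:-$(hostname -s)}"
    printf '*/10 * * * * /home/bovy/monitor/monitor.sh > /home/bovy/monitor/monitor_%s.log 2>&1\n' "$server"
}
```

To install the line without duplicating an existing one, you could run ( crontab -l 2>/dev/null | grep -vF monitor.sh; monitor_cron_line ) | crontab - on each server. Note that this drops any other crontab line that mentions monitor.sh.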