Creating an MPI Cluster

Last Updated : 07 Feb, 2022

In this article, we will discuss how to create an MPI Cluster.

To set up a cluster in a local environment, the same version of Open MPI must be installed on every machine.

Prerequisites

Operating System: Ubuntu 18.04.
MPI: Either Open MPI or MPICH can be used. This tutorial uses Open MPI (version 2.1.1). To list the available Open MPI versions and install it:

apt list -a openmpi-bin
sudo apt-get install openmpi-bin   # install Open MPI

Steps to Create an MPI Cluster

Step 1: Configure your hosts file

The computers need to communicate with each other, and we don’t want to type IP addresses every time. Instead, we can give a name to each node in the network that we wish to communicate with. The hosts file (/etc/hosts) is used by the operating system to map hostnames to IP addresses.

Example of a hosts file on the manager node:

sudo nano /etc/hosts

Add the manager and worker IP addresses to that file:

#MPI CLUSTERS
172.20.36.120 manager
172.20.36.153 worker1
172.20.36.143 worker2
172.20.36.116 worker3

For a worker node, it is enough to list the manager and the worker itself.

Example of a hosts file for worker2:

#MPI CLUSTER SETUP
172.20.36.120   manager
172.20.36.143   worker2
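Before moving on, it can help to confirm that every name you expect is actually present in the hosts file. The following is a minimal sketch, not part of any standard tool: the function name missing_hosts and its arguments are assumptions made for illustration. It prints each name that does not appear in the given hosts-format file.

```shell
# Sketch: report which cluster names are missing from a hosts-format file.
# missing_hosts is a hypothetical helper name, not a standard command.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the IP.
        if ! sed 's/#.*//' "$hosts_file" |
             awk -v n="$name" '
                 { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                 END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

On the manager, running missing_hosts /etc/hosts manager worker1 worker2 worker3 should print nothing if the file was edited as shown above.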

Step 2: Create a new user

We could operate the cluster with existing users, but a dedicated user keeps things simpler. Create a new user account with the same username on every machine.

To add a new user:

sudo adduser mpiuser

To make mpiuser a sudoer:

sudo usermod -aG sudo mpiuser

Step 3: Setting up SSH

The machines talk to each other over the network via SSH and share data via NFS. Follow the process below on the manager and on every worker node.

Install the SSH server:

sudo apt-get install openssh-server

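Switch to the new user:

su - mpiuser

Generate an RSA key pair, add the public key to authorized_keys in the ~/.ssh folder, and copy it to each worker node: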

ssh-keygen -t rsa
cd .ssh/
cat id_rsa.pub >> authorized_keys
ssh-copy-id worker1

For example:

mpiuser@tele-h81m-s138:~/.ssh$ ssh-copy-id worker2

Now you can connect to the worker nodes without entering a password:

ssh worker2

On each worker node, copy its key to the manager:

ssh-copy-id manager
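To confirm that key-based login works for every node without typing a password, a small check like the one below can be used. This is a sketch under assumptions: check_ssh is a hypothetical helper name, and the 5-second timeout is an arbitrary choice. It relies on the ssh option BatchMode=yes, which makes ssh fail instead of prompting.

```shell
# Sketch: test passwordless SSH login to each host without ever prompting.
# check_ssh is a hypothetical helper name, not a standard command.
# With BatchMode=yes, ssh fails rather than asking for a password, so a
# FAIL line usually means the key was not copied, the host key is unknown,
# or the host is unreachable.
check_ssh() {
    status=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done
    return $status
}
```

For example, run check_ssh worker1 worker2 worker3 as mpiuser on the manager, and check_ssh manager on each worker.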

Step 4: Setting up NFS

The manager shares a directory via NFS, and the workers mount it to exchange data.

NFS server on the manager node:

Install the required package:

$ sudo apt-get install nfs-kernel-server

We need to create a folder that will be shared across the network. In our case, it is named “cloud” and lives in the home directory of mpiuser (/home/mpiuser/cloud). To export the cloud directory, add an entry to /etc/exports:

sudo nano /etc/exports

Add

/home/mpiuser/cloud *(rw,sync,no_root_squash,no_subtree_check)

Here, * exports the folder to any host. To restrict access, replace * with the IP address of the machine that is allowed to mount the folder.

For example:

/home/mpiuser/cloud 172.20.36.121(rw,sync,no_root_squash,no_subtree_check)
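The exports format also accepts several clients on one line, separated by spaces, each followed by its own options in parentheses. As a rough sketch, the helper below builds such a line for a list of client addresses. The name exports_line is an assumption for illustration, and the options are simply the ones used in the entries above.

```shell
# Sketch: build one /etc/exports line that lists several clients explicitly.
# exports_line is a hypothetical helper name, not a standard command.
# Usage: exports_line DIRECTORY CLIENT...
exports_line() {
    dir=$1
    shift
    # Same options as in the entries shown above.
    opts="rw,sync,no_root_squash,no_subtree_check"
    line=$dir
    for client in "$@"; do
        line="$line $client($opts)"
    done
    echo "$line"
}
```

For instance, exports_line /home/mpiuser/cloud 172.20.36.153 172.20.36.143 prints a single entry that shares the folder with worker1 and worker2 only; the output still has to be added to /etc/exports by hand.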

After adding the entry, run the following:

$ sudo exportfs -a

Run this command every time /etc/exports is changed. It needs root privileges, hence sudo.

If required, restart the NFS server

$ sudo service nfs-kernel-server restart

NFS client on the worker nodes:

Install the required packages

$ sudo apt-get install nfs-common

Create a directory with the same name, “cloud”, in the home directory of mpiuser on the worker machine:

$ mkdir ~/cloud

Now mount the shared directory:

$ sudo mount -t nfs manager:/home/mpiuser/cloud ~/cloud

To check the mounted directories:

$ df -h

The shared directory should appear in the output as manager:/home/mpiuser/cloud, mounted on /home/mpiuser/cloud.

To make the mount permanent, so that you don’t have to mount the shared directory manually after every reboot, add an entry to the file systems table, /etc/fstab:

$ sudo nano /etc/fstab

Add

#MPI CLUSTER SETUP
manager:/home/mpiuser/cloud /home/mpiuser/cloud nfs

Step 5: Running MPI programs

Navigate to the NFS shared directory (“cloud” in our case) and create the source files there (or just copy the compiled executables into it). On Ubuntu, the mpicc compiler wrapper is provided by the libopenmpi-dev package. To compile a program, say mpi_hello.c, into an executable named mpi_hello, run:

$ mpicc -o mpi_hello mpi_hello.c

To run it only on the manager machine:

$ mpirun -np 2 ./mpi_hello

Here, -np sets the number of processes to launch (2 in this case).

To run the code across the cluster:

$ mpirun -hostfile my_host ./mpi_hello

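Here, the my_host file lists the hosts (by name or IP address) and how many processes may run on each one: slots is the number of processes Open MPI places on a host by default, and max_slots is the most it will allow there.

Sample hosts file: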

manager slots=4 max_slots=40
worker1 slots=4 max_slots=40
worker2  max_slots=40
worker3 slots=4 max_slots=40
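When every node should get the same number of slots, the hostfile can be generated instead of typed. The following is a small sketch; make_hostfile is a hypothetical helper name, and it only emits the slots field, so max_slots would still have to be added by hand if needed.

```shell
# Sketch: print hostfile lines giving every listed host the same slot count.
# make_hostfile is a hypothetical helper name, not part of Open MPI.
# Usage: make_hostfile SLOTS HOST...
make_hostfile() {
    slots=$1
    shift
    for host in "$@"; do
        echo "$host slots=$slots"
    done
}
```

For example, make_hostfile 4 manager worker1 worker2 worker3 > my_host writes a four-line hostfile that can be passed to mpirun -hostfile.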

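Alternatively, the hosts can be given directly on the command line. Open MPI uses the -host option for this (-hosts is the MPICH spelling):

$ mpirun -np 5 -host worker1,localhost ./mpi_hello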

Note: Hostnames can also be substituted with IP addresses.
