MPI – Distributed Computing made easy

  • Difficulty Level : Easy
  • Last Updated : 11 Jul, 2022

The Underlying Problem

Let's jump directly to some statistics:

  • Facebook currently has 1.5 billion monthly active users.
  • Google performs at least 1 trillion searches per year.
  • About 48 hours of video are uploaded to YouTube every minute.

With demand this high, a single system cannot handle all of the processing. This is where the need for distributed systems comes from.

What is Distributed Computing?

A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system so that users perceive the system as a single, integrated computing facility.

Take Google's web servers as an example. From the user's perspective, when they submit a search query, Google appears to be a single system. Behind the curtain, however, Google has built a large number of servers that are distributed (geographically and computationally) to return the result within a few seconds.

Advantages of Distributed Computing?

  • Highly efficient
  • Scalability
  • More tolerant of failures
  • High Availability

Let us look at an example where we save computational time by using distributed computing. 

For example, suppose we have an array a with n elements, such as a = [1, 2, 3, 4, 5, 6].

We want to sum up all the elements of the array and output it. Now, let us assume that there are 1020 elements in the array and the time to compute the sum is x.

Suppose we now divide the array into 3 parts, a1, a2 and a3, according to the remainder each element leaves when divided by 3:

  • a1 = { elements e of a where e mod 3 == 0 }
  • a2 = { elements e of a where e mod 3 == 1 }
  • a3 = { elements e of a where e mod 3 == 2 }

We will send these 3 arrays to 3 different processes, each of which computes the sum of its own array. Let's assume that, on average, each array has n/3 elements. The time taken by each process then drops to about x/3, ignoring the cost of sending the data. Since these processes run in parallel, the three "x/3" computations happen simultaneously, and the sum of each array is returned to the main process. In the end, we compute the final sum of a by adding up the individual sums of a1, a2, and a3.
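The split-and-combine idea above can be sketched in plain Python. This is only an illustration under some assumptions: the helper names `split_by_remainder` and `parallel_sum` are made up for this example, the remainder is taken modulo 3, and a thread pool stands in for the three separate processes. It shows the structure of the computation, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_by_remainder(values, parts):
    """Group values by their remainder modulo `parts` (hypothetical helper)."""
    return [[v for v in values if v % parts == k] for k in range(parts)]


def parallel_sum(values, parts=3):
    """Sum each group in its own worker, then combine the partial sums."""
    chunks = split_by_remainder(values, parts)
    # Assumption: threads stand in for the separate processes in the text.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # The "main process" adds up the partial results.
    return sum(partial_sums)


a = [1, 2, 3, 4, 5, 6]
print(split_by_remainder(a, 3))  # [[3, 6], [1, 4], [2, 5]]
print(parallel_sum(a))           # 21
```

With real processes on separate machines, the partial sums would travel over the network, which is the kind of communication a message-passing library takes care of.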

Thus, if the three processes run simultaneously, we are able to reduce the time from x to roughly x/3.

What is MPI?

Message Passing Interface (MPI) is a standardized and portable message-passing system developed for distributed and parallel computing. MPI provides parallel hardware vendors with a clearly defined base set of routines that can be efficiently implemented. As a result, hardware vendors can build upon this collection of standard low-level routines to create higher-level routines for the distributed-memory communication environment supplied with their parallel machines.

MPI gives users the flexibility of calling a set of routines from C, C++, Fortran, C#, Java, or Python. The advantages of MPI over older message-passing libraries are portability (because MPI has been implemented for almost every distributed-memory architecture) and speed (because each implementation is in principle optimized for the hardware on which it runs).

Even though MPI is available for multiple languages, Python is a popular choice because of its simplicity and the ease of writing code in it. We will now look at how to install MPI on Ubuntu 14.10.

Install MPI on Ubuntu 

1) Copy the following line into your terminal to install NumPy, a package for scientific computing in Python.

sudo apt-get install python-numpy

2) After the above step completes successfully, execute the following commands to update the package list and install pip.

sudo apt-get update
sudo apt-get -y install python-pip

3) Now, install the MPICH2 implementation of MPI along with its documentation.

sudo apt-get install libcr-dev mpich2 mpich2-doc

4) Enter the following command to install mpi4py, the MPI package for Python, using pip.

sudo pip install mpi4py

MPI is now installed. Sometimes, a problem might pop up while setting up the packages because the Python development tools are missing. You can install them using the following command:

sudo apt-get install python-dev
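One way to check the result of the steps above is to ask Python whether it can locate the installed packages. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name chosen for this example, and the check confirms only that the modules can be found, not that the MPI runtime itself works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two packages installed in the steps above.
for name in ("numpy", "mpi4py"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, repeating the corresponding install step is a reasonable first thing to try.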

MPI on Windows/Mac

Windows and Mac users can visit the following link, download the .zip file, unzip it, and execute it:

MPI framework

Tutorials

After installation, you can refer to the following documentation on using MPI from Python.

https://mpi4py.scipy.org/docs/usrman/tutorial.html 
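As a possible first program, here is a minimal sketch of the array-sum example from earlier, written with mpi4py. It is an illustration rather than the tutorial's own code: the helper `partial_sum` and the file name used below are made up for this example, every process builds the same small array instead of receiving it over the network, and each process takes the elements at positions rank, rank + size, and so on.

```python
def partial_sum(data, rank, size):
    """Sum the elements at positions rank, rank + size, rank + 2*size, ..."""
    return sum(data[rank::size])


def main():
    # Imported here so that partial_sum can be used without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # id of this process
    size = comm.Get_size()   # total number of processes

    data = [1, 2, 3, 4, 5, 6]  # assumption: every process builds the same array
    local = partial_sum(data, rank, size)

    # Combine the partial sums; only the root (rank 0) receives the total.
    total = comm.reduce(local, op=MPI.SUM, root=0)
    if rank == 0:
        print("total:", total)


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("mpi4py is not installed")
```

Assuming the script is saved as sum_mpi.py (a hypothetical name), it can be started on 3 processes with a launcher such as `mpiexec -n 3 python sum_mpi.py`, in which case rank 0 should print the total of the array.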


About the Author: Anurag Mishra is currently a 3rd-year B.Tech student, an avid software follower, and a full stack web developer. His keen interest lies in web development, NLP, and networking.
