Saturday, 30 July 2016

Creating a Raspberry Pi 3 Cluster - "Supercomputer", for parallel computing.

In this quick article I will show you how to create your own Raspberry Pi cluster for parallel computing using an MPI (Message Passing Interface) library. This is a nice summer project now that I'm free from my Master's duties until September, and I have been wanting to build this for a while. Thanks to the low price of the Raspberry Pi, we can now build this without spending too much. See below for the list of items you will need and the price of the whole kit with 4 Pis.

The main decision behind this architecture is which operating system and programming language to use to implement parallel computing. From my experience with HPC (High Performance Computing) and SGE (Sun Grid Engine), the best way to achieve this is with either Open MPI or MPICH3. Both are free, open-source, portable and very popular MPI implementations. As for the programming language, there are several alternatives: C++, C#, Python, etc. For the operating system I could use WinIoT for the RPi or simply a Linux distro. I'm a geek so I went for the latter, as I do like interacting with command line interfaces; there is some beauty there that I can't explain :).
So my decision is to use a Linux distribution as the OS. In this case I'm choosing Raspbian Jessie, which comes with some goodies installed by default and will allow me to install all the components I need for my little project.

The second decision is the programming language. In this case I'm choosing Python as I'm very familiar with it, it has plenty of libraries available and it integrates nicely with MPI via the mpi4py library.

The other factor to take into account is that I have two different models of RPi, and I need to make sure that whatever I install works well on both. I wouldn't be able to install WinIoT on my old RPi Model A.

Building the cluster of Rpi's

The material that you will need is listed below with links included:

4 x Rpi 3 model B = 4 x £30 = £120
4 x 16GB microSD card (Kingston) = 4 x £4.84 = £19.36
4 x USB to Micro USB Cable 0.5m = 4 x £0.88 = £3.52
1 x 5 port desktop switch = 1 x £6.49 = £6.49
5 x Ethernet patch cable 0.3m = 5 x £2.90 = £14.50
1 x USB Hub = 1 x £2.53 = £2.53

Total =  £192.38 (without considering delivery)

*This is a common configuration but you can start with just 2 or 3 RPi's and keep adding hardware later on.


Once all the components are assembled using the stackable case you should have something like the image below:


Below the image of my cluster up and running (see configuration section for more):


Configuring your cluster of RPi's


The idea is to configure one of the RPis, then clone its SD card and plug the copy into the next RPi. Here is a summary of the steps to get you up and running:

Installing the OS

  • Download the Raspbian Jessie image. I had some trouble downloading the zip file so I used the torrent link instead. See the version used below (4.4).
  • Once the OS image is downloaded, burn it to the SD card using Win32DiskImager:

Plug the microSD card into the first Pi (PiController in my case) and power it up. Plug in the Ethernet cable and head back to your computer to access the Pi remotely.

Open a command prompt (I'm using Win10 as my main computer) and type "ping raspberrypi". By default every RPi is named raspberrypi, so it is easy to spot on your network. The ping output shows the IP address of the device. Save this IP address for later as we will use it in PuTTY.
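If you prefer to look up the address from a script rather than from ping, the same name lookup can be sketched in Python. This is only an illustration: the function name find_pi and the injectable resolver parameter are my own choices, and it assumes your network resolves the default host name raspberrypi at all.

```python
import socket


def find_pi(hostname="raspberrypi", resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if it cannot be resolved.

    `resolver` is injectable (an assumption of this sketch) so the function
    can be exercised without touching the network.
    """
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None
```

Calling find_pi() on the main computer returns the address to type into PuTTY, or None when the name is not known on the network, in which case a network scanner is the fallback.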

Launch PuTTY and type the IP address of the RaspberryPi:

You should see something similar to the image below:

Log in as: pi with password: raspberry (every RPi uses the same default login/password)

Type: sudo raspi-config to configure our device:
  1. Go to Expand File System
  2. Go to Advanced Options -> HostName -> set it to PiController
  3. Go to Advanced Options -> MemorySplit -> set it to 16.
  4. Go to Advanced Options -> SSH -> Enable.
  5. Finish and leave the configuration.

Now we can start installing MPICH3 and MPI4PY. Note that these steps take a while (more than 4 hours), so set aside some free time beforehand:

Installing MPICH3


Follow the steps below to install version 3.2 of MPICH:

Once you've got everything installed you should see something like the image below:

Installing MPI4PY


Follow the steps below to install version 2.0 of MPI4PY:

Once installed you should see something like the image below:
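Before cloning the card, it can be worth confirming that both pieces are actually visible to the pi user. The sketch below is not part of the installation steps; it only checks that an mpiexec executable is on the PATH and that Python can locate the mpi4py package. The function name check_install is hypothetical.

```python
import importlib.util
import shutil


def check_install():
    """Report where mpiexec lives and whether mpi4py can be imported."""
    return {
        # Full path to the executable, or None if it is not on the PATH.
        "mpiexec": shutil.which("mpiexec"),
        # True if Python can locate the mpi4py package.
        "mpi4py": importlib.util.find_spec("mpi4py") is not None,
    }


if __name__ == "__main__":
    for name, status in check_install().items():
        print(name, "->", status)
```

If mpiexec shows up as None, the MPICH bin directory is probably missing from the PATH of the current user.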

Now we have finished configuring the first RPi. Believe it or not, if you reach this step and everything is working you should be proud of it. Next we have to clone this SD card and put the copies into the other RPis.

Preparing the other RPi's


As mentioned in the step above, bring the SD card to your main computer and save its content as an image using Win32DiskImager. Now write this new image to the other SD cards. You should now have 4 SD cards with the same image. Because all 4 cards are clones, they share the same host name, so my advice is to plug in one RPi at a time and change the host name of each newly added RPi, e.g. pi01, pi02, pi03, etc.

Do the following for every new RPi added into the network:

pi01:

Scan the network with a network scanner to find the IP address of the newly added device. Once found, use PuTTY to access it and follow the steps below to set it up:

Type: sudo raspi-config to configure our device:
  1. Go to Expand File System
  2. Go to Advanced Options -> HostName -> set it to pi01
  3. Go to Advanced Options -> MemorySplit -> set it to 16.
  4. Go to Advanced Options -> SSH -> Enable.
  5. Finish and leave the configuration.
  6. sudo reboot.

Do the same for pi02 and pi03. Note that you can name your RPis the way you want.

Once done you should be able to see all 4 of them using PuTTY:

Once completed, each RPi will have its own IP. We now need to store each IP address in a host file, also known as a machinefile. This file lists the hosts on which to start the processes.

Go to your first RPi and type:

nano machinefile

and add the following IP addresses (note that you will have to add your own):

This file will be used by MPICH3 to communicate and send/receive messages between the various nodes.
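As an illustration of the layout, a machinefile is just a plain text file with one host per line. The sketch below generates that text from a list of addresses. The four IP addresses are hypothetical placeholders, not the ones from my network, and render_machinefile is a helper name made up for this example.

```python
def render_machinefile(hosts):
    """Return machinefile text: one host (IP address or name) per line."""
    return "".join(f"{host}\n" for host in hosts)


# Hypothetical addresses; replace them with the ones on your own network.
NODES = ["192.168.1.74", "192.168.1.75", "192.168.1.76", "192.168.1.77"]

if __name__ == "__main__":
    # Print the content so it can be reviewed or redirected into a file.
    print(render_machinefile(NODES), end="")
```

Redirecting the output into a file named machinefile gives the same result as typing the addresses into nano by hand.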

Configuring SSH keys for each RPi


Now we need to be able to run commands on each RPi without typing a user name and password. To do this we will generate an SSH key pair on each RPi and then add each public key to the authorized_keys file of every other device. This is how MPI will be able to talk to each device without worrying about credentials. This process is a bit tedious, but once completed you will be able to run MPI without problems.


Run the following commands from the first Pi (PiController):

When running ssh-keygen, just hit Enter (if you don't want to add a passphrase) and the RSA key will be generated for you automatically.

Now we have configured the link from PiController to every single device, but we still need to configure the other way around. So you will have to run the following commands from every individual device:

Open the authorized_keys file on each device and you will see the additional keys there. Each authorized_keys file should contain 3 keys (as stated in the architecture diagram above).
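A quick way to double-check this step is to count the public keys in the file. The sketch below is a simplification: it assumes every entry starts directly with the key type (no leading options), and the names KEY_TYPES and count_public_keys are my own. It also helps to catch the mistake of copying a private key file instead of the .pub file, because a private key does not start with any of these prefixes and is counted as 0.

```python
from pathlib import Path

# Prefixes of common OpenSSH public key types (an assumption of this sketch).
KEY_TYPES = ("ssh-rsa", "ssh-ed25519", "ecdsa-sha2-")


def count_public_keys(text):
    """Count the lines in text that look like OpenSSH public keys."""
    count = 0
    for line in text.splitlines():
        # Blank lines, comments and private key material are skipped.
        if line.strip().startswith(KEY_TYPES):
            count += 1
    return count


if __name__ == "__main__":
    path = Path.home() / ".ssh" / "authorized_keys"
    if path.exists():
        print(path, "holds", count_public_keys(path.read_text()), "public key(s)")
    else:
        print(path, "does not exist yet")
```

On a 4-node cluster set up as described, running this on any Pi should report 3 keys.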

Now the system is ready for testing.

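Note that if a Pi's IP address changes, you will have to update the machinefile and connect to the new address once over SSH so that its host key is accepted in ~/.ssh/known_hosts. The key pairs themselves are not tied to an IP address and do not need to be regenerated.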

Testing the cluster


At this point I will just include a small example for you to test that the cluster works as expected. Later on I will publish a more complex scenario with a refined configuration to maximise the power of the cluster.

If everything is configured correctly, the following command should run without errors:

mpiexec -f machinefile -n 4 hostname


You can see that each device has replied and every key works without problems.

Now run the following command to test a helloworld example:

mpiexec -f machinefile -n 4 python /home/pi/mpi4py-2.0.0/demo/helloworld.py

You should see something like the image below:

Now our system is ready to take any parallel computing application we want to develop.
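For reference, a hello-world script of this kind can be very small. The sketch below is my own minimal version, not a copy of the demo file shipped with mpi4py. It uses MPI.COMM_WORLD to read the rank and size and MPI.Get_processor_name() for the host name. The fallback to a single process when mpi4py cannot be imported is an addition of this sketch, so the script also runs on a machine without MPI.

```python
def greeting(rank, size, name):
    """Build the message printed by each process."""
    return "Hello, World! I am process %d of %d on %s." % (rank, size, name)


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        # Fallback (assumption of this sketch): behave like a single process.
        import platform
        print(greeting(0, 1, platform.node()))
        return
    comm = MPI.COMM_WORLD
    print(greeting(comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))


if __name__ == "__main__":
    main()
```

Saved under a name of your choice, for example hello.py (a hypothetical file name), it can be launched with the same pattern as above: mpiexec -f machinefile -n 4 python hello.py. Each process prints its own rank, and the lines may arrive in any order.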

Watch this space for more!

Next steps


I will be creating more complex scenarios and pushing the architecture to test its limits. More soon! Give it a go and let me know if you face any problems during the setup.

Jordi

41 comments:

  1. I'll be interested to see how you get on with this, especially what kind of performance you get.

    Replies
    1. Hi Herbert,

      Sure, I'm still in the coding phase testing the cluster. Soon I'll share my results.

      Cheers,
      Jordi

  2. Great article. I had built one using 70 Raspberry Pis as my final year college project but they were made from Raspberry pi B+ models and we faced a lot of issues: sourcedexter.com/6-common-errors-when-building-a-raspberry-pi-supercomputer/ . However, we succeeded in building it and also ran a multi-document text summarization algorithm on it.

  3. Thank you for this tutorial.

    I'm having a problem with the line

    scp 192.168.1.74:/home/pi/.ssh/PiController

    when I try to use that command I end up with

    usage: scp [-12346BCpqrv] [-c cipher] etc. etc.

    Any advice?

    Thanks!

    Replies
    1. I forgot the . at the end, it works when you include it.

      scp 192.168.1.74:/home/pi/.ssh/PiController .

  4. Good tutorial. Hi, I have a problem: when I type the command ping raspberrypi, it isn't recognized. Any advice?

    Replies
    1. you mean that it doesn't recognise the ping command? you might have to make sure your system32 folder is part of your path.

  5. The article you have shared here very awesome. I really like and appreciated your work. I read deeply your article, the points you have mentioned in this article are useful

  6. Hi Jordi, very interested in this and keen to get started. The goal is less about parallel computing and more about having GNS3 on a cluster for my CCNA studies. I found this https://www.gns3.com/discussions/the-worlds-first-gns3-beowulf-cl and have had all kinds of dramas getting MPICH and MPI4PY installed. So I'll give this a go. If I get GNS 3 installed as a step just prior to the image creation/duplication.. would you expect any/many issues? I've posted questions on Jason's forum but alas he seemed to have dropped off the face of the earth in 2015 lol

    Replies
    1. Hi Michael,

      I don't expect many issues but you'll have to check the version and its dependencies as GNS 3 might require a different version that I used during my testing.

      Cheers,
      Jordi

  7. Hi, I get to the step where I install MPI4PY and this throws a spanner in the works: python setup.py build. It gives:
    Assembler messages:
    Fatal error: can't create build/temp.linux-armv7l-2.7/src/MPI.o: Permission denied
    error: command '/home/rpimpi/mpi-install/bin/mpicc' failed with exit status 1

    Replies
    1. weird, are you running as root? Just make sure that you have permissions to write. It sounds like you don't have enough permissions there.

      cheers,
      Jordi

    2. Hi dude, yeah I just chowned the mpi4py-2.0.0 directory and it all went swimmingly.

      Next step is to get GNS3 on one, then continue the rest.. then copy the image to the others and see how we go

  8. Hello Jordi at what point should I plug my other pis into my network switch ?

    Replies
    1. Once you have cloned the SD cards, you can just keep plugging them in one by one and change the IP address on each one prior to connecting the next one. That should do.

      Cheers,
      Jordi

  9. I know this is kind of unrelated to the mpich thing [got that bit working]. Does anyone know how/where to get qemu-kvm for armhf? I've tried many places, and with sudo apt-get install qemu-kvm [prereq for gns3-server] I get:

    Package qemu-kvm is not available, but is referred to by another package.
    This may mean that the package is missing, has been obsoleted, or
    is only available from another source

    E: Package 'qemu-kvm' has no installation candidate

    Any ideas?

    Replies
    1. I don't know if this package is available for ARM. Have you tried kvmtool? It's an alternative: https://github.com/clearlinux/kvmtool

  10. Hi, I've got a small problem. Everything works fine except this: when I plug a USB disk or another device into node 2 or 3, the master node doesn't see it. Is there a fix for it?
    Sorry for my English, but I live in Switzerland and I never learned it...
    Thanks for your response.

    Replies
    1. Hi Cristinat,

      this might be a problem with that USB device; you might have to enable USB from the Pi.

      Cheers,
      Jordi

  11. can i use this to run SETI@home (BOINC).
    Does the BOINC Client use all 4 Pis?

    Replies
    1. Hi Peter,

      I don't know if BOINC is able to get the cluster altogether. You might have to run 4 individual BOINCs and then find a way to make the hardware available in some way.

      Cheers,
      Jordi

  13. Hello I've been going over this tutorial with 3 of my Raspberry Pis. I'm having an issue with authorized_keys. I have created them for each pi and each pi knows the key for the others. I've confirmed the keys are identical just in case.

    However, when checking if the configuration was successful with mpi, PiController works, but Pi01 and Pi02 are access denied.

    Was wondering if something was missing or perhaps I did something wrong?

    Replies
    1. Never mind. I realized I used the PiController, Pi01 and Pi02 files, not the .pub files

    2. Cool that you could solve it. Usually it's something like that as it's quite straightforward.

      Cheers,
      Jordi

  14. For mpiexec -f machinefile -n 4 hostname I get...

    PiController
    Pi01

    But Pi02 is never reached and eventually times out. However I can ssh to it through PiController and Pi02 can ssh to PiController.

    Any help in regards to this?

  15. Hey, This is a great tutorial, absolutely love it, thanks man

    Replies
    1. Thanks, really happy you found it useful!

      Cheers,
      Jordi

  16. Hello Jordi,
    First of all, congrats for this post !
    I have a project with a friend: building a cluster of computers with an x86 architecture, and we'd like to use the RPi as the frontend. Do you think this might work if I install Debian with the mpich3 and openmpi packages on the computers?

    Thank you in advance for your answer.
    Best regards

  17. Hello Jordi,
    Congrats for your post !
    We have a project with a friend. To build a cluster of computers but we would like the raspberry pi 3 as frontend. Do you think it might be possible ?
    If we install the two packages mpich3 and openmpi ?

    Thank you in advance for your answer !
    Cyril

    Replies
    1. Hi Cyril,
      Yes, it should work.

      Cheers,
      Jordi

  18. Hi, great article and would love to put something like this together. Wanted to point out that the ssh keys aren't tied to IPs so you can reuse the keys even if IP changes. If it didn't work for you, it may have been due to the Host ssh keys changing. This happens when you reinstall the OS, but the individual user keys shouldn't be affected. Just make sure to clear out ~pi/.ssh/known_hosts file and you'll be good to go.

  19. Hi
    Really an excellent tutorial. Thank you very much.

  20. Hi
    Thank you for this tutorial. I followed every step and everything worked fine until the final cluster test. When i try to execute mpiexec -f machinefile -n 4 hostname i'm getting following error. Do you have any idea what i did wrong?
    [mpiexec@PiController] HYDU_parse_hostfile (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:319): unable to open host file: machinefile
    [mpiexec@PiController] mfile_fn (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:336): error parsing hostfile
    [mpiexec@PiController] match_arg (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:152): match handler returned error
    [mpiexec@PiController] HYDU_parse_array (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:174): argument matching returned error
    [mpiexec@PiController] parse_args (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:1596): error parsing input array
    [mpiexec@PiController] HYD_uii_mpx_get_parameters (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:1648): unable to parse user arguments
    [mpiexec@PiController] main (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/mpiexec.c:153): error parsing parameters

    Thanks in advance
    Salvi

    Replies
    1. Hi Salvi,
      Are you running this with elevated permissions?

      Cheers,
      Jordi

  21. Hello Jordi
    Congratulations a great article, install the cluster without problems.
    I would like to know if the c and c ++ libraries can be integrated into the cluster
    You could indicate me where to download other examples to run them in the cluster
    Thank you

    Replies
    1. Hi Armando, I don't know which libraries are you referring to.

      Cheers,
      Jordi
