Creating a Raspberry Pi 3 Cluster - "Supercomputer", for parallel computing.
In this quick article I will show you how to create your own Raspberry Pi cluster for parallel computing via the MPI (Message Passing Interface) library. This is a nice summer project now that I'm free from my Master's duties until September, and I have been wanting to build this for a while. Thanks to the low price of the Raspberry Pi, we can now build this without spending too much. See below for the list of items you will need and the price of the whole kit with 4 Pis.
The main decision behind this architecture is which operating system and programming language to use to implement parallel computing. Based on my experience with HPC (High Performance Computing) and SGE (Sun Grid Engine), the best way to achieve this is to use either OpenMPI or MPICH3. Both are free, open-source, portable and very popular MPI implementations. As for the programming language, we have several alternatives: C++, C#, Python, etc. For the OS, I could use WinIoT for the RPi or simply a Linux distro. I'm a geek so I went for the latter, as I do like interacting with command line interfaces; there is some beauty there that I can't explain :).
So my decision is to use a Linux distribution as the OS. In this case I'm choosing Raspbian Jessie, which comes with some goodies installed by default and will allow me to install all the components I need for my little project.
The second decision is the programming language. In this case I'm choosing Python, as I'm very familiar with it, it has plenty of libraries available and it integrates nicely with MPI via the mpi4py library.
The other factor to take into account is that I have two different models of RPi, and I need to make sure that whatever I install will work well on both. I won't be able to install WinIoT on my old RPi model A.
Building the cluster of Rpi's
The material that you will need is listed below with links included:
4 x Rpi 3 model B = 4 x £30 = £120
4 x 16GB microSD card (Kingston) = 4 x £4.84 = £19.36
4 x USB to Micro USB Cable 0.5m = 4 x £0.88 = £3.52
2 x Multi-Pi Stackable Raspberry Pi Case = 2 x £13 = £26
1 x 5 port desktop switch = 1 x £6.49 = £6.49
5 x Ethernet patch cable 0.3m = 5 x £2.90 = £14.50
1 x USB Hub = 1 x £2.53 = £2.53
Total = £192.40 (without considering delivery)
*This is a common configuration but you can start with just 2 or 3 RPi's and keep adding hardware later on.
Once all the components are assembled using the stackable case you should have something like the image below:
Below is an image of my cluster up and running (see the configuration section for more):
Configuring your cluster of RPi's
The idea is to configure one of the RPi's and then just clone the SD card and plug the copy into the next RPi. Here you'll find a summary of the steps to get you up and running:
Installing the OS
- Download Raspbian Jessie image. I had some trouble downloading the zip file so I used the torrent link instead. See the version used below (4.4)
- Download Win32DiskImager installer. We will use this to burn Raspbian image to our SD card.
- Download PuTTY SSH client to connect to our Rpi's.
Insert the microSD card into the first Pi (PiController in my case) and power it up. Plug in the Ethernet cable and head back to your computer to access the Pi remotely.
Open a command prompt (I'm using Win10 as my main computer) and type "ping raspberrypi". By default every RPi is named raspberrypi, so it is easy to spot on your network. The ping output shows the IP address of the device. Save this IP address for later, as we will use it in PuTTY.
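If you prefer to look the address up from a script rather than reading it off the ping output, the same name lookup can be sketched in Python. This is only an illustration: resolve_host is a helper name made up for this example, and it relies on your network being able to resolve the default raspberrypi name, just like ping does.

```python
import socket


def resolve_host(name):
    """Return the IPv4 address for a host name, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        # The name is unknown to the resolver (same situation as a failed ping).
        return None
```

Calling resolve_host("raspberrypi") from your main computer should return the same address that ping reports, or None if the name is not visible on your network, in which case your router's device list is the next place to look.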
Launch PuTTY and type the IP address of the RaspberryPi:
You should see something similar to the image below:
Log in as pi with the password raspberry (every RPi uses the same default login/password).
Type: sudo raspi-config to configure our device:
- Go to Expand File System
- Go to Advanced Options -> HostName -> set it to PiController
- Go to Advanced Options -> MemorySplit -> set it to 16.
- Go to Advanced Options -> SSH -> Enable.
- Finish and leave the configuration.
Now we can start installing MPICH3 and MPI4PY. Notice that these steps take a while (> 4h) so arrange some free time for this beforehand:
Installing MPICH3
Follow the steps below to install version 3.2 of MPICH:
Once you've got everything installed you should see something like the image below:
Installing MPI4PY
Follow the steps below to install version 2.0 of MPI4PY:
Once installed you should see something like the image below:
Now we have finished configuring the first RPi. Believe it or not, if you have reached this step and everything is working you should be proud of it. Next we have to clone this SD card and place the copies into the other RPi's.
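Before cloning the card it can be worth confirming that both pieces are really in place, since a mistake here gets copied to every node. The sketch below is not part of the original setup; find_on_path and has_module are helper names invented for this example. It only checks that an mpiexec executable is on the PATH and that the Python interpreter running the script can import mpi4py, so run it with the same python you built mpi4py for.

```python
import os


def find_on_path(program):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def has_module(name):
    """Return True if this Python interpreter can import the named module."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    mpiexec = find_on_path("mpiexec")
    print("mpiexec: {}".format(mpiexec if mpiexec else "NOT FOUND on PATH"))
    print("mpi4py:  {}".format("importable" if has_module("mpi4py") else "NOT importable"))
```

If mpiexec is reported as not found even though the build finished, the usual suspect is that the MPICH install directory has not been added to the PATH for the current shell.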
Preparing the other RPi's
As mentioned in the step above, bring the SD card to your main computer and save its content as an image using Win32DiskImager. Now write this new image to the other SD cards. You should now have 4 SD cards with the same image. Since all 4 cards are clones, my advice is to plug in each RPi individually and change the host name of every newly added RPi on the network, e.g. pi01, pi02, pi03, etc.
Do the following for every new RPi added into the network:
pi01:
Scan the network for the newly added device to find its IP address using a network scanner. Once found, use PuTTY to access it and use the commands below to set it up:
Type: sudo raspi-config to configure our device:
- Go to Expand File System
- Go to Advanced Options -> HostName -> set it to pi01
- Go to Advanced Options -> MemorySplit -> set it to 16.
- Go to Advanced Options -> SSH -> Enable.
- Finish and leave the configuration.
- sudo reboot.
Once done you should be able to see all 4 of them using PuTTY:
Once completed, each RPi will have its own IP address. We now need to store each IP address in a host file, also known as a machinefile. This file lists the hosts on which to start the processes.
Go to your first RPi and type:
nano machinefile
and add the following IP addresses: (Note that you will have to add your own):
MPICH3 will use this file to communicate and to send/receive messages between the various nodes.
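A typo in the machinefile tends to show up later as a confusing mpiexec error, so a small check can save time. The sketch below assumes Python 3 and a machinefile that holds one IP address per line, optionally followed by ":n" for the number of processes and by "#" comments. parse_machinefile is a name invented for this example, and the addresses in the sample are placeholders, not the ones from my network.

```python
import ipaddress


def parse_machinefile(text):
    """Return the list of host addresses in a machinefile, validating each one."""
    hosts = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not entry:
            continue
        host = entry.split(":", 1)[0]  # ignore an optional ":n" process count
        ipaddress.ip_address(host)  # raises ValueError for a malformed address
        hosts.append(host)
    return hosts


if __name__ == "__main__":
    # Placeholder addresses for illustration only; use your own.
    sample = "192.168.1.101\n192.168.1.102\n192.168.1.103\n192.168.1.104\n"
    print(parse_machinefile(sample))
```

To check the real file, read it with open("machinefile").read() and pass the text to parse_machinefile. A machinefile may also list host names instead of addresses; this sketch would reject those, so relax the validation if you use names.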
Configuring SSH keys for each RPi
Now we need to be able to command each RPi without typing a user name and password. To do this we have to generate an SSH key pair on each RPi and then add each public key to the authorized_keys file of every other device. This is how MPI is able to talk to each device without worrying about credentials. The process is a bit tedious, but once completed you will be able to run MPI without problems.
Run the following commands from the first Pi (PiController):
When running the ssh-keygen just hit enter (if you don't want to add specific passphrase) and the RSA key will be generated for you automatically.
Now we have configured the link from PiController to every single device, but we still need to configure the other way around. So you will have to run the following commands from every individual device:
Open the authorized_keys file and you will see the additional keys there. The authorized_keys file on each device should contain 3 keys (as stated in the architecture diagram above).
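Counting the keys by eye is error-prone, and a common slip is to append the private key file instead of the .pub file. As a rough check, the sketch below counts the entries in authorized_keys and flags anything that looks like a private key. key_lines and looks_like_private_key are names made up for this example, and the check is only a heuristic based on the "PRIVATE KEY" header that private key files carry.

```python
import os


def key_lines(text):
    """Return the non-empty, non-comment lines of an authorized_keys file."""
    lines = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def looks_like_private_key(text):
    """Return True if the text contains a private key header."""
    return "PRIVATE KEY-----" in text


if __name__ == "__main__":
    path = os.path.expanduser("~/.ssh/authorized_keys")
    if os.path.isfile(path):
        with open(path) as handle:
            content = handle.read()
        print("{} key(s) found in {}".format(len(key_lines(content)), path))
        if looks_like_private_key(content):
            print("Warning: this file seems to contain a private key; use the .pub files instead.")
    else:
        print("No authorized_keys file at {}".format(path))
```

Run it on each device; with the 4-node setup described here, each one should report 3 keys.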
Now the system is ready for testing.
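Note that the keys themselves are not tied to IP addresses, so they stay valid if an IP address changes. You will, however, need to update the machinefile with the new addresses, and you may need to refresh the entries in ~/.ssh/known_hosts (for example by connecting once with ssh to each new address).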
Testing the cluster
At this point I will just include a small example for you to test that the cluster works as expected. Later on I will publish a more complex scenario with a refined configuration to maximise the power of the cluster.
If everything is configured correctly, the following command should run without errors:
mpiexec -f machinefile -n 4 hostname
You can see that each device has replied and every key is used without problems.
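When a node does not answer, it helps to know exactly which one is missing. The sketch below, which is my own illustration rather than part of the MPICH tooling, runs the same hostname command and compares the replies with the host names you expect. The names run_hostname_test and missing_hosts are invented for this example, and the comparison ignores case because host names are case-insensitive.

```python
import subprocess


def run_hostname_test(machinefile="machinefile", processes=4):
    """Run 'hostname' on the cluster through mpiexec and return its output."""
    command = ["mpiexec", "-f", machinefile, "-n", str(processes), "hostname"]
    return subprocess.check_output(command, universal_newlines=True)


def missing_hosts(expected, output):
    """Return the expected host names that do not appear in the mpiexec output."""
    seen = {line.strip().lower() for line in output.splitlines() if line.strip()}
    return [host for host in expected if host.lower() not in seen]
```

On the controller you could then call missing_hosts(["PiController", "pi01", "pi02", "pi03"], run_hostname_test()), replacing the names with your own. An empty list means every node replied; any name returned is the node to ping and ssh into first.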
Now run the following command to test a helloworld example:
mpiexec -f machinefile -n 4 python /home/pi/mpi4py-2.0.0/demo/helloworld.py
You should see something like the image below:
Now our system is ready to take any parallel computing application we want to develop.
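As a starting point for your own programs, here is a minimal sketch in the spirit of the helloworld demo. It is not the demo file itself: the greeting function and the fallback branch are my additions, so that the script still runs as a single process on a machine without mpi4py. The mpi4py calls used (MPI.COMM_WORLD, Get_rank, Get_size, MPI.Get_processor_name) are the standard ones.

```python
import platform

try:
    from mpi4py import MPI
except ImportError:
    MPI = None  # lets the script still run on a machine without mpi4py


def greeting(rank, size, name):
    """Build the message each process prints."""
    return "Hello, World! I am process {} of {} on {}.".format(rank, size, name)


def main():
    if MPI is None:
        # Fallback: behave like a job with a single process.
        rank, size, name = 0, 1, platform.node()
    else:
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()  # index of this process within the job
        size = comm.Get_size()  # total number of processes in the job
        name = MPI.Get_processor_name()  # host this process runs on
    print(greeting(rank, size, name))


if __name__ == "__main__":
    main()
```

Saved as, say, hello.py (a hypothetical file name) in the same path on every node, it can be launched the same way as the demo: mpiexec -f machinefile -n 4 python /home/pi/hello.py. Each process should print one line with its own rank and host name.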
Watch this space for more!
Next steps
I will be creating more complex scenarios and squeezing the architecture to test its limits. More soon! Give it a go and let me know if you run into any problems during the setup.
Jordi
I'll be interested to see how you get on with this, especially what kind of performance you get.
Hi Herbert,
Sure, I'm still in the coding phase testing the cluster. Soon I'll share my results.
Cheers,
Jordi
Great article. I had built one using 70 Raspberry Pis as my final year college project but they were made from Raspberry pi B+ models and we faced a lot of issues: sourcedexter.com/6-common-errors-when-building-a-raspberry-pi-supercomputer/ . However, we succeeded in building it and also ran a multi-document text summarization algorithm on it.
Thank you for this tutorial.
I'm having a problem with the line
scp 192.168.1.74:/home/pi/.ssh/PiController
when I try to use that command I end up with
usage: scp [-12346BCpqrv] [-c cipher] etc. etc.
Any advice?
Thanks!
I forgot the . at the end, it works when you include it.
scp 192.168.1.74:/home/pi/.ssh/PiController .
Good tutorial, hi i got a problem when i write the command ping raspberrypi, and it doesn't recognized this, any advice
you mean that it doesn't recognise the ping command? you might have to make sure your system32 folder is part of your path.
Many thanks. much appreciated.
Hi Jordi, very interested in this and keen to get started. The goal is less about parallel computing and more about having GNS3 on a cluster for my CCNA studies. I found this https://www.gns3.com/discussions/the-worlds-first-gns3-beowulf-cl and have had all kinds of dramas getting MPICH and MPI4PY installed. So I'll give this a go. If I get GNS 3 installed as a step just prior to the image creation/duplication.. would you expect any/many issues? I've posted questions on Jason's forum but alas he seemed to have dropped off the face of the earth in 2015 lol
Hi Michael,
I don't expect many issues but you'll have to check the version and its dependencies as GNS 3 might require a different version that I used during my testing.
Cheers,
Jordi
Hi, I get to the step where I install MPI4PY and this throws a spanner in the works: python setup.py build. It gives:
Assembler messages:
Fatal error: can't create build/temp.linux-armv7l-2.7/src/MPI.o: Permission denied
error: command '/home/rpimpi/mpi-install/bin/mpicc' failed with exit status 1
weird, are you running as root? Just make sure that you have permissions to write. It sounds like you don't have enough permissions there.
cheers,
Jordi
Hi dude, yeah I just chowned the mpi4py-2.0.0 directory and it all went swimmingly.
Next step is to get GNS3 on one, then continue the rest.. then copy the image to the others and see how we go
Excellent great to hear!
Hello Jordi at what point should I plug my other pis into my network switch ?
Once you have cloned the SD cards, you can just keep plugging them in one by one and change the IP address on each one prior to connecting the next one. That should do.
Cheers,
Jordi
I know this is kind of unrelated to the mpich thing [got that bit working].. Anyone know how/where to get qemu-kvm for armhf from? I've tried many places and sudo apt-get instal qemu-kvm [prereq for gns3-server] i get:
Package qemu-kvm is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package 'qemu-kvm' has no installation candidate
Any ideas?
I don't know if this package is available for ARM. Have you tried kvmtool? It's an alternative: https://github.com/clearlinux/kvmtool
hi.. i got a small problem. all works fine exept this : when i plug a usb disk or other on node 2,3 .. master node dosen't will see it. is there a fix for it ?
sorry for my english but i live in switzerland and i never has learn it...
thanks for response
Hi Cristinat,
this might be a problem with that USB, you might have to enable the USB from the Pi.
Cheers,
Jordi
can i use this to run SETI@home (BOINC).
Does the BOINC Client use all 4 Pis?
Hi Peter,
I don't know if BOINC is able to get the cluster altogether. You might have to run 4 individual BOINCs and then find a way to make the hardware available in some way.
Cheers,
Jordi
Hello I've been going over this tutorial with 3 of my Raspberry Pis. I'm having an issue with authorized_keys. I have created them for each pi and each pi knows the key for the others. I've confirmed the keys are identical just in case.
However, when checking if the configuration was successful with mpi, PiController works, but Pi01 and Pi02 are access denied.
Was wondering if something was missing or perhaps I did something wrong?
Never mind. I realized I used the PiController, Pi01 and Pi02 files, not the .pub files
Cool that you could solve it. Usually it's something like that as it's quite straightforward.
Cheers,
Jordi
For mpiexec -f machinefile -n 4 hostname I get...
PiController
Pi01
But Pi02 is never reached and eventually times out. However I can ssh to it through PiController and Pi02 can ssh to PiController.
Any help in regards to this?
can you ping Pi02?
Hey, This is a great tutorial, absolutely love it, thanks man
Thanks, really happy you found it useful!
Cheers,
Jordi
Hello Jordi,
First of all, congrats for this post !
I have a project with a friend. Building a cluster of computer with an x86 architecture and we'd like the Rpi as frontend. Do you think this might work ? If I install debian with mpich3 and openmpi packages on the computers ?
Thank you in advance for your answer.
Best regards
Hello Jordi,
Congrats for your post !
We have a project with a friend. To build a cluster of computers but we would like the raspberry pi 3 as frontend. Do you think it might be possible ?
If we install the two packages mpich3 and openmpi ?
Thank you in advance for your answer !
Cyril
Hi Cyril,
Yes, it should work.
Cheers,
Jordi
Hi, great article and would love to put something like this together. Wanted to point out that the ssh keys aren't tied to IPs so you can reuse the keys even if IP changes. If it didn't work for you, it may have been due to the Host ssh keys changing. This happens when you reinstall the OS, but the individual user keys shouldn't be affected. Just make sure to clear out ~pi/.ssh/known_hosts file and you'll be good to go.
Hi
Really an excellent tutorial. Thank you very much.
Many thanks Helmut! much appreciated!
Hi
Thank you for this tutorial. I followed every step and everything worked fine until the final cluster test. When i try to execute mpiexec -f machinefile -n 4 hostname i'm getting following error. Do you have any idea what i did wrong?
[mpiexec@PiController] HYDU_parse_hostfile (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:319): unable to open host file: machinefile
[mpiexec@PiController] mfile_fn (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:336): error parsing hostfile
[mpiexec@PiController] match_arg (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:152): match handler returned error
[mpiexec@PiController] HYDU_parse_array (/home/pi/mpich3/mpich-3.2/src/pm/hydra/utils/args/args.c:174): argument matching returned error
[mpiexec@PiController] parse_args (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:1596): error parsing input array
[mpiexec@PiController] HYD_uii_mpx_get_parameters (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/utils.c:1648): unable to parse user arguments
[mpiexec@PiController] main (/home/pi/mpich3/mpich-3.2/src/pm/hydra/ui/mpich/mpiexec.c:153): error parsing parameters
Thanks in advance
Salvi
Hi Salvi,
Are you running this with elevated permissions?
Cheers,
Jordi
Hello Jordi
Congratulations a great article, install the cluster without problems.
I would like to know if the c and c ++ libraries can be integrated into the cluster
You could indicate me where to download other examples to run them in the cluster
Thank you
Hi Armando, I don't know which libraries are you referring to.
Cheers,
Jordi
Dear all, thank you for the very well done instructions. Just want to ask you how to start all the nodes, wich is the command. When i launch my command from the master "./cpuminer ..." the other nodes doesn't listen this command, it seems that only my master is working. Thank you for any help.
Hi, correct machinefile: set PiController IP on the last line.
And then:
mpiexec -machinefile /home/pi/machinefile -n 3 ~/cpuminer-multi/./cpuminer ...
Hello Mr. Corbilla, great tutorial. Only 1 of 3 tutorials that has allowed me to run MPI.
!!!PROBLEM!!!
INPUT:
mpiexec -f machinefile -n 4 hostname
OUTPUT:
mpiexec: Error: unknown option "-f"
Type 'mpiexec --help' for usage.
Please advise.
P.S. Happy Holidays in advance.
mpiexec -machinefile /home/pi/machinefile -n 4 hostname
Thank you, but this only gives me one host name which is my PiController.
INPUT
mpiexec -machinefile /home/pi/machinefile -n 4 hostname
OUTPUT
PiController
PiController
Picontroller
Picontroller
I have changed the order of IP adresses in the machinefile and it will only read the first IP and give that hostname 4 times.
Hi Jared,
That behaviour is a bit weird. Can you test your machinefile with just 2 IPs? and see if the behaviour is the same?
Cheers,
Jordi
This comment has been removed by the author.
Hi Diwakar,
Have you configured correctly MPI on those nodes? You should only get 1 output in the master node.
Cheers,
Jordi
Hi,
What is the difference between a Beowulf cluster and an MPI cluster?
Hi
First of all a great tutorial. I simply copy pasted it for further use. I am planning to build a 32 node nano super computer using raspberry pi3B models. My aim is not just to try out the rig for computational fluid dynamics but also for running certain codes through my web browser, say firefox. Right now I am in the process of buying the materials as a 32 node system will cost me about 100,000 INR. My main concern is as follows. 1. Can I use Ubuntu OS for the master as well as the slave nodes? 2. If I open a web browser through the master node and run my program/codes inside the browser will it be able to use the combined CPU power of all the nodes?
Thanks in advance
Hi Shubham,
For what you are after you have to follow a different approach, with the current design it won't work. I'm working towards this idea so, when I have it done I will publish it.
Cheers,
Jordi
Hi Jordi,
great tutorial, got the cluster running. Still being a python and raspberry NOOB I was wondering if you could help me out. I got cpuminer-multi running on the main node only I want to run it on all nodes in my cluster, could you tell me how I could accomplish this?
Thanks,
Simon
Hi Simon,
You'll have to look into it. That package has different settings so probably there is help online talking about it.
Cheers,
Jordi
Hi Jordi,
You should look at this article. Someone is using your work. https://www.techworm.net/2018/03/learn-build-supercomputer-raspberry-pi-3-cluster.html
Hi Jordi,
Great tutorial! I got my cluster running.
but can you give some of code or project that test performance of cluster (execution time) ?
sorry for my english.
Regards,
Ahmad Ridwan
Nice post, Thanks for sharing with us
Yay, I got mine to work. I had slightly older software as I was following a set of instructions very similar to this so when it came to the final test, I had to change the mpi4py version to the one I am using and it worked. I am so happy.
Hi I have a problem with the 'Installing MPI4PY' step.
When I try to execute 'sudo python setup.py install' it gives me this error:
_configtest.c:2:17: fatal error: mpi.h: No such file or directory
#include <mpi.h>
^
compilation terminated.
failure.
removing: _configtest.c _configtest.o
error: Cannot compile MPI programs. Check your configuration!!!
how can I fix it? Thanks for your work, it's awesome
Dear Sir,
I did search over internet the relevant Apps.
What Apps you may use with Rpi supercomputer? Like daily or commercial use. Please email me those Apps. Thanks a lot.
duncan.tse@gmail.com
when i try to run this command: mpiexec -machinefile /home/pi/machinefile -n 4 hostname
I dont get any output please if you could help.
Thanks in advance
I am using four pi and wanted to ask, will this improve the performance on the desktop of the main pi? Also, can you connect two now and do the other two from the main pi itself? (I am using a switch I already have, and it has only four ports)