Saturday, 30 July 2016

Creating a Raspberry Pi 3 Cluster - "Supercomputer", for parallel computing.

In this quick article I will show you how to create your own Raspberry Pi cluster for parallel computing via the MPI (Message Passing Interface) library. This is a nice summer project now that I'm free from my Master's duties until September, and I have been wanting to build this for a while. Thanks to the low price of the Raspberry Pi we can now build this without spending too much. See below for the list of items you will need and the price for the whole kit with 4 Pis.

The main decisions behind this architecture are which operating system and which programming language to use to implement parallel computing. From my experience with HPC (High Performance Computing) and SGE (Sun Grid Engine), the best way to achieve this is by using either OpenMPI or MPICH3. Both are free, open source, portable and very popular. For the programming language we have several alternatives: C++, C#, Python, etc. For the operating system I could use WinIoT for the RPi or simply a Linux distro. I'm a geek so I went for the latter as I do like interacting with command line interfaces; there is some beauty there that I can't explain :).
So my decision is to use a Linux distribution as the OS. In this case I'm choosing Raspbian Jessie, which comes with some goodies installed by default and will allow me to install all the components I need for my little project.

The second decision is the programming language. In this case I'm choosing Python as I'm very familiar with it, it has plenty of libraries available and it integrates nicely with MPI via the mpi4py library.

The other factor to take into account is that I have two different models of RPi and I need to make sure that whatever I install will work well on both. I won't be able to install WinIoT on my old RPi model A.

Building the cluster of Rpi's

The material that you will need is listed below with links included:

4 x RPi 3 model B = 4 x £30 = £120
4 x 16GB microSD card (Kingston) = 4 x £4.84 = £19.36
4 x USB to Micro USB Cable 0.5m = 4 x £0.88 = £3.52
1 x 5 port desktop switch = 1 x £6.49 = £6.49
5 x Ethernet patch cable 0.3m = 5 x £2.90 = £14.50
1 x USB Hub = 1 x £2.53 = £2.53

Total =  £192.38 (without considering delivery)

*This is a common configuration but you can start with just 2 or 3 RPi's and keep adding hardware later on.
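
If you change the number of RPi's, the cost of the parts list is easy to recompute. The sketch below is only an illustration: it uses the quantities and unit prices from the list above and adds up the itemised lines. Note that those lines add up to £166.40, which is less than the total quoted above; presumably the quoted total includes items that are not itemised here, such as the stackable case (that is an assumption on my side).

```python
from decimal import Decimal

# (quantity, unit price in GBP) for each line of the parts list above
PARTS = {
    "RPi 3 model B": (4, Decimal("30.00")),
    "16GB microSD card": (4, Decimal("4.84")),
    "USB to Micro USB cable 0.5m": (4, Decimal("0.88")),
    "5 port desktop switch": (1, Decimal("6.49")),
    "Ethernet patch cable 0.3m": (5, Decimal("2.90")),
    "USB hub": (1, Decimal("2.53")),
}


def line_totals(parts):
    """Return {item: quantity * unit price} for every line."""
    return {name: qty * price for name, (qty, price) in parts.items()}


def listed_total(parts):
    """Sum of all itemised lines."""
    return sum(line_totals(parts).values(), Decimal("0"))


if __name__ == "__main__":
    for name, subtotal in line_totals(PARTS).items():
        print(f"{name}: GBP {subtotal}")
    print(f"Listed items: GBP {listed_total(PARTS)}")
```

To price a smaller cluster, change the quantities in PARTS (for example 2 RPi's, 2 cards, 2 cables) and run it again.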


Once all the components are assembled using the stackable case you should have something like the image below:


Below the image of my cluster up and running (see configuration section for more):


Configuring your cluster of RPi's


The idea is to configure one of the RPi's and then just clone the SD card and plug the copy into the next RPi. Here is a summary of the steps to get you up and running:

Installing the OS

  • Download Raspbian Jessie image. I had some trouble downloading the zip file so I used the torrent link instead. See the version used below (4.4)
Once the OS image is downloaded, burn it to the SD card using Win32DiskImager:

Plug the microSD card into the first Pi (PiController in my case) and power it up. Plug in the Ethernet cable and head back to your computer to access the Pi remotely.

Open a command prompt (I'm using Win10 as my main computer) and type "ping raspberrypi". By default the Rpi's are named raspberrypi so they are easy to spot in your network. Once you ping it, you will be able to see the ip address of the device. Save this IP address for later as we will use it in PuTTY.

Launch PuTTY and type the IP address of the RaspberryPi:

You should see something similar to the image below:

login as: pi and password: raspberry (each Rpi uses same login/password)

Type: sudo raspi-config to configure our device:
  1. Go to Expand File System
  2. Go to Advanced Options -> HostName -> set it to PiController
  3. Go to Advanced Options -> MemorySplit -> set it to 16.
  4. Go to Advanced Options -> SSH -> Enable.
  5. Finish and leave the configuration.

Now we can start installing MPICH3 and MPI4PY. Notice that these steps take a while (> 4h) so arrange some free time for this beforehand:

Installing MPICH3


Follow the steps below to install version 3.2 of MPICH:
Once you've got everything installed you should see something like the image below:

Installing MPI4PY


Follow the steps below to install version 2.0 of MPI4PY:
once installed you should see something like the image below:

Now we have finished configuring the first RPi. Believe it or not, if you reach this step and everything is working you should be proud of it. Now we will have to clone this SD card and place the copies into the other RPi's.

Preparing the other RPi's


As mentioned in the step above, bring the SD card to your main computer and save its content as an image using Win32DiskImager. Now copy this new image to the other SD cards. You should now have 4 SD cards with the same image. Since the 4 SD cards are clones, my advice is to plug in each RPi individually and change the host name of every newly added RPi on the network, e.g. pi01, pi02, pi03, etc.

Do the following for every new RPi added into the network:

pi01:

Scan the network for the newly added device to find its IP address using a network scanner. Once found, use PuTTY to access it and use the commands below to set it up:

Type: sudo raspi-config to configure our device:
  1. Go to Expand File System
  2. Go to Advanced Options -> HostName -> set it to pi01
  3. Go to Advanced Options -> MemorySplit -> set it to 16.
  4. Go to Advanced Options -> SSH -> Enable.
  5. Finish and leave the configuration.
  6. sudo reboot.
Do the same for pi02 and pi03. Note that you can name your RPis the way you want.

Once done you should be able to see all 4 of them using PuTTY:

Once completed, each RPi will have its own IP. We now need to store each IP address in a host file, also known as a machinefile. This file lists the hosts on which to start the processes.

Go to your first RPi and type:

nano machinefile

and add the following IP addresses: (Note that you will have to add your own):
This file will be used by MPICH3 to communicate and send/receive messages between the various nodes.
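
If you prefer not to type the file by hand, the machinefile can also be generated. This is a minimal sketch, assuming the simplest layout of one host per line; the addresses in NODE_IPS are hypothetical placeholders and must be replaced with the IPs of your own RPi's.

```python
from pathlib import Path

# Hypothetical addresses: replace them with the IPs of your own RPi's.
NODE_IPS = ["192.168.1.10", "192.168.1.11", "192.168.1.12", "192.168.1.13"]


def machinefile_text(ips):
    """Return the file content: one host per line."""
    return "".join(f"{ip}\n" for ip in ips)


def write_machinefile(ips, path="machinefile"):
    """Write the machinefile to `path` and return the path."""
    target = Path(path)
    target.write_text(machinefile_text(ips))
    return target


if __name__ == "__main__":
    print(write_machinefile(NODE_IPS).read_text(), end="")
```

Run it from the folder where you will later call mpiexec, so that the -f machinefile argument finds the file.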

Configuring SSH keys for each RPi


Now we need to be able to command each RPi without typing a user name and password. To do this we will have to generate SSH keys for each RPi and then add each public key to the authorized_keys file of every other device. This is how MPI will be able to talk to each device without worrying about credentials. This process is a bit tedious but once completed you will be able to run MPI without problems.


Run the following commands from the first Pi (PiController):
When running the ssh-keygen just hit enter (if you don't want to add specific passphrase) and the RSA key will be generated for you automatically.

Now we have configured the link from PiController to every other device but we still need to configure the other way around. So you will have to run the following commands from every individual device:

Open the authorized_keys file on each device and you will see the additional keys there. Each authorized_keys file should contain 3 keys (as stated in the architecture diagram above).
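
Instead of opening every file by eye, the number of keys can be counted with a few lines of Python. This is just a sketch: it assumes the default location ~/.ssh/authorized_keys and treats every non-empty line that is not a comment as one key.

```python
from pathlib import Path


def count_keys(text):
    """Count entries in authorized_keys content, skipping blanks and comments."""
    lines = (line.strip() for line in text.splitlines())
    return sum(1 for line in lines if line and not line.startswith("#"))


if __name__ == "__main__":
    # Assumed default location of the file for the current user.
    path = Path.home() / ".ssh" / "authorized_keys"
    if path.exists():
        print(f"{path}: {count_keys(path.read_text())} key(s)")
    else:
        print(f"{path} not found")
```

On a 4-node cluster configured as described, each device should report 3 keys.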

Now the system is ready for testing.

Note that if your IP addresses change, the key setup will no longer be valid for the new addresses and the steps will have to be repeated.

Testing the cluster


At this point I will just include a small example for you to test that the cluster works as expected. Later on I will publish a more complex scenario with a refined configuration to maximise the power of the cluster.

If everything is configured correctly, the following command should work:

mpiexec -f machinefile -n 4 hostname


You can see that each device has replied and every key is used without problems.
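
The same check can be scripted. The sketch below runs the hostname command shown above and reports which nodes did not reply. It assumes Python 3.7 or later, the host names used in this article (PiController, pi01, pi02, pi03) and a machinefile in the current folder; adjust EXPECTED to your own names. The replies can arrive in any order, which is why the comparison uses sets.

```python
import subprocess

# Host names used in this article; adjust to whatever you named your RPi's.
EXPECTED = {"PiController", "pi01", "pi02", "pi03"}


def missing_hosts(output, expected=EXPECTED):
    """Return the expected host names that do not appear in the output."""
    seen = {line.strip() for line in output.splitlines() if line.strip()}
    return set(expected) - seen


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["mpiexec", "-f", "machinefile", "-n", "4", "hostname"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        print("mpiexec not found on this machine")
    else:
        missing = missing_hosts(result.stdout)
        if missing:
            print(f"no reply from: {sorted(missing)}")
        else:
            print("all nodes replied")
```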

Now run the following command to test a helloworld example:

mpiexec -f machinefile -n 4 python /home/pi/mpi4py-2.0.0/demo/helloworld.py

You should see something like the image below:

Now our system is ready to take any parallel computing application we want to develop.
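
As a taste of what comes after helloworld, here is a minimal sketch of how work could be split across the nodes with mpi4py. It is not one of the mpi4py demos: the function items_for_rank and the round-robin split are my own illustration. Each process sums its share of the numbers 1 to 100 and the partial sums are combined on rank 0. The fallback branch only exists so that the script also runs on a machine without mpi4py.

```python
try:
    from mpi4py import MPI  # available on the cluster after the install steps

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Fallback: behave like a single process when mpi4py is not installed.
    comm, rank, size = None, 0, 1


def items_for_rank(items, rank, size):
    """Return the items this rank is responsible for (round-robin split)."""
    return items[rank::size]


def main():
    numbers = list(range(1, 101))
    local_sum = sum(items_for_rank(numbers, rank, size))
    if comm is not None:
        # Combine the partial sums; only rank 0 receives the result.
        total = comm.reduce(local_sum, op=MPI.SUM, root=0)
    else:
        total = local_sum
    if rank == 0:
        print("total =", total)


if __name__ == "__main__":
    main()
```

Saved as, say, sum_example.py (a hypothetical file name), it would be launched the same way as the helloworld example, passing the script path to mpiexec.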

Watch this space for more!

Next steps


I will be creating more complex scenarios and squeezing the architecture to test its limits. More soon! Give it a go and let me know if you face any problems during the setup.

Jordi

Saturday, 25 June 2016

BaaS with Kinvey and Delphi 10.1 Berlin

In this article I will show you how to connect your desktop and mobile applications to a mobile backend as a service (mBaaS) with Delphi 10.1 Berlin. I normally use Parse.com as a backend but as they announced that they will close their mBaaS service I will use Kinvey instead.
If you are interested in Parse.com you can read my previous articles about creating your own self-hosted Parse server and deploying a Parse server to Heroku, which achieve a similar result to what I want to explain today (basically having your data hosted in the cloud, either using Kinvey as an mBaaS or Parse + Heroku as a PaaS).
There are plenty of articles online about these topics and I just want to give you my input and explain how I dealt with some of the challenges I faced during development.

Join Kinvey and create your App environment
The first step is to join Kinvey and create your application environment. Kinvey offers a Free plan for developers that is ideal for testing your applications. It includes the core mBaaS features and 1GB of data storage. Once you move your solution into production you can switch over to one of the other plans available.
Once you are in the console management, press "New App" and enter the name of your backend:
Now you should see your environment created:
Click on the Development label and the dashboard will be shown:
At this point you should know what data you are going to store. In this example I have created a simple collection that includes several fields for a sample application that I'm building, whose source code can be found here:
Here is my collection of values:
*Note that if you POST data to a non-existing collection, the collection will be created automatically.

Handling collections
Now it's time to start using the collection and querying/adding data via HTTP REST. For this task you will have to identify the headers that are required for your GET/POST requests. You can see one example here for Parse.

Kinvey takes a different approach from Parse in terms of security. Kinvey offers basic and session authentication. Basic authentication uses the HTTP header "Authorization" with the components "Basic" and Base64Encode(AppId:MasterSecret). Session authentication sends a login request to collect an auth token from the Kinvey backend and then this token is used in subsequent REST requests.

I will focus on Basic authentication as Session authentication is just as simple but involves more steps.
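
To make the Basic authentication header concrete, here is a small language-neutral illustration in Python (the examples in this article are in Delphi, so this is only a sketch of the header format, not my application code). The credentials are placeholders, not real Kinvey keys.

```python
import base64


def basic_auth_header(app_id, master_secret):
    """Build the header: "Basic " + Base64Encode(app_id:master_secret)."""
    token = base64.b64encode(f"{app_id}:{master_secret}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


if __name__ == "__main__":
    # Placeholder credentials: use the values from your own Kinvey dashboard.
    print(basic_auth_header("my_app_id", "my_master_secret"))
```

The resulting header is sent with every GET/POST request, which is why no separate login call is needed with this scheme.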

The first example uses REST via IdHttp so you can see how GET/POST are handled manually, and the second example uses the Kinvey provider component available in Delphi 10.1 Berlin, which makes our life easier.

Here is my Win64 example that uses Kinvey BaaS:

The Add action adds data into the collection and Reload just brings the data back from the cloud and displays it in a listview component.

Here is the code behind it:
Add item:
Load data:
As you can see, each method uses a POST/GET command to baas.kinvey url with some arguments, headers and options. I find this way really useful as you can clearly see what's going on and easily map what you could do via curl:

If you want to run this example successfully you'll need to include libeay32.dll and ssleay32.dll from OpenSSL.

You can test this app and its mobile companion app using the source code here.

Using Kinvey Provider component
Delphi 10.1 Berlin has a KinveyProvider component that we can use directly without having to worry about the request details. You'll see below that to do the same as the code sample above we need just a few lines of code:
The KinveyProvider needs the parameters:

  • AppKey
  • AppSecret
  • MasterSecret
Get those from the Kinvey dashboard and off you go.

Then connect the BackEndStorage component to the KinveyProvider component.

Now, here is the code to Add items to the collection and to load the collection via components:
Load Data:
Add Item:
As you can see now it's way simpler.

And here is my Android app up and running:
Now I have a Win64 app and an Android app that share the same back end using Kinvey BaaS.

And the data in Kinvey:

Although this article is quite long, I'm sure you will find it interesting if you still haven't played with these components and the cloud.

Note that this BackEnd service will cease to exist shortly (as the appkey is hardcoded in the android app and the source code is available).

Do not hesitate to contact me if you have any questions.
Jordi

Monday, 6 June 2016

Invoke PowerShell remote command with parameters with spaces via TeamCity

One of the coolest things to do with TeamCity is to run some external applications via PowerShell. This will allow us to invoke remote commands without having to install an additional agent on the remote target and it will allow us to centralise those commands from our build agent environment.

The idea behind it is as follows:
I have a centralised Build Agent environment and I need to run a command line application on two additional machines. I don't want to install any TeamCity agent on those machines as these are just deployment machines and should be independent of the building process. I just need to remotely deploy some binaries there via PowerShell and then execute the application.
These two additional machines are in the same network and WinRM is configured in every instance with the correct permissions so TeamCity can run the commands without problems.
There are loads of guides over the internet regarding WinRM configuration. Here is the screenshot from my local configuration on my Win10 machine. I run the same on the TeamCity agent, a Win2012 machine:

I first had to tweak the wifi connection as it was set to public by default and it needs to be private. Then just enable PSRemoting and set TrustedHosts to all (*). Then to test it out, run Test-WSMan myRemoteMachine and you should see something similar to the image above.

In TeamCity, there are two additional steps, one to copy some binaries to a remote machine and the second one to run the binaries remotely. Both steps run PowerShell:

Here is the command for each step:

And the remote execution from PowerShell:

Monday, 30 May 2016

Saturday, 28 May 2016

Getting dotCover to report in TeamCity via command line parameters

This is a series of articles about one of my favourite subjects: Continuous Integration and Continuous Delivery. In this article I will focus on the Development step that includes the build process, the packaging of artifacts, the testing and finally the reporting of code coverage. All this from TeamCity.

I will base this solution on one of my .NET projects as the level of integration is really high and it is really easy to set up and use. I faced some challenges when setting up NUnit 3.x and dotCover in TeamCity and this post tries to alleviate the pain with a detailed explanation of the setup and configuration.

In further articles I will delve into detail regarding the next items that are part of the continuous delivery pipeline.

One of the most important aspects of any solution is that if you want to get away from spaghetti code you need to have proper controls in place such as unit tests, integration tests, etc. This for me is a must-have in any project. If you can't answer the question "What's your % of code coverage?" with something other than 0, then you are in trouble as you are probably maintaining something similar to spaghetti code. This doesn't relate to your ability to code; it only relates to the maintainability and scalability of the code.

I will use one of my projects from GitHub for this article:
This project contains a basic library with my MapReduce approach, a console application that runs the library and a unit test project that tries to cover as much code as possible.

Here is my TeamCity project for this github solution with the artifacts.
Artifacts:
As you can see, the project gets built automatically via TeamCity and the configuration of the project is as follows:

1. VCS Root:

2. Build Step for Visual Studio:

3. Generation of Artifacts:

As you can see, setting up the build process is quite simple. Just create your project in TeamCity, point your VCS Root to GitHub, create a build step for Visual Studio (pointing it to the .sln file) and add the location of the binaries to be picked up by TeamCity and generated as artifacts.

So far so good, right? Now we need to go one step further. My project has a console app and a unit test project and I need to build both prior to running the unit test project. To achieve this I will have to piggyback on MsBuild.

Here is the structure of my project:
Now I need to build the projects CountingWordsConsole and MapReduce.Tests together prior to the running of the tests. Here is my msbuild file to build my solution:

And here is the configuration in TeamCity:
I got rid of the previous build step for Visual Studio and this time I'm creating an MsBuild step that will target all my project files.

You will also notice that in the MsBuild file there is a property shared between TeamCity and MsBuild called ReleaseFolder. This property is set up in TeamCity with the folder location where my projects will be built. This will help us later to identify where our binaries are and how to pick them up from TeamCity as artifacts.
Notice that in the MsBuild file this property is written as $(ReleaseFolder) whereas in TeamCity it is %system.ReleaseFolder%.

Now that we have the project up and running via MsBuild, it's time to set up NUnit and dotCover.

Setting up NUnit and dotCover

1. Get the latest NUnit.
You can get the latest NUnit 3.2.1 from here. Download the .msi file and run the typical installation. This will leave the files in the following folder:

  • C:\Program Files (x86)\NUnit.org\nunit-console\
Add a new build step in TeamCity and configure it to run NUnit 3.x. As soon as you run the project you will get the following error:

This version of NUnit 3 is not a release version and is not compatible with TeamCity. Please update NUnit to a newer release version.


I even tried with NUnit 3.0 RC and NUnit 3.0 but the error never went away. How to fix this? Via the command line.

To make things a bit more exciting, I will configure dotCover directly as it will run NUnit itself. dotCover comes bundled with TeamCity ("C:\TeamCity\buildAgent\tools\dotCover") and you only need to add some configuration files to your project to be able to run dotCover from the command line.

The other aspect to cover later on is how to publish the dotCover report in TeamCity but I will get to it.

2. Using dotCover
Using dotCover is really simple, just call dotCover with the argument cover and the .xml config and that's it. Inside the configuration file, dotCover will call NUnit and report the tests back to TeamCity. Here is my configuration file:

Now configure the build step to use dotCover:
Once you run this step, you will see that TeamCity reports back this test automatically:
Now we are almost done. The last part is the tricky one and the answer to the title of this article "Getting dotCover to report in TeamCity via command line parameters". If you check the coverage.xml file, you will see that there is one report created called output.dcvr and now we need to tell TeamCity that the file is there to be picked up.

3. Using service messages.
The common way to report things back to TeamCity is via service messages. To do this, you just need to write this message in the console output:

##teamcity[importData type='dotNetCoverage' tool='dotcover' path='C:\MapReduce\MapReduce.Tests\output.dcvr']

So what I did was build a lightweight console application that just writes that message to the console output:
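
The console application itself is not reproduced here. As an illustration of the same idea, here is a minimal Python sketch that writes the service message shown above to the console. The function names are hypothetical, and the escaping rules ('|' used as the escape character for quotes, brackets, pipes and line breaks) reflect my understanding of the TeamCity service message format, so double check them against the TeamCity documentation.

```python
def escape_value(value):
    """Escape a value for a TeamCity service message ('|' is the escape char)."""
    # The pipe must be escaped first so the pipes added below stay intact.
    replacements = [
        ("|", "||"),
        ("'", "|'"),
        ("\n", "|n"),
        ("\r", "|r"),
        ("[", "|["),
        ("]", "|]"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


def import_coverage_message(report_path):
    """Build the importData message that points TeamCity at the .dcvr report."""
    return (
        "##teamcity[importData type='dotNetCoverage' tool='dotcover' "
        f"path='{escape_value(report_path)}']"
    )


if __name__ == "__main__":
    print(import_coverage_message(r"C:\MapReduce\MapReduce.Tests\output.dcvr"))
```

Whatever tool writes the message, the only requirement is that the line ends up in the build log of the step, where TeamCity picks it up.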

Now place the executable in the root of your project so it can be called from TeamCity. Add a third build step in your project and use the following arguments:
Run the project in TeamCity and et voilà!

Now you can see the coverage of my unit tests against my source code (>75%) and inspect the report further through TeamCity:

I can see that most of my classes are covered 100% so it gives me enough confidence to keep modifying the project knowing that if I break something the tests will tell me straight away.

I hope you find this useful as it's quite a long and tedious post and I assume that you already know about TeamCity.
Jordi