This article provides an introduction to PiCloud, a cloud computing solution.
Three terms are used throughout this article:
- PiCloud refers to the cloud computing infrastructure provided by PiCloud Inc.
- Client refers to the computer or program that is interacting with PiCloud, i.e., submitting jobs, retrieving results, etc.
- cloud refers to the Python library used to interact with the cloud.
Installation
Although primarily a commercial offering, PiCloud includes five free compute hours with its Free Developer Account, which should be enough to get a feel for working with it. The first step is to register on their website, which will require you to confirm your email address.
The next step is to install the Python client, which is most easily done with pip. Running
$ sudo pip install cloud
will do the installation for you (it currently supports Python 2.5, 2.6 and 2.7). Next, authenticate your computer to use PiCloud by entering the PiCloud login information you provided in the previous registration step:
$ sudo picloud setup
Please enter your PiCloud account login information.
If you do not have an account, please create one at http://www.picloud.com
E-mail: echorand@gmail.com
Password:
PiCloud uses API keys, rather than your login information, to authenticate your machine. In the event your machine is compromised, you can deactivate your API key to disable access. In this next step, you can choose to use an existing API key for this machine, or create a new one. We recommend that each machine has its own unique key.
Your API Key(s)
3241
Please select an API Key or just press enter to create a new one automatically:
API Key: 3242
Please note that your API key may be different. You can see the API keys you have registered in the Web console when you log in to your account at picloud.com, where you will also find a secret key corresponding to your API key. You will need to make this information available to PiCloud when you try to submit your job. One way of doing this is setting it in the file $HOME/.picloud/cloudconf.py. For alternative options, please refer to their documentation.
That’s all we need to start running code on PiCloud. Let us start with something really simple.
Running simple functions in the cloud
Now, let’s write a simple function and run it on PiCloud. Let us first define a function sort_num in the file demo.py, as follows:
import numpy

def sort_num(num):
    # Use a local name that does not shadow the function itself
    sorted_num = numpy.sort(num)
    return sorted_num
Fire up the Python interpreter and import the function just defined, and the modules cloud and numpy:
>>> from demo import *
>>> import cloud
>>> import numpy
Next, let us create an array of 50,000 integers, with each integer picked randomly from the range (10, 10000):
>>> num=numpy.random.random_integers(10,10000,50000)
Now comes the important step: invoking the cloud.call method to specify the function that you want to run, along with its arguments:
>>> jid=cloud.call(sort_num, num)
This means that you want the function sort_num to be executed with the argument num. This method returns an integer, which is an identifier for this particular job. The formal specification of the cloud.call method is available in their documentation.
>>> jid
58
The returned value of the executed function can be obtained using the cloud.result method, passing the obtained jid:
>>> cloud.result(jid)
array([ 10, 11, 11, ..., 10000, 10000, 10000])
As you can see, the sorted array is returned.
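The submit-then-fetch pattern of cloud.call and cloud.result resembles working with futures. As a rough local analogy only (this is Python 3's standard concurrent.futures module, not PiCloud's API, and nothing here runs in the cloud), the sketch below submits a sort to a worker thread and then blocks for its result. Plain lists and the built-in sorted stand in for numpy so that the snippet needs no third-party packages.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sort_num(num):
    """Local stand-in for the article's sort_num, using the built-in sorted."""
    return sorted(num)


# 50,000 integers drawn from 10..10000 inclusive, as in the article
num = [random.randint(10, 10000) for _ in range(50000)]

with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns immediately with a handle, loosely like a job ID
    future = pool.submit(sort_num, num)
    # result() blocks until the work is done, loosely like cloud.result(jid)
    result = future.result()
```

The handle returned by submit plays the part of the job ID: the caller keeps it and asks for the result later.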
Now, let’s assume that you want to sort 100 similar arrays. You can do this by calling cloud.call 100 times, or you could use a more efficient mechanism: the cloud.map method.
Mapping
Consider a function, saved in the file mapdemo.py, that returns the volume of a cylinder when the radius (r) and height (h) are passed to it:
import numpy

def vol_cylinder(r, h):
    return numpy.pi*r*r*h
If you had one cylinder configuration, i.e., one pair of (r, h), then the cloud.call method could be used for the purpose. What if you had 100 different configurations to find the volume of? You could use 100 cloud.call invocations, but there is a more efficient way of doing this: the cloud.map function, which accepts one or more sequences of parameters:
>>> from mapdemo import *
>>> import cloud
>>> import numpy
>>> r=numpy.random.random_sample(100)
>>> h=numpy.random.random_sample(100)
>>> jids=cloud.map(vol_cylinder, r, h)
>>> jids
xrange(359, 459)
As you can see, jids is an xrange covering 100 job IDs, from 359 to 458 (note that the values you get may be different). What has happened here is that cloud.map has created 100 jobs, with the arguments formed from the 100 pairs of the two sequences, r and h. You can pass jids directly to cloud.result to get the results as a list:
>>> result=cloud.result(jids)
>>> result
[0.35045338986267927, 0.0043144690799585004, 0.094018119621969765, 0.93579722612039329, 0.0045154147876736109, 0.018836324478609345, 0.0027243595262778321, 1.5049675511377265, 0.37383416636274164, 0.24435487403102638, 0.28248315493701553, 1.2879340600324913, 0.68406526971023041, 0.14338739850272786,...
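To see how the two sequences are paired, it can help to run the same function through Python's built-in map locally. This is only an illustration, under the assumption (taken from the description of the 100 pairs) that cloud.map pairs its arguments element by element in the same way. Here math.pi replaces numpy.pi to keep the sketch dependency-free, and the radii and heights are made-up values.

```python
import math


def vol_cylinder(r, h):
    """Volume of a cylinder with radius r and height h."""
    return math.pi * r * r * h


# Made-up configurations: the i-th radius is paired with the i-th height
radii = [1.0, 2.0, 0.5]
heights = [2.0, 1.0, 4.0]

# Built-in map calls vol_cylinder(radii[i], heights[i]) for each i
volumes = list(map(vol_cylinder, radii, heights))

# The job IDs in the session have the same bounds as this range:
# 100 IDs, the last one being 458 (the upper bound is exclusive)
job_ids = range(359, 459)
```

One input pair produces one result, and the results come back in the same order as the inputs.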
In more practical situations, the computation may not be so trivial, and you wouldn’t know in advance when all the jobs will be over. In such a case, you can wait until all the jobs have finished before retrieving the results, using the function cloud.join.
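The idea of blocking until a whole batch of jobs has finished can be tried locally with the standard library. The sketch below is an analogy, not PiCloud code: concurrent.futures.wait plays the role described for cloud.join, and slow_square is a hypothetical stand-in for a job whose running time is not known in advance.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_square(x):
    """Hypothetical job: sleeps for a while depending on x, then returns x*x."""
    time.sleep(0.01 * x)
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(slow_square, x) for x in range(5)]
    # Block until every submitted job has finished (the "join" step)
    done, not_done = wait(futures)
    # Only now collect the results, in submission order
    results = [f.result() for f in futures]
```

Waiting first and collecting afterwards keeps the retrieval step simple, because every result is known to be ready.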
Cloud files
PiCloud’s S3 data store can be used for persistent data storage on the cloud. The interface to this store is provided by the cloud.files module. The cloud.files.list method can be used to display all the files in your account:
>>> cloud.files.list()
[ ]
Since you haven’t yet stored any files on PiCloud, an empty list is returned. To store a file on the cloud, the cloud.files.put method is used (this basically transfers the file from your local disk to the cloud):
>>> cloud.files.put('stats_1.csv')
>>> cloud.files.list()
[u'stats_1.csv']
This file can be retrieved using the cloud.files.get method:
>>> cloud.files.get('stats_1.csv')
There are methods for checking the existence of a file, deleting a file and opening a file on the cloud. There are also methods for syncing files, which transfer a file to or from the cloud only if the file has changed; this is especially useful when you have large files to work with. A simple example of storing files on PiCloud is shown below.
Create a file filedemo.py as shown:
import cloud

def savedata():
    # Write a small file, then copy it to PiCloud's file store
    f=open('data.txt','w')
    f.write('This is a line of text')
    f.close()
    cloud.files.put('data.txt')
Now, in the interpreter, use the following code:
>>> from filedemo import *
>>> cloud.call(savedata)
459
>>> cloud.files.list()
[u'data.txt', u'stats_1.csv']
>>> cloud.files.get('data.txt')
As you can see, the file data.txt has been created in the cloud, and can later be retrieved using cloud.files.get. This is useful when you run code that generates data that you need to examine.
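The syncing methods mentioned earlier transfer a file only when it has changed. How PiCloud detects a change is not covered here; one common approach, shown purely as a local sketch under that assumption, is to compare content hashes. The helper names file_digest and needs_upload are hypothetical and are not part of the cloud library.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path, last_uploaded_digest):
    """Hypothetical check: upload only if the content differs from last time."""
    return file_digest(path) != last_uploaded_digest


# Demonstration with a temporary file
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'data.txt')
with open(path, 'w') as f:
    f.write('This is a line of text')

uploaded = file_digest(path)              # pretend the file was uploaded now
unchanged = needs_upload(path, uploaded)  # content identical: no transfer

with open(path, 'a') as f:
    f.write('\nA second line')
changed = needs_upload(path, uploaded)    # content differs: transfer needed
```

Reading in chunks keeps memory use flat, which matters for the large files where skipping an unnecessary transfer pays off most.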
Publishing functions via REST
An interesting capability that PiCloud offers is to publish functions on the cloud via a REST interface. That is, you can have a function written in Python on the cloud, and you can have a REST client written in any other language calling this function. Let us jump straight into an example from PiCloud’s documentation on this topic. Create a file publish_square.py with the following function:
# Refer: http://docs.picloud.com/rest.html
import cloud

def square(x):
    """Returns square of a number"""
    print 'Squaring %d' % x
    return x*x
Now, publish this function using the cloud.rest.publish method:
>>> uri=cloud.rest.publish(square, "square_func")
>>> print uri
https://api.picloud.com/r/3222/square_func
The above statement makes the function square accessible via the REST API, and returns its URI. You can now access the function by making an appropriate call to the API using a command-line tool such as curl, or from any other programming language.
In this article, let us examine a C client for this. We used libcurl’s C interface to replicate the curl statements provided in the PiCloud documentation. The client is written in two parts: client1.c and client2.c.
client1.c invokes the published function and receives the job ID in response.
/* This part of the client invokes the REST API of PiCloud and
   retrieves the Job ID
   http://docs.picloud.com/rest.html#invoking-functions */
#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl;
  /* Make sure you set this appropriately */
  char *url="https://api.picloud.com/r/3222/square_func/";
  CURLcode res;

  curl = curl_easy_init();
  if(curl) {
    /* First set the URL that is about to receive our POST. This URL can
       just as well be a https:// URL if that is what should receive the
       data. */
    curl_easy_setopt(curl, CURLOPT_URL, url);

    /* Specify the user/pass */
    curl_easy_setopt(curl,CURLOPT_USERPWD,"3244:8823b533ef41975505c8dbe46a2f85b930428944");

    /* Now specify the POST data */
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "x=5");

    /* For HTTPS */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);

    /* Perform the request, res will get the return code */
    res = curl_easy_perform(curl);
    printf("\nResult of Operation:: %d\n", res);

    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  return 0;
}
client2.c then uses this job ID to get the result.
/* This part of the client retrieves the result given the job ID as
   the argument */
#include <stdio.h>
#include <stdlib.h>
#include <curl/curl.h>

int main(int argc, char *argv[])
{
  CURL *curl;
  char url[80];
  CURLcode res;
  int n;

  if (argc != 2) {
    printf("Usage: ./client2 <jid>\n");
    exit(1);
  }

  /* Build the URL with a bounds check so a long argument cannot
     overflow the buffer */
  n = snprintf(url, sizeof(url),
               "https://api.picloud.com/job/result/?jid=%s", argv[1]);
  if (n < 0 || n >= (int)sizeof(url)) {
    printf("Job ID is too long\n");
    exit(1);
  }

  curl = curl_easy_init();
  if(curl) {
    /* Set the URL to fetch the result from (a plain GET request) */
    curl_easy_setopt(curl, CURLOPT_URL, url);

    /* Specify the user/pass */
    curl_easy_setopt(curl,CURLOPT_USERPWD,"3244:8823b533ef41975505c8dbe46a2f85b930428944");

    /* for HTTPS */
    curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);

    /* Perform the request, res will get the return code */
    res = curl_easy_perform(curl);
    printf("\nResult of Operation:: %d\n", res);

    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  return 0;
}
Compile and run the programs, as follows:
$ gcc -o client1 client1.c -lcurl
$ gcc -o client2 client2.c -lcurl
$ ./client1
{"jid": 57, "version": "0.1"}
Result of Operation:: 0
$ ./client2 57
{"version": "0.1", "result": 25}
Result of Operation:: 0
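Since the published function is reachable over plain HTTPS, a client in any language follows the same two steps. The Python 3 sketch below only builds the invocation request and parses a response of the shape shown in the output above; it does not contact the server. It assumes, following the C client, that the API key and secret key are sent as HTTP Basic credentials and that the arguments go in the POST body. The names build_invoke_request and extract_jid are hypothetical helpers, and the secret key is a placeholder.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_invoke_request(url, api_key, secret_key, **params):
    """Build (but do not send) a POST request with Basic credentials."""
    credentials = '%s:%s' % (api_key, secret_key)
    token = base64.b64encode(credentials.encode('utf-8')).decode('ascii')
    # Function arguments are form-encoded into the POST body, e.g. x=5
    body = urllib.parse.urlencode(params).encode('ascii')
    return urllib.request.Request(
        url, data=body, headers={'Authorization': 'Basic ' + token})


def extract_jid(response_text):
    """Pull the job ID out of a JSON response such as {"jid": 57, ...}."""
    return json.loads(response_text)['jid']


# URL and API key as they appear in the article; the secret is a placeholder
request = build_invoke_request(
    'https://api.picloud.com/r/3222/square_func/',
    '3244', 'YOUR_SECRET_KEY', x=5)

# Response text copied from the client1 output
jid = extract_jid('{"jid": 57, "version": "0.1"}')
```

Passing the request object to urllib.request.urlopen would perform the actual call; it is left out here so that the sketch runs without network access or valid credentials.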
For detailed information on publishing functions, please refer to the PiCloud documentation on this topic.
Running an Evolutionary Algorithm framework in the cloud
Our examples in this article have all been rather simple, since they were meant to clarify the concepts. For a more realistic use, let us take a look at a simple way of using PiCloud in the domain of Evolutionary Algorithms.
There are a number of ways in which an Evolutionary Algorithm can be parallelised. One of the easiest things to do is to run parallel instances of the algorithm with different initial random seeds. (This is a common practice in the research community, since starting with different random seeds allows you to test the robustness of a new algorithm, or the correct implementation of an existing one.)
In a blog post, I detailed the first exercise that I tried with Pyevolve (a Python library for Evolutionary Algorithms) + PiCloud — to run Pyevolve’s Genetic Algorithm implementation with 10 different initial seeds on PiCloud, all running in parallel.
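The multiple-seed idea can be sketched without any particular library. The toy below is not Pyevolve and not the code from the blog post: run_with_seed is a hypothetical, minimal (1+1)-style evolutionary search that minimises a sum of squares, and it is run once per seed with the built-in map. On PiCloud, cloud.map as described earlier would take the place of the built-in map so that the runs execute in parallel.

```python
import random


def run_with_seed(seed, generations=200):
    """Toy evolutionary run; returns the best fitness found for this seed."""
    rng = random.Random(seed)  # private generator, so runs are independent
    best = [rng.uniform(-5, 5) for _ in range(3)]
    best_fitness = sum(x * x for x in best)
    for _ in range(generations):
        # Mutate every coordinate with a little Gaussian noise
        candidate = [x + rng.gauss(0, 0.3) for x in best]
        fitness = sum(x * x for x in candidate)
        # Keep the candidate only if it improves on the current best
        if fitness < best_fitness:
            best, best_fitness = candidate, fitness
    return best_fitness


seeds = range(10)
# One independent run per seed; each run is a natural unit of parallel work
results = list(map(run_with_seed, seeds))
```

Because each run owns its random generator, repeating a seed reproduces the same result, which is what makes comparisons across seeds meaningful.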
PiCloud Web interface
Once you log in to your account on PiCloud, you should see the PiCloud Web control panel, where you can find the current running jobs, API keys, published functions and much more.
In this article, we have taken a very basic tour of PiCloud and looked at most of its important features. We haven’t, however, looked at features like the ability to run cron jobs, using different levels of computing power, and using environments. Please consult the official documentation for these, and for more detailed discussions and examples of the topics covered in this article.
Note: The REST API example was earlier published on my blog.
Hi Everyone, in addition to this article, you might be interested in taking a look at the slides of a talk I gave recently on PiCloud. Please see: http://echorand.me/2012/08/17/pyconau-2012-talk-on-picloud/.