Apache CloudStack: A Reliable and Scalable Cloud Computing Platform

Apache CloudStack is yet another outstanding project from the Apache Software Foundation, which has contributed many tools and projects to the open source community. The author has selected the relevant and important extracts from the excellent documentation provided by the Apache CloudStack project team for this article.

Apache CloudStack is one of the most visible projects from the Apache Software Foundation (ASF). The project focuses on deploying open source software for public and private Infrastructure as a Service (IaaS) clouds. Listed below are a few important points about CloudStack.

  • It is designed to deploy and manage large networks of virtual machines, as highly available and scalable Infrastructure as a Service (IaaS) cloud computing platforms.
  • CloudStack is used by a number of service providers to offer public cloud services and by many companies to provide on-premises (private) cloud offerings or as part of a hybrid cloud solution.
  • CloudStack includes the entire ‘stack’ of features that most organisations desire with an IaaS cloud — compute orchestration, Network as a Service, user and account management, a full and open native API, resource accounting, and a first-class user interface (UI).
  • It currently supports the most popular hypervisors — VMware, KVM, Citrix XenServer, Xen Cloud Platform (XCP), Oracle VM server and Microsoft Hyper-V.
  • Users can manage their cloud with an easy-to-use Web interface, command line tools and/or a full-featured RESTful API. In addition, CloudStack provides an API that’s compatible with AWS EC2 and S3 for organisations that wish to deploy hybrid clouds.
  • It provides an open and flexible cloud orchestration platform to deliver reliable and scalable private and public clouds.

Features and functionality

Some of the features and functionality provided by CloudStack are:

  • Works with hosts running XenServer/XCP, KVM, Hyper-V, and/or VMware ESXi with vSphere
  • Provides a friendly Web-based UI for managing the cloud
  • Provides a native API
  • May provide an Amazon S3/EC2 compatible API
  • Manages storage for instances running on the hypervisors (primary storage) as well as templates, snapshots and ISO images (secondary storage)
  • Orchestrates network services from the data link layer (L2) to some application layer (L7) services, such as DHCP, NAT, firewall, VPN and so on
  • Accounting of network, compute and storage resources
  • Multi-tenancy/account separation
  • User management

Figure 1: A simplified view of a basic deployment

Support for multiple hypervisors: CloudStack works with a variety of hypervisors and hypervisor-like technologies. A single cloud can contain multiple hypervisor implementations. As of the current release, CloudStack supports BareMetal (via IPMI), Hyper-V, KVM, LXC, vSphere (via vCenter), XenServer and Xen Project.

Figure 2: A region with multiple zones

Massively scalable infrastructure management: CloudStack can manage tens of thousands of physical servers installed in geographically distributed data centres. The management server scales near-linearly, eliminating the need for cluster-level management servers. Maintenance or other outages of the management server can occur without affecting the virtual machines running in the cloud.

Automatic cloud configuration management: CloudStack automatically configures the network and storage settings for each virtual machine deployment. Internally, a pool of virtual appliances supports the configuration of the cloud itself. These appliances offer services such as firewalling, routing, DHCP, VPN, console proxy, storage access, and storage replication. The extensive use of horizontally scalable virtual machines simplifies the installation and ongoing operation of a cloud.

Graphical user interface: CloudStack offers an administrator’s Web interface that can be used for provisioning and managing the cloud, as well as an end user’s Web interface, for running VMs and managing VM templates. The UI can be customised to reflect the desired look and feel that the service provider or enterprise wants.

API: CloudStack provides a REST-like API for the operation, management and use of the cloud.

AWS EC2 API support: It provides an EC2 API translation layer to permit common EC2 tools to be used in the CloudStack cloud.

High availability: CloudStack has a number of features that increase the availability of the system. The management server itself may be deployed in a multi-node installation where the servers are load balanced. MySQL may be configured to use replication to provide for failover in the event of a database loss. For the hosts, CloudStack supports NIC bonding and the use of separate networks for storage as well as iSCSI Multipath.

Deployment architecture

CloudStack deployments consist of the management server and the resources to be managed. During deployment, you inform the management server of the resources to be managed, such as the IP address blocks, storage devices, hypervisors and VLANs.

The minimum installation consists of one machine running the CloudStack management server and another machine acting as the cloud infrastructure. In its smallest deployment, a single machine can act as both the management server and the hypervisor host.

Figure 3: Installation complete

A more full-featured installation consists of a highly-available multi-node management server and up to tens of thousands of hosts using any of several networking technologies.

Management server overview: The management server orchestrates and allocates the resources in your cloud deployment. It typically runs on a dedicated machine or as a virtual machine. It controls the allocation of virtual machines to hosts, and assigns storage and IP addresses to the virtual machine instances. The management server runs in an Apache Tomcat container and requires a MySQL database for persistence.

The management server:

  • Provides the Web interface for both the administrator and the end user
  • Provides the API interfaces for both the CloudStack API as well as the EC2 interface
  • Manages the assignment of guest VMs to a specific compute resource
  • Manages the assignment of public and private IP addresses
  • Allocates storage during the VM instantiation process
  • Manages snapshots, disk images (templates) and ISO images
  • Provides a single point of configuration for your cloud

Cloud infrastructure overview: Resources within the cloud are managed as follows.

  • Regions: This is a collection of one or more geographically proximate zones managed by one or more management servers.
  • Zones: Typically, a zone is equivalent to a single data centre. It consists of one or more pods and secondary storage.
  • Pods: A pod is usually a rack, or row of racks that includes a Layer-2 switch and one or more clusters.
  • Clusters: A cluster consists of one or more homogeneous hosts and primary storage.
  • Host: This is a single compute node within a cluster; often, a hypervisor.
  • Primary storage: This is a storage resource typically provided to a single cluster for the actual running of instance disk images.
  • Secondary storage: This is a zone-wide resource which stores disk templates, ISO images, and snapshots.
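
The containment relationships in the list above can be sketched as a small data model. This is only an illustration of the hierarchy, not CloudStack's internal schema: the class names, field names and the `count_hosts` helper are assumptions made for this example, and the homogeneity check is deliberately simplified to compare hypervisor type only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str  # e.g. "KVM" or "XenServer"


@dataclass
class Cluster:
    name: str
    hosts: List[Host] = field(default_factory=list)
    primary_storage: List[str] = field(default_factory=list)  # storage serving this cluster

    def is_homogeneous(self) -> bool:
        # Simplified assumption: only the hypervisor type is compared here.
        return len({h.hypervisor for h in self.hosts}) <= 1


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)
    secondary_storage: List[str] = field(default_factory=list)  # zone-wide templates, ISOs, snapshots


@dataclass
class Region:
    name: str
    zones: List[Zone] = field(default_factory=list)


def count_hosts(region: Region) -> int:
    """Walk region -> zone -> pod -> cluster and count the hosts."""
    return sum(len(c.hosts) for z in region.zones for p in z.pods for c in p.clusters)


# Hypothetical names, used purely for illustration.
cluster = Cluster(
    "cluster-1",
    hosts=[Host("kvm-01", "KVM"), Host("kvm-02", "KVM")],
    primary_storage=["nfs-primary-1"],
)
zone = Zone("zone-1", pods=[Pod("pod-1", clusters=[cluster])], secondary_storage=["nfs-secondary-1"])
region = Region("region-1", zones=[zone])

print(count_hosts(region), cluster.is_homogeneous())  # prints: 2 True
```

Note where the two kinds of storage attach: primary storage hangs off a cluster, while secondary storage belongs to the zone.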

Networking overview: CloudStack offers many types of networking, but these typically fall into one of two scenarios.

  • Basic: This is analogous to AWS-classic style networking. It provides a single flat Layer-2 network, where guest isolation is provided at Layer-3 by the hypervisor's bridge device.
  • Advanced: This typically uses Layer-2 isolation such as VLANs, though this category also includes SDN technologies such as Nicira NVP.

Installation

In this section, let us look at the minimum system requirements and installation steps for CloudStack.

Management server, database and storage system requirements: The machines that will run the management server and MySQL database must meet the following requirements. The same machines can also be used to provide primary and secondary storage, such as via local disks or NFS. The management server may be placed on a virtual machine.

  • Preferred OS: CentOS/RHEL 6.3+ or Ubuntu 14.04 (.2)
  • 64-bit x86 CPU (more cores lead to better performance)
  • 4GB of memory
  • 250GB of local disk space (more space results in better capability; 500GB recommended)
  • At least 1 NIC
  • Statically allocated IP address
  • Fully qualified domain name as returned by the hostname command

Host/hypervisor system requirements: The host is where the cloud services run in the form of guest virtual machines. Each host is one machine that meets the following requirements:

  • Must support HVM (Intel-VT or AMD-V enabled)
  • 64-bit x86 CPU (more cores result in better performance)
  • Hardware virtualisation support required
  • 4GB of memory
  • 36GB of local disk
  • At least 1 NIC
  • Latest hotfixes applied to hypervisor software
  • When you deploy CloudStack, the hypervisor host must not have any VMs already running
  • All hosts within a cluster must be homogeneous. The CPUs must be of the same type, count, and feature flags
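
On a Linux host, one common way to check the HVM requirement is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flag in `/proc/cpuinfo`. The sketch below assumes a Linux-style cpuinfo layout; the function name `has_hvm_support` is made up for this example and is not part of CloudStack. Keep in mind that the flag only shows the CPU is capable, so virtualisation may still need to be enabled in the BIOS/UEFI.

```python
def has_hvm_support(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line lists vmx (Intel VT-x) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Flags are whitespace-separated tokens after the colon.
            if {"vmx", "svm"} & set(value.split()):
                return True
    return False


# Illustrative input in the shape of /proc/cpuinfo (heavily shortened).
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx sse2\n"
print(has_hvm_support(sample))  # prints: True
```

On a real host the text would come from the file itself, for instance `has_hvm_support(open("/proc/cpuinfo").read())`.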

Installation steps: You may be able to do a simple trial installation, but for full installation, do make sure you go through all the following topics from the Apache CloudStack documentation (refer to the section ‘Installation Steps’ of this documentation):

  • Choosing a deployment architecture
  • Choosing a hypervisor: Supported features
  • Network setup
  • Storage setup
  • Best practices

The steps for the installation are as follows (you can refer to the Apache CloudStack documentation for detailed steps). Make sure you have the required hardware ready as discussed above.

Installing the management server (choose single- or multi-node): The procedure for installing the management server is:

  • Prepare the operating system
  • In the case of XenServer only, download and install vhd-util
  • Install the first management server
  • Install and configure the MySQL database
  • Prepare NFS shares
  • Prepare and start additional management servers (optional)
  • Prepare the system VM template

Configuring your cloud

After the management server is installed and running, you can add the compute resources for it to manage. For an overview of how a CloudStack cloud infrastructure is organised, see ‘Cloud Infrastructure Overview’ in the Apache CloudStack documentation.

To provision the cloud infrastructure, or to scale it up at any time, follow the procedures given below:

1. Define regions (optional)

2. Add a zone to the region

3. Add more pods to the zone (optional)

4. Add more clusters to the pod (optional)

5. Add more hosts to the cluster (optional)

6. Add primary storage to the cluster

7. Add secondary storage to the zone

8. Initialise and test the new cloud

When you have finished these steps, you will have a deployment with the basic structure, as shown in Figure 4.

For all the above steps, detailed instructions are available in the Apache CloudStack documentation.

Figure 4: Conceptual view of a basic deployment

Initialising and testing

After everything is configured, CloudStack will perform its initialisation. This can take 30 minutes or more, depending on the speed of your network. When the initialisation has been completed successfully, the administrator’s dashboard should be displayed in the CloudStack UI.

1. Verify that the system is ready. In the left navigation bar, select Templates. Click on the CentOS 5.5 (64-bit) no GUI (KVM) template. Check to be sure that the status is ‘Download Complete’. Do not proceed to the next step until this message is displayed.

2. Go to the Instances tab, and filter on the basis of My Instances.

3. Click Add Instance and follow the steps in the wizard.

4. Choose the zone you just added.

5. In the template selection, choose the template to use in the VM. If this is a fresh installation, it is likely that only the provided CentOS template is available.

6. Select a service offering. Be sure that the hardware you have allows the starting of the selected service offering.

7. In data disk offering, if desired, add another data disk. This is a second volume that will be available to but not mounted in the guest. For example, in Linux on XenServer you will see /dev/xvdb in the guest after rebooting the VM. A reboot is not required if you have a PV-enabled OS kernel in use.

8. In the default network, choose the primary network for the guest. In a trial installation, you would have only one option here.

9. Optionally, give your VM a name and a group. Use any descriptive text you would like to.

10. Click on Launch VM. Your VM will be created and started. It might take some time to download the template and complete the VM startup. You can watch the VM’s progress in the Instances screen.

To use the VM, click the View Console button.

If you decide to increase the size of your deployment, you can add more hosts, primary storage, zones, pods and clusters. You may also want to review the sections on additional configuration parameters, hypervisor setup, network setup and storage setup in the Apache CloudStack documentation.

CloudStack installation from the GIT repo (for developers): See the section ‘CloudStack Installation from the GIT repo for Developers’ in the Apache CloudStack documentation to explore these steps for developers.

The CloudStack API

The CloudStack API is a query based API using HTTP, which returns results in XML or JSON. It is used to implement the default Web UI. This API is not a standard like OGF OCCI or DMTF CIMI but is easy to learn. Mapping exists between the AWS API and the CloudStack API as will be seen in the next section. Recently, a Google Compute Engine interface was also developed, which maps the GCE REST API to the CloudStack API described here.

The CloudStack query API can be used via HTTP GET requests made against your cloud endpoint (e.g., http://localhost:8080/client/api). The API name is passed using the command key, and the various parameters for this API call are passed as key-value pairs. The request is signed using the access key and secret key of the user making the call. Some calls are synchronous while some are asynchronous. Asynchronous calls return a job ID; the status and result of a job can be queried with the queryAsyncJobResult call. Let’s get started and look at an example of calling the listUsers API in Python.

First, you will need to generate keys to make requests. In the dashboard, go to Accounts, select the appropriate account and then click on Show Users. Select the intended users and generate keys using the Generate Keys icon. You will see an APIKey and Secret Key field being generated. The keys will be in the following form:

API Key : XzAz0uC0t888gOzPs3HchY72qwDc7pUPIO8LxC-VkIHo4C3fvbEBY_Ccj8fo3mBapN5qRDg_0_EbGdbxi8oy1A

Secret Key: zmBOXAXPlfb-LIygOxUVblAbz7E47eukDS_0JYUxP3JAmknOYo56T0R-AcM7rK7SMyo11Y6XW22gyuXzOdiybQ

Open a Python shell and import the basic modules necessary to make the request. Do note that this request could be made in many different ways—this is just a very basic example. The urllib* modules are used to make the HTTP request and do URL encoding. The hashlib module gives us the sha1 hash function. It is used to generate the hmac (keyed hashing for message authentication) using the secret key. The result is encoded using the base64 module.

$ python
Python 2.7.3 (default, Nov 17 2012, 19:54:34)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> import urllib
>>> import hashlib
>>> import hmac
>>> import base64

Define the endpoint of the Cloud, the command that you want to execute, the type of the response (i.e., XML or JSON) and the keys of the user. Note that we do not put the secret key in our request dictionary because it is only used to compute the hmac.

>>> baseurl='http://localhost:8080/client/api?'
>>> request={}
>>> request['command']='listUsers'
>>> request['response']='json'
>>> request['apikey']='plgWJfZK4gyS3mOMTVmjUVg-X-jlWlnfaUJ9GAbBbf9EdM-kAYMmAiLqzzq1ElZLYq_u38zCm0bewzGUdP66mg'
>>> secretkey='VDaACYb0LV9eNjTetIOElcVQkvJck_J_QljX_FcHRj87ZKiy0z0ty0ZsYBkoXkY9b7eq1EhwJaw7FF3akA3KBQ'

Build the base request string, which is the combination of all the key/value pairs of the request, URL encoded and joined with ampersands.

>>> request_str='&'.join(['='.join([k,urllib.quote_plus(request[k])]) for k in request.keys()])
>>> request_str
'apikey=plgWJfZK4gyS3mOMTVmjUVg-X-jlWlnfaUJ9GAbBbf9EdM-kAYMmAiLqzzq1ElZLYq_u38zCm0bewzGUdP66mg&command=listUsers&response=json'

Compute the signature with HMAC, then apply Base64 encoding and URL encoding. The string used for the signature is similar to the base request string shown above, but the keys and values are lower-cased, any space is encoded as %20 rather than +, and the pairs are joined in sorted key order.

>>> sig_str='&'.join(['='.join([k.lower(),urllib.quote_plus(request[k]).lower().replace('+','%20')]) for k in sorted(request.iterkeys())])
>>> sig_str
'apikey=plgwjfzk4gys3momtvmjuvg-x-jlwlnfauj9gabbbf9edm-kaymmailqzzq1elzlyq_u38zcm0bewzgudp66mg&command=listusers&response=json'
>>> sig=hmac.new(secretkey,sig_str,hashlib.sha1).digest()
>>> sig
'M:]\x0e\xaf\xfb\x8f\xf2y\xf1p\x91\x1e\x89\x8a\xa1\x05\xc4A\xdb'
>>> sig=base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest())
>>> sig
'TTpdDq/7j/J58XCRHomKoQXEQds=\n'
>>> sig=base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest()).strip()
>>> sig
'TTpdDq/7j/J58XCRHomKoQXEQds='
>>> sig=urllib.quote_plus(base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest()).strip())

Finally, build the entire string by joining the base URL, the request string and the signature. Then do an HTTP GET:

>>> req=baseurl+request_str+'&signature='+sig
>>> req
'http://localhost:8080/client/api?apikey=plgWJfZK4gyS3mOMTVmjUVg-X-jlWlnfaUJ9GAbBbf9EdM-kAYMmAiLqzzq1ElZLYq_u38zCm0bewzGUdP66mg&command=listUsers&response=json&signature=TTpdDq%2F7j%2FJ58XCRHomKoQXEQds%3D'
>>> res=urllib2.urlopen(req)
>>> res.read()

{
  "listusersresponse": {
    "count": 1,
    "user": [
      {
        "id": "7ed6d5da-93b2-4545-a502-23d20b48ef2a",
        "username": "admin",
        "firstname": "admin",
        "lastname": "cloud",
        "created": "2012-07-05T12:18:27-0700",
        "state": "enabled",
        "account": "admin",
        "accounttype": 1,
        "domainid": "8a111e58-e155-4482-93ce-84efff3c7c77",
        "domain": "ROOT",
        "apikey": "plgWJfZK4gyS3mOMTVmjUVg-X-jlWlnfaUJ9GAbBbf9EdM-kAYMmAiLqzzq1ElZLYq_u38zCm0bewzGUdP66mg",
        "secretkey": "VDaACYb0LV9eNjTetIOElcVQkvJck_J_QljX_FcHRj87ZKiy0z0ty0ZsYBkoXkY9b7eq1EhwJaw7FF3akA3KBQ",
        "accountid": "7548ac03-af1d-4c1c-9064-2f3e2c0eda0d"
      }
    ]
  }
}
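
The interactive steps above can be collected into a pair of small functions. The sketch below is written for Python 3 and its standard library (the session above uses Python 2), and the names `sign_request` and `build_url` are invented for this example. It follows the same recipe: sort the parameters by key, URL-encode the values with spaces as %20, lower-case the string, compute an HMAC-SHA1 with the secret key, and Base64-encode the digest.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return the Base64-encoded HMAC-SHA1 signature for the given parameters."""
    # quote(..., safe="") encodes spaces as %20 and "/" as %2F.
    canonical = "&".join(
        f"{key}={quote(str(value), safe='')}" for key, value in sorted(params.items())
    ).lower()
    digest = hmac.new(
        secret_key.encode("utf-8"), canonical.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_url(base_url: str, params: dict, secret_key: str) -> str:
    """Return base_url followed by the encoded parameters and the signature."""
    query = "&".join(
        f"{key}={quote(str(value), safe='')}" for key, value in sorted(params.items())
    )
    # The Base64 signature may contain "/", "+" and "=", so it is URL-encoded too.
    signature = quote(sign_request(params, secret_key), safe="")
    return f"{base_url}{query}&signature={signature}"


# Placeholder credentials, for illustration only.
url = build_url(
    "http://localhost:8080/client/api?",
    {"command": "listUsers", "response": "json", "apikey": "my-api-key"},
    "my-secret-key",
)
print(url)
```

The secret key is passed separately and never appears in the URL; it is used only as the HMAC key, just as in the session above.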

All the clients you find on GitHub implement this signature technique, so you should not have to do it manually. Now that you have explored the API through the UI and you understand how to make low level calls, pick your favourite client or use CloudMonkey. This is a sub-project of Apache CloudStack and gives operators/developers the ability to use any of the API methods.
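
For the asynchronous calls mentioned earlier, a client typically polls queryAsyncJobResult until the job leaves the pending state. The sketch below assumes the conventions described in the CloudStack API documentation: a JSON reply keyed by `queryasyncjobresultresponse`, with `jobstatus` 0 for pending, 1 for success and 2 for failure. The `call_api` argument is a hypothetical callable that signs and sends a request and returns the decoded JSON; it is not a CloudStack or CloudMonkey function.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    call_api: Callable[[Dict[str, str]], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll queryAsyncJobResult until the job succeeds, fails or times out."""
    deadline = time.monotonic() + timeout
    while True:
        reply = call_api(
            {"command": "queryAsyncJobResult", "jobid": job_id, "response": "json"}
        )
        job = reply["queryasyncjobresultresponse"]
        status = job["jobstatus"]
        if status == 1:  # assumed meaning: job finished successfully
            return job.get("jobresult", {})
        if status == 2:  # assumed meaning: job failed
            raise RuntimeError(f"job {job_id} failed: {job.get('jobresult')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout} seconds")
        time.sleep(interval)


# Stand-in for a real signed HTTP call, so that the sketch runs on its own.
fake_replies = iter(
    [
        {"queryasyncjobresultresponse": {"jobstatus": 0}},
        {"queryasyncjobresultresponse": {"jobstatus": 1, "jobresult": {"success": True}}},
    ]
)
print(wait_for_job(lambda params: next(fake_replies), "example-job-id", interval=0))
```

Passing the transport in as a callable keeps the polling logic independent of how requests are signed and sent, so it can be exercised without a running management server.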

Testing the AWS API interface: While the native CloudStack API is not a standard, CloudStack provides an AWS EC2 compatible interface. A great advantage of this is that existing tools written with EC2 libraries can be reused against a CloudStack based cloud. In the installation section, we described how to run this interface by installing packages. In this section, we find out how to compile the interface with Maven and test it with the Python Boto module.

Using a running management server (with DevCloud for instance), start the AWS API interface in a separate shell with the following command:

mvn -Pawsapi -pl :cloud-awsapi jetty:run

Log into the CloudStack UI http://localhost:8080/client, go to Service Offerings and edit one of the compute offerings to have the name m1.small or any of the other AWS EC2 instance types.

With access and secret keys generated for a user, you should now be able to use the Python Boto module:

import boto
import boto.ec2

accesskey="2IUSA5xylbsPSnBQFoWXKg3RvjHgsufcKhC1SeiCbeEc0obKwUlwJamB_gFmMJkFHYHTIafpUx0pHcfLvt-dzw"
secretkey="oxV5Dhhk5ufNowey7OVHgWxCBVS4deTl9qL0EqMthfPBuy3ScHPo2fifDxw1aXeL5cyH10hnLOKjyKphcXGeDA"

region = boto.ec2.regioninfo.RegionInfo(name="ROOT", endpoint="localhost")
conn = boto.connect_ec2(aws_access_key_id=accesskey, aws_secret_access_key=secretkey, is_secure=False, region=region, port=7080, path="/awsapi", api_version="2012-08-15")

images=conn.get_all_images()
print images

res = images[0].run(instance_type='m1.small',security_groups=['default'])

Note the new api_version number in the connection object, and also note that there was no need to perform a user registration as in the case of previous CloudStack releases.

Let us thank those at Apache for contributing yet another outstanding product to the open source community, along with the detailed documentation they have provided for CloudStack. All the contents, samples and pictures in this article are extracts from CloudStack online documentation and you may explore more about it at http://docs.cloudstack.apache.org/en/latest/.
