Cloud monitoring tools collect data and reveal patterns that would otherwise be difficult to spot in a dynamic infrastructure environment in which services are delivered by Cloud Service Providers. Read on to find out more…
Organisations are keen to leverage cloud computing to improve agility and scalability. However, those that have adopted cloud computing face a challenge: decreased visibility into the performance and governance of the services being delivered to their end users.
Cloud Service Providers (CSPs) do provide dashboards to track the status of their services; in addition, they provide alert and notification mechanisms to recognise and report service outages in a timely manner. Customers also need to know the status of their own applications on the cloud, hence the need for continuous monitoring. Cloud monitoring refers to monitoring the performance of physical or virtual servers, storage systems, networks, and the applications running on them. Cloud monitoring tools can collect data and illustrate patterns that might be difficult to spot otherwise. To maintain high availability of applications, these tools can be used to collect metrics, gain insights and take corrective measures (if required) to help maintain business continuity.
Monitored metrics
Cloud monitoring is essential for controlling and managing cloud resources and software infrastructure; it provides key performance indicators (KPIs) for infrastructure, platforms, and applications.
Given below is a list of parameters that can be monitored by the various open source tools, though not necessarily by any single one of them.
- Application and cloud server response time
- Number of concurrent users
- Application and cloud server availability
- Network latency
- Cloud service utilisation
- Overall bits/sec and requests/sec served by all of the processes
- Response time for specific transactions
- Memory usage
- Disk usage
- CPU utilisation
- System load
- Network interface activity
- Database activities
- Swap space
- Performance of attached disk
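A few of the host-level parameters listed above can be sampled with nothing more than the Python standard library. The sketch below is only an illustration and does not reflect how any of the tools discussed in this article are implemented; the function name `sample_host_metrics` and the dictionary keys are invented for this example, and `os.getloadavg` is available only on Unix-like systems.

```python
import os
import shutil
import time


def sample_host_metrics(path="/"):
    """Return a small snapshot of host metrics (hypothetical helper)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    try:
        load_1min = os.getloadavg()[0]  # system load, Unix-like systems only
    except (AttributeError, OSError):
        load_1min = None  # not available on this platform
    if usage.total:
        disk_used_percent = round(100.0 * usage.used / usage.total, 1)
    else:
        disk_used_percent = 0.0
    return {
        "timestamp": time.time(),
        "cpu_count": os.cpu_count(),
        "load_1min": load_1min,
        "disk_total_bytes": usage.total,
        "disk_used_percent": disk_used_percent,
    }


if __name__ == "__main__":
    for name, value in sample_host_metrics().items():
        print(f"{name}: {value}")
```

A real monitoring agent would take such a snapshot at regular intervals and ship it to a central collector; the tools described later in this article handle that part.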
Why cloud monitoring is essential
Cloud monitoring is necessary for a number of reasons. The more important ones are listed below.
Security: Concerns about security are a major roadblock to cloud adoption for business-critical applications and in industries where data security is vital. Continuous monitoring helps to detect suspicious activity and policy violations early.
SLA management: Because the cloud is dynamic, quality of service (QoS) attributes must be monitored continuously to enforce service level agreements (SLAs), and the multifaceted nature of the cloud landscape demands a refined means of managing them.
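As a small worked example of the arithmetic behind SLA monitoring, the following sketch computes availability from a series of probe results and compares it with a target. The function names, the probe data and the 99.9 per cent target are all hypothetical and chosen only for illustration.

```python
def availability_percent(check_results):
    """Percentage of successful checks; check_results is a list of booleans."""
    if not check_results:
        raise ValueError("no check results supplied")
    return 100.0 * sum(check_results) / len(check_results)


def meets_sla(check_results, target_percent):
    """True if measured availability reaches the agreed target."""
    return availability_percent(check_results) >= target_percent


def allowed_downtime_minutes(target_percent, days=30):
    """Downtime budget implied by an availability target over a period."""
    return days * 24 * 60 * (1 - target_percent / 100.0)


# Hypothetical data: 1,000 probes, 2 of which failed.
results = [True] * 998 + [False] * 2
print(availability_percent(results))    # 99.8
print(meets_sla(results, 99.9))         # False
print(allowed_downtime_minutes(99.9))   # roughly 43.2 minutes in 30 days
```

The last line shows why continuous measurement matters: a 99.9 per cent target leaves well under an hour of downtime per month.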
Capacity planning: Cloud monitoring tools help to identify which resources you are using in the cloud. Performance and capacity requirements can then be assessed and resources scaled up or down accordingly to achieve the required performance levels and satisfy customers.
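The scale-up or scale-down decision described above can be sketched as a simple threshold rule on monitored CPU utilisation. This is a minimal illustration, not a recommendation: the function name `scaling_decision` and the 80 and 30 per cent thresholds are assumptions made for this example.

```python
def scaling_decision(cpu_samples, scale_up_at=80.0, scale_down_at=30.0):
    """Return 'scale_up', 'scale_down' or 'hold' from CPU utilisation samples (%)."""
    if not cpu_samples:
        return "hold"  # no data, so change nothing
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_up_at:
        return "scale_up"
    if average < scale_down_at:
        return "scale_down"
    return "hold"


print(scaling_decision([90, 85, 95]))  # scale_up
print(scaling_decision([10, 20, 15]))  # scale_down
print(scaling_decision([50, 60]))      # hold
```

Averaging over a window of samples, rather than reacting to a single reading, avoids scaling on short spikes.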
Resource management: Resource management is crucial in cloud computing. It maximises resource utilisation and minimises the total cost of both the cloud infrastructure and application hosting.
Troubleshooting: Root cause analysis is a challenging area in cloud computing, given the number of components involved and the complexity of the architecture. Cloud monitoring tools can help to diagnose and rectify an issue.
Performance management: Cloud performance management is the monitoring of the resources that affect application performance in cloud environments. It includes supervision of applications to maintain optimal performance and availability.
Billing: Monitoring is a basic requirement for providing measured services in the cloud environment, for both CSPs and cloud consumers.
Open source tools for cloud monitoring
Zenoss
Zenoss is an open source platform released under the GNU General Public License (GPL) version 2. It provides an easy-to-use Web UI to monitor performance, events, configuration, and inventory. Zenoss is well suited to unified monitoring because it is both cloud agnostic and open source. Its plug-ins, called ZenPacks, add monitoring support for hypervisors (ESX, KVM, Xen and Hyper-V), private cloud platforms (CloudStack, OpenStack and vCloud/vSphere), and the public cloud (AWS).
Hyperic
SpringSource, a division of VMware, acquired Hyperic along with Cloud Foundry, RabbitMQ, and GemStone. Hyperic automatically discovers, manages and monitors the components of virtualised applications, as well as IT and network resources, on private clouds (VMware) and public clouds (Amazon Web Services). It is optimised for virtual environments and integrates with vCenter and vSphere.
Hyperic provides open source software for monitoring IT resources and networks. It auto-discovers system resources such as operating systems, hardware, databases, middleware, applications, virtualisation layers and services. It monitors network services (SNMP, SMTP, HTTP, and ICMP) and host resources (processor load, disk usage and system logs), supports remote monitoring, identifies problems, performs root cause analysis, and offers security through access control.
Ganglia
Ganglia is a monitoring tool for high performance computing systems such as private clouds, public clouds, clusters, and grids. The Ganglia system contains: 1) two daemons, gmond and gmetad, 2) a PHP-based Web front-end, and 3) a few other small programs. Gmond, a multi-threaded daemon, runs on each node to monitor changes in the host state, announce relevant changes, listen to the state of all other Ganglia nodes via a unicast or multicast channel (depending on the installation), and respond to requests. At regular intervals, the Ganglia Meta Daemon (gmetad) polls a collection of data sources, parses the XML, saves all metrics to round-robin databases and exports the aggregated XML. The Ganglia Web front-end is written in PHP; it draws graphs from the data that gmetad stores and presents the collected information, such as CPU utilisation, for the past day, week, month, or year.
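To make the "poll, then parse the XML" step more concrete, here is a minimal sketch that extracts per-host metrics from a Ganglia-style XML document. The sample XML is hand-written and heavily abbreviated for this example; the element names (`GANGLIA_XML`, `CLUSTER`, `HOST`, `METRIC`) and attributes (`NAME`, `VAL`, `UNITS`) are assumed to match what gmond emits, and the host name and values are invented. The function `parse_ganglia_xml` is hypothetical and is not part of Ganglia.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample; real gmond output carries more attributes.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="demo-cluster">
    <HOST NAME="node1.example.com" IP="192.0.2.10">
      <METRIC NAME="cpu_user" VAL="12.5" TYPE="float" UNITS="%"/>
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def parse_ganglia_xml(xml_text):
    """Return {host_name: {metric_name: (value_string, units)}}."""
    root = ET.fromstring(xml_text)
    hosts = {}
    for host in root.iter("HOST"):
        metrics = {}
        for metric in host.iter("METRIC"):
            metrics[metric.get("NAME")] = (metric.get("VAL"), metric.get("UNITS"))
        hosts[host.get("NAME")] = metrics
    return hosts


for host_name, metrics in parse_ganglia_xml(SAMPLE_XML).items():
    for metric_name, (value, units) in metrics.items():
        print(f"{host_name} {metric_name} = {value}{units}")
```

Against a live installation, the XML text would typically be read from gmond's TCP channel (port 8649 in a default configuration, to the best of my knowledge) instead of a string constant.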
Multicast mode is the default in a Ganglia installation; it is the simplest to set up and provides redundancy. Public cloud environments such as Amazon’s AWS EC2 do not support multicast, so unicast mode is the only set-up option available there. Eucalyptus is an open source product for building AWS-compatible private clouds; its open source version does not provide built-in monitoring, but this can be achieved with Ganglia.
The Eucalyptus source package includes scripts that can be used with third-party tools such as Ganglia to enable Eucalyptus-specific monitoring on a pre-defined number of hosts.
Nagios
Nagios is an open source network and infrastructure monitoring tool that provides monitoring and reporting for network services and host resources. Nagios Core, its open source infrastructure monitoring system, enables organisations to identify IT infrastructure problems before they have an adverse effect on critical business processes.
Nagios monitors cloud resources such as compute, storage and network services. It can monitor servers and operating systems in both physical and virtual environments. With Nagios, it is easy to identify issues in the cloud environment, detect network outages and check application availability.
Used for cloud monitoring, Nagios helps to maintain high availability, fault tolerance and data availability. It can monitor public cloud services such as Amazon EC2 (Elastic Compute Cloud) and Amazon S3 (Simple Storage Service). Eucalyptus is a product for building private and hybrid clouds; its open source version does not provide built-in monitoring, but this can be achieved with Nagios by using the scripts in the Extras directory of the Eucalyptus source package, which are meant to work with third-party tools. Nagios allows you to write plug-ins in just about any language and to run them on remote servers.
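Because Nagios plug-ins can be written in almost any language, a check can be as small as the Python sketch below. It follows the usual plug-in convention of one line of status text, optional performance data after a `|` character, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The check itself (disk usage), the function names and the 80/90 per cent thresholds are assumptions made for this example.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # conventional plug-in exit codes


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to (exit_code, status_line)."""
    if used_percent >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif used_percent >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after '|' is performance data in the form label=value;warn;crit
    line = (f"DISK {label} - {used_percent:.1f}% used "
            f"| used={used_percent:.1f}%;{warn:.0f};{crit:.0f}")
    return code, line


def check_disk(path="/", warn=80.0, crit=90.0):
    """Measure disk usage for path and evaluate it against the thresholds."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - file system reports zero size"
    return evaluate(100.0 * usage.used / usage.total, warn, crit)


def main():
    code, line = check_disk()
    print(line)
    return code

# When saved as a plug-in script, finish with:
#     import sys; sys.exit(main())
```

Nagios reads the printed line as the status message and uses the process exit code to decide the state of the service, which is why the script must exit with the value that `main()` returns.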
OpenNMS
OpenNMS is an open source network management platform aimed at organisations that need scalable network management. It supports SNMP natively, as well as common service checks. It is an enterprise-grade monitoring system whose console allows the OpenNMS daemon to communicate network status updates to the front end in real time.
Zabbix
Zabbix is an open source network monitoring tool that automatically collects and parses data from monitored cloud resources, so that administrators can verify availability and see trends in network performance. It also provides distributed monitoring with centralised Web administration, a high level of performance and capacity, JMX monitoring, SLA and ITIL KPI metrics in its reports, and agent-less monitoring.