There are various challenges associated with management of services from multiple clouds. These include the variety of security services, operational tools, monitoring services and inconsistency between cloud platforms for core services like compute, storage and networking. However, there are a few open source cloud management tools that help to address these challenges.
Organisations prefer to use multi-cloud management solutions for a number of reasons.
1. During cloud migration, depending on the tech stack of applications and the migration approach (re-factor instead of re-architect), a multi-cloud solution can help as each cloud provider can enable different applications to be migrated faster.
2. During mergers and acquisitions, different organisations have application workloads with different cloud providers, and hence a multi-cloud environment becomes a natural choice for short-term readiness. Later, for uniformity of organisational policies and a unified platform, an organisation can move to a single cloud provider (strategic cloud platform).
3. A multi-cloud setup offers availability across multiple cloud service providers, which reduces risks associated with security, scalability and availability.
4. Multi-cloud management is a great option for disaster recovery (DR) sites, where one cloud provider acts as the primary site and another acts as the secondary one to ensure high availability of applications at any given time.
When large enterprises that span across multiple geographies want to migrate to the cloud, choosing a single cloud service provider is a big challenge due to the geographical diversity and related reasons. In such a scenario, choosing a multi-cloud strategy is the natural choice for enterprise cloud adoption. As per a Gartner survey, about 81 per cent of large enterprises are using more than one cloud service provider and there are multiple reasons for that, such as:
1. Avoiding vendor lock-in
2. Cost optimisation for apps depending on the geographical diversity
3. Usage of native services for different applications provided by different cloud service providers
According to a 2020 report from Gartner, cloud management comprises seven functional areas:
- Infrastructure and service provisioning, and orchestration
- Service enablement and faster adoption of cloud services
- Inventory and classification of cloud services based on complexity and priority
- Identity, security and compliance to handle application/data/platform security, with a compliance framework like SOC, SOX, HIPAA, etc, depending on local compliance requirements
- Cloud migration, backup and disaster recovery to enable faster cloud adoption and transformation to the target cloud environment
- Monitoring and observability to enable support functionalities and cloud management services, including native and third party monitoring integration
- Cloud management and resource optimisation to enable better total cost of ownership (TCO) and effective return on investment in cloud adoption.
Reference architecture of multi-cloud management
When enterprises begin migrating to the cloud, they tend to move slowly so that they can recover quickly in case the migration strategy fails. Hybrid cloud management plays a vital role here. In a hybrid cloud, legacy applications remain on-premises due to data sensitivity or other reasons, while modern applications are moved to the cloud. There are two broad design approaches for this.
In the first approach, legacy applications stay on-premises and communicate with the applications in the cloud using APIs or asynchronous messaging. In the second approach, the on-premises applications and the cloud infrastructure and platform are set up as close to each other as possible in order to have seamless integration between them.
Appliance services from AWS, MS Azure and GCP (Google Cloud Platform), such as AWS Outposts, Azure Stack and Google Anthos, can help with the second approach. They help in tight integration of services in hybrid cloud environments, and also provide a centralised infrastructure management and monitoring facility. They are helpful for easy lift-and-shift of on-premises applications to a public cloud at a later point in time.
Google Anthos is powerful as it supports a hybrid cloud as well as a multi-cloud environment with an agnostic design. Azure Stack enables seamless integration of services between on-premises and public cloud applications, while AWS Outposts helps by quickly provisioning an infrastructure that is set up by the cloud service provider itself.
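The approach in which on-premises applications talk to cloud applications through asynchronous messages can be sketched in a few lines. This is only an illustration: the in-process `queue.Queue` stands in for a real message broker, and the event fields (`order_created`, `order_id`, `amount`) and function names are hypothetical.

```python
import json
import queue


def publish_order(broker: queue.Queue, order_id: int, amount: float) -> None:
    """On-premises side: serialise an event and hand it to the broker."""
    event = {"type": "order_created", "order_id": order_id, "amount": amount}
    broker.put(json.dumps(event))


def drain_orders(broker: queue.Queue) -> list:
    """Cloud side: consume whatever events are waiting, without blocking."""
    received = []
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            break
        received.append(json.loads(raw))
    return received
```

Because the producer only knows about the broker, the legacy application does not need to change when the consuming application moves between clouds.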
Tool/Feature | GCP Native Services |
Visualisation report for cost and utilisation reports | Cost management provides visualisation reports on utilisation, filtered by labels. Data Studio is used to build custom dashboards. |
Integration with third party and native tools | Billing API and data collection APIs (Stackdriver, Resource Manager API) are used for integration with native and third party cost management tools. |
Predictive analysis for cost advisory | Intelligent recommendations optimise cost and usage based on patterns of resource usage. Resource hierarchy is used for fine grained resource management for cost allocation. |
Infra optimisation | Quota limits are used for proactive control of the spend rate on resources including apps and infrastructure. |
Alert on usage patterns | Alerts with automated budget actions are used to throttle resources and cap costs. |
Rule driven event handling | Alerts can be sent through events (SMS, email) using programmatic budget notification. |
DevOps pipeline (native) integration | Can be integrated with Cloud Build, Cloud Pub/Sub (event/alerts) and cloud functions to invoke cost reporting, billing and notification. |
ServiceNow integration for provisioning activities (e.g., Datalink) | Cloud Connector app and Stackdriver API can be used to integrate with ServiceNow. |
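The programmatic budget notification mentioned in the table can be handled by a small function that decodes the message and decides what to do. The sketch below assumes the notification arrives as base64-encoded JSON carrying `costAmount` and `budgetAmount` fields. These names are modelled on the GCP budget notification format but should be checked against the current documentation. The thresholds and the returned action names are illustrative.

```python
import base64
import json


def parse_budget_message(data_b64: str) -> dict:
    """Decode the base64-encoded JSON body of a budget notification."""
    return json.loads(base64.b64decode(data_b64).decode("utf-8"))


def budget_action(notification: dict, warn_ratio: float = 0.5,
                  cap_ratio: float = 1.0) -> str:
    """Map the spend-to-budget ratio to 'ok', 'notify' or 'throttle'."""
    cost = float(notification["costAmount"])      # assumed field name
    budget = float(notification["budgetAmount"])  # assumed field name
    if budget <= 0:
        raise ValueError("budgetAmount must be positive")
    ratio = cost / budget
    if ratio >= cap_ratio:
        return "throttle"  # e.g. scale down or disable resources
    if ratio >= warn_ratio:
        return "notify"    # e.g. send an email or SMS
    return "ok"
```

In practice, a function like this would be triggered by the messaging service that delivers the notification, and the `throttle` branch would call whatever automation the team has agreed on.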
Agnostic cloud management solutions
Though there are many multi-cloud management tools available in the market like Apptio Cloudability, CloudHealth, Matilda, Densify, and CloudCheckr, to name a few, many important features are covered by the native services themselves. For example, GCP has cloud management services for reporting, dashboard, API integration, datalink, budget and alerts, and advisory services integrated together.
If you are looking for an agnostic cost management service for multi-cloud scenarios, you can go for third party tools like Apptio Cloudability in order to have a uniform management policy, strategy and services across different cloud environments.
On the other hand, when you are looking for lower capital expenditure (CAPEX) to avoid licensing costs for third party cost management tools and want to get quickly into cost management, then you can choose a GCP native management tool or any native management service.
Cloud service expense management (CSEM), also called FinOps, is a new-age requirement from cloud service platforms that helps in management of cloud service costs. Typically, there is transparency with respect to cloud service usage patterns. Various manual and automated facilities are now offered by cloud service providers to manage the costs associated with their services, as well as enable cost management through budgets, alerts and platform service provisioning.
Cloud service platforms like MS Azure, AWS and GCP provide native cloud management facilities like budgets, alerts, cost advisories, and workflow management.
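The uniform view that an agnostic cost management tool provides can be illustrated by normalising billing rows from different providers into one schema before aggregating them. This is a minimal sketch: the per-provider field names in `FIELD_MAPS` are hypothetical placeholders, not the real export formats of any provider.

```python
from collections import defaultdict

# Hypothetical per-provider column names; real billing exports differ.
FIELD_MAPS = {
    "aws": {"service": "product", "cost": "unblended_cost", "team": "tag_team"},
    "azure": {"service": "meter_category", "cost": "cost_in_billing_currency",
              "team": "tag_team"},
    "gcp": {"service": "service_description", "cost": "cost", "team": "label_team"},
}


def normalise(provider: str, row: dict) -> dict:
    """Convert one provider-specific billing row into a common record."""
    fields = FIELD_MAPS[provider]
    return {
        "provider": provider,
        "service": row[fields["service"]],
        "team": row.get(fields["team"], "untagged"),
        "cost": float(row[fields["cost"]]),
    }


def cost_by_team(records: list) -> dict:
    """Sum normalised costs per team across all providers."""
    totals = defaultdict(float)
    for record in records:
        totals[record["team"]] += record["cost"]
    return dict(totals)
```

Once every row has the same shape, budgets, alerts and chargeback reports can be written once instead of once per cloud.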
Native cloud management vs cloud agnostic solutions
Gartner’s guidance framework for selecting cloud management platforms and tools helps with cross-platform solutions for a multi-cloud or hybrid cloud approach, as well as with a platform-specific approach for native cloud adoption.
Infrastructure as Code (IaC), as well as cloud automation for provisioning and virtualisation needs, is very helpful in accelerating the building and deployment of cloud services. Preparing provisioning scripts for each cloud is inefficient, and IaC helps to automate these across different cloud environments with ease.
Though cloud native facilities like CloudFormation in AWS, ARM templates in Azure and Cloud Deployment Manager (CDM) templates in GCP enable IaC, they are mostly native to the cloud platform and don’t have the agility and flexibility for multi-cloud management. On the other hand, services like Google Anthos, HashiCorp Terraform, and Red Hat Ansible can help with multi-cloud management, developing zero-touch deployment scripts and provisioning templates for setting up cloud towers.
Google Anthos, HashiCorp Terraform and Red Hat CloudForms are leading this space for creating, changing and provisioning infrastructure for any cloud platform including service management, security and compliance, and optimisation. Red Hat CloudForms has five integral components, i.e., CloudForms management engine appliance, CloudForms management engine server, virtual management database, CloudForms management engine console and smart proxy.
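The idea behind declarative, cloud-agnostic provisioning can be sketched as a comparison between the desired state and the current state that yields a list of actions. This is a conceptual illustration only, not how Terraform or any other tool named here is implemented. The resource names and the `cloud` and `size` keys are hypothetical.

```python
def plan(desired: dict, current: dict) -> list:
    """Compare desired and current resources and list the actions needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


desired = {
    "web-vm": {"cloud": "aws", "size": "small"},
    "db-vm": {"cloud": "gcp", "size": "large"},
}
current = {
    "web-vm": {"cloud": "aws", "size": "medium"},
    "old-vm": {"cloud": "azure", "size": "small"},
}
print(plan(desired, current))
```

The same specification format describes resources on any cloud, and only the step that executes each action needs provider-specific code.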
Challenges in multi-cloud management
A multi-cloud approach does have complexity and compatibility issues, as well as challenges with respect to security and monitoring.
1. High complexity: One of the hidden costs of multi-cloud computing is high complexity. Every cloud service provider has its own set of conventions, commands and ways of performing things. As an example, AWS has three load balancers while Google Cloud has six, and each performs different aspects of load balancing. Some work at Layer 7 and some at Layer 4, and deciding how to effectively balance load and keep applications secure is one of the complex issues to solve. Doing this across multiple clouds also increases complexity.
2. Varied service capabilities across clouds: Different clouds have different capabilities even with respect to offerings like computing, storage and networking. For example, in theory, Kubernetes should be managed in the same manner across all clouds. In practice, each major cloud provider has key differences in how it manages Kubernetes, the monitoring and security tools for these managed offerings, and performance levels. So even when a service is, on the face of it, exactly the same and built on the same core technology, there may be key differences underneath that can have a real impact on performance, resilience, and how you architect an application.
3. Cost management: Cost is the most difficult and time-consuming aspect of multi-cloud services to understand. Cloud providers charge end users on the basis of computing, storage, networking, region, data transmission and other parameters. Some cloud providers charge not only for the size of the load balancer but also for the type of instance (spot or reserved), how many requests an instance will handle per server per second, whether the load balancer will move data from one region to another, and how many rules a load balancer can apply to move the data. Charges for all of these aspects vary from cloud to cloud, and definitions of services do not even match up perfectly. For example, in GCP, there are different tiers for networking, which do not exist in Azure. Managing and forecasting costs is challenging even in a single cloud, but there it is often made more manageable by the provider’s management and projection tools.
4. Prolonged time on application changes: Deploying new applications on multiple clouds can slow down the delivery of features and functionality, as more time is needed to test the changes in each operational cloud environment. Sometimes, containerised applications perform differently on different clouds, and for mission-critical applications, we need more time and budget to test the performance.
5. Security risks and attacks: In multi-cloud environments, security teams need to monitor three times more services in the cloud, which sometimes leads to mistakes and opens gates for hackers to attack. Security teams will also need to configure and test 2x or 3x more security appliances and tools. This creates a lot more stress on already overloaded security teams, and increases the likelihood of human error resulting from a misconfiguration or a missed update. DevOps teams dealing with multiple clouds may get frustrated with the complexity, and create workarounds that increase the attack surface area and add risk. In addition, data moving between clouds means more exposure and an increased attack surface.
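The cost management challenge in point 3 can be made concrete with a small estimator. The sketch below assumes just three billing dimensions (compute, storage and data egress) and uses made-up unit prices. Real price sheets have many more dimensions and do not line up this neatly across providers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    """Hypothetical unit prices; these are not real provider rates."""
    compute_per_hour: float
    storage_per_gb_month: float
    egress_per_gb: float


@dataclass(frozen=True)
class Workload:
    compute_hours: float
    storage_gb: float
    egress_gb: float


def monthly_cost(workload: Workload, prices: PriceSheet) -> float:
    """Estimate the monthly bill for a workload under one price sheet."""
    total = (workload.compute_hours * prices.compute_per_hour
             + workload.storage_gb * prices.storage_per_gb_month
             + workload.egress_gb * prices.egress_per_gb)
    return round(total, 2)


def cheapest(workload: Workload, sheets: dict) -> str:
    """Return the name of the price sheet with the lowest estimate."""
    return min(sheets, key=lambda name: monthly_cost(workload, sheets[name]))


SHEETS = {
    "cloud_a": PriceSheet(0.10, 0.02, 0.09),
    "cloud_b": PriceSheet(0.12, 0.01, 0.05),
}
```

Even in this toy model, the cheaper option depends on the shape of the workload: a compute-only job favours `cloud_a`, while a storage- and egress-heavy job favours `cloud_b`.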
With a multi-cloud strategy, you can benefit from the best that each cloud has to offer. But this comes with its own set of challenges and considerations. That’s why it is desirable to evaluate and identify the right set of management tools, especially for security and compliance.
Open source multi-cloud management tools
The following are some of the leading open source multi-cloud management tools.
Mist
Mist attempts to make multi-cloud management simple and offers a single interface from where you can manage everything. It supports all popular infrastructure technologies including public clouds, private clouds, hypervisors, containers and bare metal servers. It provides a unified interface for performing common management tasks like provisioning, orchestration, monitoring, automation and cost analysis. It comes with a RESTful API and CLI so you can easily integrate it into your existing workflows.
Features:
- Instant visibility of all the available resources across clouds, grouped by tags
- Instant reporting/estimation of the current infrastructure costs
- Compare current and past costs, correlate with usage, provide right-sizing recommendations (available only in the Enterprise Edition and Hosted Service, EE/HS)
- Provision new resources on any cloud: machines, volumes, networks, zones, records
- Perform life cycle actions on existing resources: stop, start, reboot, resize, destroy, etc
- Instant audit logging for all actions performed through Mist or detected through continuous polling
- Upload scripts to the library, run them on any machine while enforcing audit logging and centralised control of SSH keys
- SSH command shell on any machine within the browser or through the CLI, enforcing audit logging and centralised control of SSH keys
- Enable monitoring on target machines to display real-time system and custom metrics, and store them for long-term access
- Set rules on metrics or logs that trigger notifications, webhooks, scripts or machine life cycle actions
- Set schedules that trigger scripts or machine life cycle actions
- Set fine grained access control policies per team/tag/resource/action (EE/HS only)
- Set governance constraints; e.g., quotas on cost per user/team, required expiration dates (EE/HS only)
- Upload infrastructure templates that may describe complex deployments and workflows (EE/HS only)
- Deploy and scale Kubernetes clusters on any supported cloud (EE/HS only)
Official website: https://mist.io/
Latest version: 4.5.5
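The rule feature listed above, where a metric crossing a threshold triggers a notification or a machine life cycle action, follows a common pattern that can be sketched generically. The code below does not use Mist's API. The `Rule` class, the metric names and the action strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    metric: str                     # name of the metric to watch
    threshold: float                # fire when the value exceeds this
    action: Callable[[str], str]    # receives the machine name


def evaluate(rules: list, machine: str, metrics: dict) -> list:
    """Run every rule whose metric is present and above its threshold."""
    results = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            results.append(rule.action(machine))
    return results


RULES = [
    Rule("cpu", 90.0, lambda m: f"notify:{m}"),
    Rule("disk", 95.0, lambda m: f"resize:{m}"),
]
```

Calling `evaluate(RULES, "vm-1", {"cpu": 97.0, "disk": 40.0})` fires only the CPU rule, because the disk metric is below its threshold.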
Cloudify
Cloudify is a pure-play, standards based (TOSCA) cloud orchestration platform that supports every major private and public cloud infrastructure offering. With Cloudify, enterprises can use a single, open source cloud orchestration platform across OpenStack, VMware or AWS clouds, with virtualisation approaches such as VMs or containers, and with different automation tool sets like Puppet, Chef or SaltStack. Because it provides an easy-to-use, open source tool for management and orchestration (MANO) of multiple clouds, data centres and availability zones, Cloudify is attractive to telecoms, Internet service providers, and enterprises using hybrid cloud.
Features:
- Enterprise-grade enhanced hybrid cloud support: It supports all major public and private cloud environments including AWS, Azure, GCP, OpenStack, and VMware vSphere and vCloud.
- Support for entire VMware stack: Cloudify is the only open source orchestration platform supporting the entire VMware stack; all VMware plugins are open source and available in the Cloudify Community edition.
- Public shared images for both AWS and OpenStack: Prebaked Cloudify Manager environments are now available for AWS through a shared AMI, and OpenStack through a QCOW image. This enables simple bootstrapping of a full-fledged Cloudify environment in minutes.
- Deployment update: This allows updating of application deployments, enabling application operations engineers and developers to introduce topology changes and include new resources to run TOSCA deployments.
- In-place manager upgrade: The new Cloudify Manager upgrade process provides fully automated in-place upgrades for all manager infrastructure without any downtime to the managed services; in-place upgrade allows easy migration between Cloudify versions and application of patched versions.
Official website: https://cloudify.co/
Latest version: 3.4
ManageIQ
ManageIQ delivers the insight, control, and automation that enterprises need to address the challenges of managing hybrid IT environments. It allows you to understand the current state of your environment, provide self-service for end users, and enforce compliance policies.
Features:
- Insight: Discovery, monitoring, utilisation, performance, reporting, analytics, chargeback, and trending.
- Control: Security, compliance, alerting, policy based resource and configuration management.
- Automate: IT process, task and event, provisioning, workload management and orchestration.
- Integrate: Systems management, tools and processes, event consoles, CMDB, RBA, and Web services.
Official website: https://www.manageiq.org/
Latest version: 4.2.0
OpenNebula
OpenNebula is a cloud computing platform for managing heterogeneous distributed data centre infrastructures. The OpenNebula platform manages a data centre’s virtual infrastructure for private, public and hybrid implementations of Infrastructure as a Service. The two primary uses of the OpenNebula platform are data centre virtualisation and cloud deployments based on the KVM hypervisor, LXD system containers, and AWS Firecracker microVMs. The platform is also capable of offering the cloud infrastructure necessary to operate a cloud on top of existing VMware infrastructure.
OpenNebula orchestrates storage, network, virtualisation, monitoring, and security technologies to deploy multi-tier services (e.g., compute clusters) as virtual machines on distributed infrastructures, combining both data centre resources and remote cloud resources, according to allocation policies.
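Deploying virtual machines "according to allocation policies" can be illustrated with a simple placement function. This is a generic sketch and not OpenNebula's scheduler. It assumes two policies: `packing` fills the busiest host that still has capacity, and `striping` spreads virtual machines across the least busy hosts. The `Host` fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    free_cpu: int
    running_vms: int


def place(hosts: list, cpu_needed: int, policy: str = "packing") -> Optional[str]:
    """Pick a host for a new VM, update its counters, and return its name."""
    candidates = [h for h in hosts if h.free_cpu >= cpu_needed]
    if not candidates:
        return None  # no host has enough capacity
    if policy == "packing":
        chosen = max(candidates, key=lambda h: h.running_vms)
    elif policy == "striping":
        chosen = min(candidates, key=lambda h: h.running_vms)
    else:
        raise ValueError(f"unknown policy: {policy}")
    chosen.free_cpu -= cpu_needed
    chosen.running_vms += 1
    return chosen.name
```

With one on-premises host and one remote cloud host in the list, the same function covers both data centre and cloud resources, which is the combination the paragraph above describes.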
Features:
- Simple, clean, intuitive GUI for users and admins, with different views
- Easy self-provision of containerised and virtualised workflows from a catalogue
- Fine-grained accounting and monitoring
- Build your private marketplace to share and distribute applications within your organisation
- Dynamic creation of clusters as pools of hosts
- Dynamic creation of virtual data centres as fully-isolated virtual environments
- Federation of multiple zones for scalability, isolation or multiple-site support
- Powerful and flexible scheduler — deploy your workload in different locations
- Powerful user, group and role management
- Integration with enterprise and open source user management services
Official website: https://opennebula.io/
Latest version: 6.0