The tremendous potential of cloud isn’t news. It has already transformed the way we store, access and use our data, and it is forming the foundation of a new era of innovation across a huge range of industries and areas of life. For businesses, cloud computing is also bringing significant benefits, allowing enterprises from start-ups to multinationals to reduce both Capex and Opex and to shift the IT department’s emphasis from infrastructure to business enablement.
For many enterprises, however, cloud is not yet a ‘plug ‘n’ play’ infrastructure. Businesses are looking to make applications available in the cloud both for the cost benefits and for the flexibility it gives their employees. The main concern in doing so is guaranteed availability of the app. Some applications, such as web services, scalable NoSQL stacks like MongoDB, and stateless applications, are perfect for cloud: they can be orchestrated, need little maintenance, and can handle failure. Many businesses, however, rely on specific apps that are not so cloud-ready: they cannot scale naturally without strain and cannot recover from failures on their own. As cloud adoption grows, IT departments increasingly find themselves reworking these apps to prepare them for integration into the cloud stack, at great cost in time and money, yet the apps still lack the intelligence to respond in the event of failure.
While monitoring solutions designed to predict failures are available for OpenStack, they often target specific issues and serve only to warn about possible problems, not to respond to them. For the enterprise, this isn’t enough. There can be failure at every level of the stack: infrastructure, guest OS and application. If an app isn’t able to fail over, a failure at any level means an outage for the app and lost time and money for the enterprise. Working with Intel and Mirantis, we’ve looked to create a platform that can be built into OpenStack to eliminate the stress and risk of running apps without native failover functionality.
A major challenge with cloud is centrally monitoring the hardware in use as the load on the system changes. This is vital for harnessing the power of existing hardware, maximizing the efficiency of cloud infrastructure and assuring availability. At this year’s OpenStack Summit, we will present a platform that allows OpenStack users to monitor their cloud stack and automate corrective measures to keep it running at optimum performance, whether their apps are cloud-native or not.
With this solution, we set ourselves the goal of giving enterprises the ability to monitor key metrics and auto-remediate, both reactively and proactively, using predictive failure analysis. To do that, we’re employing Zabbix to pull monitoring data from the compute nodes, setting triggers, and using the Nova scheduler to determine new VM placements. With this, we can see everything from compute node crashes to a server’s thermal footprint and performance degradation, and respond automatically. Enterprises can set rules that automatically cover failure at any level of the stack, from moving loads off overheated servers to completely offloading workloads from crashing nodes before their data becomes unreachable. All of this is designed to work across multiple frameworks and at scale.
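To make the rule-driven flow above a little more concrete, here is a minimal Python sketch of how trigger rules might map host metrics to remediation actions. It is an illustration only, not our platform’s code: the `HostMetrics` fields, the thresholds, the action names and the `pick_target` helper are all assumptions made for this example. In a real deployment the metrics would come from Zabbix and the placement decision would be left to the Nova scheduler; here a simple "least-loaded healthy host" choice stands in for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class HostMetrics:
    """Hypothetical snapshot of one compute node's monitoring data."""
    name: str
    reachable: bool
    cpu_temp_c: float
    cpu_load: float  # fraction of capacity in use, 0.0 to 1.0


# Illustrative thresholds; real values would be tuned per environment.
MAX_TEMP_C = 85.0
MAX_LOAD = 0.90

# Each rule: (rule name, predicate over metrics, action to take).
# Rules are checked in order, so the most severe condition comes first.
Rule = Tuple[str, Callable[[HostMetrics], bool], str]
RULES: List[Rule] = [
    ("node_down", lambda m: not m.reachable, "evacuate"),
    ("overheated", lambda m: m.cpu_temp_c > MAX_TEMP_C, "migrate"),
    ("overloaded", lambda m: m.cpu_load > MAX_LOAD, "migrate"),
]


def evaluate(metrics: HostMetrics) -> Optional[Tuple[str, str]]:
    """Return (rule name, action) for the first rule that fires, else None."""
    for name, predicate, action in RULES:
        if predicate(metrics):
            return name, action
    return None


def pick_target(hosts: List[HostMetrics], source: str) -> Optional[str]:
    """Stand-in for the scheduler: choose the least-loaded healthy host."""
    candidates = [h for h in hosts if h.name != source and evaluate(h) is None]
    if not candidates:
        return None
    return min(candidates, key=lambda h: h.cpu_load).name


def plan_remediation(hosts: List[HostMetrics]) -> List[Dict[str, Optional[str]]]:
    """Build a list of remediation steps for every host that trips a rule."""
    plan = []
    for host in hosts:
        hit = evaluate(host)
        if hit is None:
            continue
        rule, action = hit
        plan.append({
            "host": host.name,
            "rule": rule,
            "action": action,
            "target": pick_target(hosts, host.name),
        })
    return plan


if __name__ == "__main__":
    fleet = [
        HostMetrics("compute-1", True, 60.0, 0.30),
        HostMetrics("compute-2", True, 92.0, 0.50),  # running hot
        HostMetrics("compute-3", False, 0.0, 0.0),   # crashed
        HostMetrics("compute-4", True, 55.0, 0.10),
    ]
    for step in plan_remediation(fleet):
        print(step)
```

Ordering the rules by severity means a crashed node is always treated as an evacuation rather than a routine migration, and excluding unhealthy hosts from the candidate list keeps workloads from being moved onto a node that is itself about to trip a rule.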
With this functionality built into the stack, we’re hoping we can help enterprises drive down the cost and time required to put new, non-native apps into the cloud, grow confidence in the reliability of cloud apps, and spur even greater adoption of OpenStack as a flexible, low-cost approach to the cloud stack.
Prateek and Pramod will be presenting with Intel and Mirantis at the OpenStack Summit in Austin on 25 April. Catch their talk – Intelligent Workload HA in OpenStack – at 12:05pm, Austin Convention Center – Level 4 – Ballroom D.