Sep 27 2009

Virtualization: Solution or Problem?

Category: Cloud Computing | Surendra Reddy @ 6:08 pm | Comments (6)

Is virtualization a solution to the problem or part of the problem?

Christofer Hoff ignited the creative spark with his post “Virtual Machines are the Problem, Not the Solution.”

In my view and experience, virtualization is part of the problem as well as part of the solution. While automation is the key to fulfilling end-to-end service delivery, virtualization is a necessary technology. However, the current architectural style of service composition, delivery, and management is mired in problems, workarounds, and band-aids, which makes SLA-driven end-to-end service delivery just a promise, not the fulfillment. We should stop dishing out nodes to development. We should stop pushing ACLs into switches. We should stop accessing OS primitives from applications. We should stop writing communication patterns into applications. A well-defined abstraction and framework on top of virtualization is essential to make this happen. We can’t ignore change, configuration, and security management. Simply put, we need push-button delivery of services into the Cloud: securely, reliably, and rapidly. As Christofer Hoff (@beaker) suggested on his blog Rational Survivability, JeOS (Just Enough Operating System) is the first step in that direction.

Let me first share my view on why virtualization is part of the problem, and then explain why it is also important for end-to-end service delivery.

Why Is It Part of the Problem?

The “geometric complexity” of systems is a (if not the) major contributor to the costs and stability issues we face in our production environments today. In this context, complexity is introduced by the heterogeneity and variation of “OS” needs per application and underlying component (databases, network, security, and so on). This unmanageable, incomprehensible number of variations of the operating environment makes it hard to understand and optimize our compute infrastructure. We continue to invest our scarce resources to keep this junk alive and fresh all the time. More importantly, 70% of service outages today are caused by configuration or patching errors.
Christofer Hoff (@beaker) puts it very well:

there’s a bloated, parasitic resource-gobbling cancer inside every VM.

I was hopeful and optimistic that this would change the way applications are designed and delivered. Rich application frameworks like J2EE, Spring, Ruby, etc. evolved, but the operating environment evolved into one big, monolithic, generalized OS, making it impossible to track what is needed and what is not. Adding to this brew, a mind-boggling number of open source libraries and tools crept into the OS. Virtualization gave us an opportunity to correct these sins, but under the guise of virtualization we started to commit more of them. Sadly, instead of wiping out the cancerous bits in the operating environment, we packaged all the junk into VMs.

Christofer Hoff (@beaker) raised a very thought-provoking and stimulating question:

If we didn’t have resource-inefficient operating systems, handicapped applications that were incestuously hooked to them, and tons of legacy networking stuff to deal with that unholy affinity, imagine the fun we could have. Imagine how and flexible we could become.

This is very true. We have too much baggage and junk inside our operating environment. That has to change. It is not a question of VMware, Xen, or Parallels, nor of Linux, OpenSolaris, or FreeBSD. We need a paradigm shift in the way we architect and deliver “services”.

Sam Johnston (@samj) pointed out:

I agree completely that the OS is like a cancer that sucks energy(e.g., resources, cycles), needs constant treatment(e.g. patches, updates, upgrades) and poses significant risk of death(e.g. catastrophic failure) to any application it hosts.

Yes, Sam is correct in his characterization of the “malignant OS”.

Why Is Virtualization Important?

@JSchroedl @AndiMann @sureddy Sounds like we’re all in virtual agreement: Not just virtual servers, or even virtual systems, but “Services” end-to-end.

End-to-End Service Delivery: My sense of virtualization is that it provides an abstraction to absorb all low-level variations, exposing a much simpler, homogeneous environment. While this is not sufficient to deliver the automation needed for end-to-end service delivery, it is a necessary technology. Applications and services won’t be exposed to the variations in our operating environment; instead, they will be exposed to a service runtime platform (call it a “container” for lack of a better word) with uniform behavioral characteristics and interfaces. Please note that this “container” is not a VM. It is a much higher-level abstraction that orchestrates hypervisors and operating environments, isolating all the intricacies of virtualization, operations management, and so on. We won’t need to qualify an innumerable combination of hardware, OSes, and software stacks. Instead, the Container layer will be the point of qualification on both sides: each new variation of hardware will be qualified against a single Container layer, and all software will be qualified against that same Container layer (quite literally providing a fast lane-change mechanism across development, test, staging, and production: Continuous Integration and Continuous Deployment). This is a really big deal. It helps us innovate and roll out new services much faster than before. Virtualization plays an important role in fulfilling end-to-end service delivery.
Christofer Hoff (@beaker) pointed out:

VMs have allowed us to take the first steps towards defining, compartmentalizing, and isolating some pretty nasty problems anchored on the sins of our fathers, but they don’t do a damned thing to fix them. VMs have certainly allowed us to(literally) think out-side the box about how we characterize workloads and have enabled us to begin talking about how we make them somewhat mobile, portable, interoperable, easy to describe, inventory, and in some cases more secure. Cool.
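The qualification argument above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming every hardware, OS, and software stack variant would otherwise have to be qualified against every other, and that the Container layer fully hides the OS variants. The function name and the example numbers are hypothetical, chosen only for illustration.

```python
from typing import Tuple


def qualification_runs(hardware_variants: int,
                       os_variants: int,
                       software_stacks: int) -> Tuple[int, int]:
    """Return (direct, via_container) counts of qualification runs.

    direct:        every hardware x OS x software combination is qualified.
    via_container: each hardware variant and each software stack is qualified
                   once against a single Container layer (assumption: the
                   container hides OS variation completely).
    """
    direct = hardware_variants * os_variants * software_stacks
    via_container = hardware_variants + software_stacks
    return direct, via_container


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    direct, via_container = qualification_runs(10, 5, 40)
    print(f"direct: {direct}, via container: {via_container}")
```

With these made-up numbers the direct approach needs 2000 qualification runs while the container approach needs 50. The multiplication turning into an addition is the whole point of having one layer to qualify against.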

Configurations vs. Customizations: Virtualization also absorbs variations in the configurations of physical machines. With virtualization, applications can be written around their own long-lasting “sweet spots” of service configurations that are synthesized and maintained at the container.

Homogeneity: The homogeneity afforded by virtualization extends to the entire software-development lifecycle. By using a uniform, virtualized serving infrastructure throughout the entire process, from development, through QA, all the way to deployment, we can significantly accelerate innovation, eliminate complexity, and reduce or eliminate the incidents that inevitably arise when the dev and QA environments differ from production.

Mobility: The ability to easily move software from one machine to another will greatly relax our SLAs for break-fix (because the software from a broken node can automatically be brought up on a working node), and that in turn reduces the need to physically move machines (because we can move the software instead of moving the machines).
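The break-fix idea above, bringing software from a broken node up on a working one, can be sketched as a small placement function. This is an illustrative sketch only: the node and workload names are hypothetical, and the "least-loaded node first" rule is an assumption, not something the post prescribes.

```python
from typing import Dict, List


def fail_over(placement: Dict[str, List[str]], broken: str) -> Dict[str, List[str]]:
    """Return a new placement with the broken node's workloads moved elsewhere.

    Each workload goes to the healthy node that currently hosts the fewest
    workloads (ties broken by node name). The input is not modified.
    """
    healthy = {node: list(work) for node, work in placement.items() if node != broken}
    if not healthy:
        raise RuntimeError("no healthy node left to take over the workloads")
    for workload in placement.get(broken, []):
        target = min(healthy, key=lambda node: (len(healthy[node]), node))
        healthy[target].append(workload)
    return healthy


if __name__ == "__main__":
    before = {"node-a": ["web-1", "web-2"], "node-b": ["db-1"], "node-c": []}
    print(fail_over(before, broken="node-a"))
```

The broken machine can then be repaired or replaced on a relaxed schedule, because nothing is waiting on it any more.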

Security Forensics: When an app host is to be decommissioned, virtualization presents the opportunity to archive the state of the host for security forensics, and to securely wipe the data from the decommissioned host using a simple, secure file-wipe rather than a specialized, hard-to-verify bootstrap process. In sum, VMMs provide a uniform, reliable, and performant API from which we can drive automation of the entire host life cycle.

Horizontal Scalability: Virtualization drives another very interesting and compelling architectural paradigm shift. In the world of SOA and global serving with unpredictable workloads, we are better off running a service tier (my view of a tier is a load-balanced cluster of elastic nodes) across a larger number of smaller nodes, rather than a smaller number of larger nodes. A large number of smaller nodes provides cost as well as horizontal scalability advantages. In addition, with a larger number of smaller nodes, when a node goes out, the remaining nodes can more easily absorb the resulting spike in workload, and nodes can be added or removed in response to the workload.
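A quick calculation shows why many small nodes absorb a failure more easily. This is a minimal sketch, assuming the load balancer spreads work evenly over the surviving nodes; the function name is hypothetical.

```python
def relative_increase(nodes: int, failed: int = 1) -> float:
    """Fractional extra load each surviving node takes on after `failed` nodes go out.

    Assumes the total workload stays constant and is spread evenly.
    """
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("at least one node must survive")
    return nodes / survivors - 1.0


if __name__ == "__main__":
    for n in (4, 40):
        print(f"{n} nodes, 1 failure: +{relative_increase(n):.1%} load per survivor")
```

With 4 large nodes, losing one pushes roughly a third more load onto each survivor; with 40 small nodes, the same event adds under 3% per survivor.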

Eliminate Complex Parallelism: My experience with symmetric multi-processing (SMP) systems has shown that effectively scaling software beyond a few cores requires specialized design and programming skills to avoid contention and other bottlenecks to parallelism. Simply throwing more cores at our software does not automatically improve performance. It is hard to build the specialized skills needed to develop well-tuned SMP software, and this is indeed becoming a great inhibitor to innovation in building scalable services. By slicing large physical servers into smaller virtual machines, we can deliver more value from our investment.

Cloud and Virtualization

@JSchroedl: PRT @AndiMann: HV = no more than hammers PRT @sureddy: Virt servers don’t matter.Cloud is a promise “Service” is what counts

Cloud is a promise and Service is the fulfillment. The goal of the cloud is to introduce an orders-of-magnitude increase in the amount of automation in the IT environment, and to leverage that automation to introduce an orders-of-magnitude reduction in our time-to-respond. If a machine goes down (I should stop referring to machines and start emphasizing SLAs instead), automatically move its workload to a replacement, within seconds. If load on a service spikes or SLAs deviate from the expected mean, auto-magically increase the capacity of that service, again within seconds.
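The “increase capacity when the SLA deviates” rule above can be sketched as a simple proportional controller. This is only an illustration under stated assumptions: latency is used as the SLA signal, it is assumed to scale roughly linearly with load per node, and the function name, parameters, and limits are all hypothetical.

```python
import math


def desired_nodes(current_nodes: int,
                  observed_latency_ms: float,
                  target_latency_ms: float,
                  min_nodes: int = 1,
                  max_nodes: int = 100) -> int:
    """Suggest a node count that would bring latency back to its target.

    Proportional rule: scale the current node count by observed/target,
    round up, then clamp to [min_nodes, max_nodes].
    """
    if current_nodes < 1 or target_latency_ms <= 0:
        raise ValueError("need at least one node and a positive latency target")
    ratio = observed_latency_ms / target_latency_ms
    wanted = math.ceil(current_nodes * ratio)
    return max(min_nodes, min(max_nodes, wanted))


if __name__ == "__main__":
    # Latency is 50% over target on a 10-node tier: ask for more nodes.
    print(desired_nodes(10, observed_latency_ms=300, target_latency_ms=200))
    # Latency is well under target: the tier can shrink.
    print(desired_nodes(10, observed_latency_ms=100, target_latency_ms=200))
```

A real control loop would also need smoothing and cool-down periods to avoid flapping, but the decision itself is this small once it is expressed in terms of the SLA rather than individual machines.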

Hypervisors (virtualization) are as necessary as hammers, but not sufficient. What is needed is “End-to-End Service Delivery”. There is no doubt in my mind that IT is strategic to the business, and if properly aligned with business goals, IT can indeed create huge value. Automation and end-to-end service delivery are the key drivers for transforming current IT into a more agile and responsive IT.

Physical machines do not provide this level of automation. Neither do the bloated VMs containing the cancerous OS images. What we need is a clean separation of the base operating system (uniform across the cloud), platform-specific components or bundles, and application components and configurations. While it is impossible to rip and replace the existing IT infrastructure, this layered approach would help us move gradually toward a more agile service delivery environment.
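The three-layer separation described above (base OS, platform bundle, application) can be sketched as a tiny composition step. This is a hypothetical model, not an existing packaging tool: the `Layer` type, the package names, and the rule that no package may be owned by two layers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Layer:
    name: str
    packages: Tuple[str, ...]


def compose(*layers: Layer) -> List[str]:
    """Stack layers bottom-up into one ordered package list.

    Raises ValueError if a package appears in more than one layer, so that
    every bit in the final image has exactly one owning layer.
    """
    owner = {}
    for layer in layers:
        for package in layer.packages:
            if package in owner:
                raise ValueError(
                    f"{package!r} is in both {owner[package]!r} and {layer.name!r}"
                )
            owner[package] = layer.name
    return list(owner)  # dicts keep insertion order


if __name__ == "__main__":
    base = Layer("base-os", ("kernel", "libc", "sshd"))         # uniform across the cloud
    platform = Layer("java-platform", ("jvm", "app-server"))    # platform-specific bundle
    app = Layer("orders-service", ("orders.war", "orders.conf"))
    print(compose(base, platform, app))
```

Because each package has a single owner, the base layer can be patched once for the whole cloud without touching platform or application layers, which is where the gradual migration path comes from.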



Sep 25 2009

Government IT and Cloud Computing

Category: Cloud Computing | Surendra Reddy @ 5:12 am | Comments (1)

The government’s plans for, and commitment to, cloud computing seem very promising. I certainly appreciate and congratulate the government leaders on their courageous and bold steps in driving Cloud adoption. In times of crisis, we need innovations like this. The recent announcement by Vivek Kundra about sourcing services from the public cloud is definitely an attractive model, but there are many challenges below the surface.

My sense is that many government workloads need to run in a controlled environment, and their users demand a greater degree of control. There may be many bumps on the way to using public clouds, due to existing assets or contracts, or due to data security and access challenges. Agencies may have to run their applications or services in a “private cloud” for a while. Then the bigger issue is how to peer, monitor, and manage the “private cloud” infrastructure across assets owned by many agencies, possibly including resources from outside the government. Even in private industry, we face many daunting challenges with existing environments: issues with software licenses, already committed support and infrastructure contracts, hardwired applications, and security and access control nightmares across different data centers. I am not sure how easy it will be to transition the legacy, and more complex, government-owned infrastructure to a “private cloud”. Then comes the much bigger challenge: how successful can they be in establishing the governance of a private cloud infrastructure involving several agencies?

Nonetheless, there is a tremendous amount of excitement, interest, and opportunity around Cloud Computing. To keep this wave of innovation in IT transformation moving forward, there are many issues that need to be addressed. It is time for private industry and government to come together and define an interoperable, secure cloud-serving infrastructure.

Padmasree Warrior, Cisco CTO, writes on her blog, “We are already working with a variety of organizations to build what we call private clouds. Private clouds combine a cloud operating system with Cisco’s cloud internetworking technology portfolio to link agency and service provider resources into a single agency-managed cloud environment. This cloud is then available to any device, anywhere via standard TCP/IP networking technologies. Importantly, the cloud also gives IT the ability to reach out and leverage the resources of cloud service providers. Private Clouds fundamentally change the dynamic between IT and the rest of the organization by reducing inefficiencies and increasing the rate of business innovation”.

Collective innovation like this would help us move forward and together we all could create and claim huge value from this opportunity.


Sep 16 2009

Do Standards Stifle Innovation?

Category: Cloud Computing | Surendra Reddy @ 2:21 pm | Comments (1)

I attended the federal cloud computing announcement today at NASA. Federal CIO Vivek Kundra revealed an ambitious plan for overall government adoption of Cloud computing and simplification of the complex IT procurement process. I also attended a private roundtable with Mr. Kundra, along with folks from Microsoft, Google, Amazon, IBM, Cisco, SGI, Sun, Eucalyptus, Verari, Symantec, and Salesforce.

Mr. Kundra posed an interesting challenge to the roundtable participants, asking for the top three things industry would recommend to him for Federal Cloud success. The recommendations were: (1) Use a focused approach in migrating to Cloud-based services. Categorize and prioritize, for example an email cloud, a web cloud, an HPC cloud, and so on. Knock off the low-hanging fruit first and create success stories. (2) Learn from the Internet’s success and define simple standards. (3) Market success stories to encourage new adoption. While most of us in the room agreed on the need for simple standards for interoperable cloud services (Sergey Brin emphasized the word “simple” and I completely agree with him. Simple is beautiful. Let innovators add more flesh), Amazon expressed concern that standards would stifle innovation. EC2 is good for dishing out nodes. It quenched the thirst for getting nodes up and running quickly. But an API is one small piece of the big puzzle. Cloud is not just about virtualization or dishing out nodes on demand. It is a paradigm shift in the way services are designed, developed, and delivered. The Internet’s success came from a very simple protocol suite, TCP/IP. Standards in fact sparked a great deal of innovation (take TCP/IP, HTTP, WebDAV, SMTP) that transformed the Internet. If companies want to stay relevant, they need to be open (truly open) and standards-based. Do you think Amazon can sustain its first-mover advantage forever?



Sep 12 2009

Cloud Computing and Governance

Category: Cloud Computing | Surendra Reddy @ 6:02 pm | Comments (0)

Cloud computing is the major technological paradigm shift after the Internet. While the Internet provided the high-speed interconnects across global digital villages, cloud computing is now transforming the way we serve information, knowledge, connections, and business transactions, without having to worry about building our own data centers. As cloud computing becomes more commonplace in the lives of everyday consumers, government is considering defining new policies to govern the emerging cloud computing realm. These policies might very well help us secure economic and technological dominance in the burgeoning realm of cloud computing, or we could fall behind the rest of the world. If government adopts Cloud computing, it soon becomes strategic infrastructure for the country. That leads to more control over how providers build and operate their Clouds. In my view, defining “just enough governance”, securing the critical infrastructure, and providing trusted access, assertion, audit, peering, and control of the cloud infrastructure are critical to cloud computing success. Do we need an ICANN-like governing body? Do we need an independent clearing house to help us verify and audit identities in the Cloud? What do you think?
