Rise of Virtualization Could Boost Grid Computing
The growing popularity of virtualization may renew interest in Grid computing, as users couple virtual machines with job scheduling software and take advantage of virtual infrastructure to share computing resources on a global basis. One of the key benefits of virtual machines is that they dramatically simplify the process of migrating workloads across a network, a process that is a fundamental underpinning of Grid computing. Because virtual machines isolate applications from the details about the servers on which they are hosted, it becomes possible to move virtualized workloads from one machine to another with a minimum of disruption. Many virtualization platforms enable the entire state of a running virtual machine to be captured in standard files, which can be seamlessly transported across a network using shared storage. With the ability to coalesce an entire server into a few files, it is relatively easy to move them to another host on the fly, and with live migration support, virtual machines can be migrated while their applications continue running.
The vision of drawing compute power from a "Grid", i.e. a global set of resources that can be tapped at any time and from anywhere, continues to appeal. Grid computing remains a powerful technique for enabling the rapid completion of very compute-intensive applications. It also promises some economic benefits, such as utilizing spare resources in other time zones that are in a period of reduced business activity, or (given the growing concern in the IT industry with constraints in natural resources) taking advantage of datacenters where power and space are more affordable. The opportunities for Grid computing to boost collaboration and reduce costs generated some excitement in the industry a few years ago, and prompted some systems vendors to invest considerable resources in developing and marketing Grid solutions.
However, much of the Grid computing vision has yet to be fully implemented, except with relatively specialized applications and in certain types of environments, such as research and education organizations or leading-edge financial services companies. One of the barriers to the adoption of Grid computing is the relative complexity of customizing and adapting workloads to make them suitable for hosting on a Grid. Job scheduling software, which automates the assignment of workloads to hosts based on the priority of a job, is relatively mature, but it has traditionally been used to manage resources at the level of individual applications. To ensure proper execution of an application that is submitted to a Grid, it is necessary to provide the application with the software environment that it needs to execute. A Grid application may depend on details of the OS platform on which it runs, as well as on the tools and middleware needed to support its functionality.
Virtualization makes packaging applications for deployment on a Grid much more practical than before. Once a workload is packaged in a virtual machine (or virtual appliance), it can run on any host supporting that VM format, and any software needed by the application can be included in the VM without disrupting the host on which it runs. Indeed, ties are starting to emerge between VM platforms and Grid scheduling frameworks, whereby the Grid schedulers dynamically assign resources at the level of hypervisors, rather than applications. For example, XenSource recently announced a partnership with Platform Computing, one of the leading suppliers of Grid scheduling software. Platform began supporting VMware in 2005, and it will now integrate and bundle XenEnterprise v4 with the latest version of its VM Orchestrator (VMO) package, which means that Platform's software will be able to dynamically allocate shared resources to the Xen hypervisor based on the priority of the workloads it is hosting. Several VM vendors are developing their own distributed resource management tools optimized for virtual machines, including VMware with its Distributed Resource Scheduler (DRS) and Novell with its ZENworks Orchestrator, but tools such as Platform's have already been proven in use on a global scale.
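To make the idea of priority-based scheduling at the virtual machine level more concrete, the following Python sketch places VM workloads on hosts in order of priority. It is only an illustration under simple assumptions: the names `Host`, `VMJob`, and `schedule` are hypothetical, the placement rule (pick the host with the most free capacity) is an arbitrary choice, and none of it reflects how Platform's VMO, VMware DRS, or ZENworks Orchestrator actually work.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical description of a physical server running a hypervisor.
    name: str
    free_cpus: int
    free_mem_gb: int


@dataclass
class VMJob:
    # Hypothetical workload packaged as a virtual machine.
    name: str
    priority: int  # higher value means more urgent
    cpus: int
    mem_gb: int


def schedule(jobs, hosts):
    """Place VM jobs on hosts, highest priority first.

    Returns (placements, pending): a dict mapping job name to host name,
    and a list of job names that did not fit on any host.
    """
    placements = {}
    pending = []
    for job in sorted(jobs, key=lambda j: -j.priority):
        # Only hosts with enough spare CPU and memory can take the VM.
        candidates = [
            h for h in hosts
            if h.free_cpus >= job.cpus and h.free_mem_gb >= job.mem_gb
        ]
        if not candidates:
            pending.append(job.name)
            continue
        # Assumed policy: choose the least-loaded candidate host.
        target = max(candidates, key=lambda h: (h.free_cpus, h.free_mem_gb))
        target.free_cpus -= job.cpus
        target.free_mem_gb -= job.mem_gb
        placements[job.name] = target.name
    return placements, pending


if __name__ == "__main__":
    hosts = [Host("host-a", 4, 16), Host("host-b", 2, 8)]
    jobs = [
        VMJob("low", 1, 4, 8),
        VMJob("high", 10, 4, 16),
        VMJob("mid", 5, 2, 8),
    ]
    print(schedule(jobs, hosts))
```

Because each workload carries its own software environment inside the VM, a scheduler in this style only needs to reason about host capacity, not about which operating system or middleware a given host happens to have installed.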
Another example of the convergence between VMs and Grids is the rDistributor project, a collaboration between rPath, the developer of a virtual appliance platform, and the Open Science Grid, a consortium of research organizations. The project takes advantage of rPath's Conary distributed software management system, which makes sure Grid-based applications have the necessary software to execute by automatically installing the components they need in their VMs. Without virtualization, such automatic installation of components would be impractical, since it is unlikely remote hosts would allow arbitrary installation of support software for purposes of running a temporary application.
Virtual machines are now being deployed in organizations of all sizes, and with many types of workloads. Most users have yet to begin expanding the scope of their virtualization efforts from single servers to multiple systems. But over time, users will become increasingly comfortable migrating VMs across the network, in pursuit of reduced downtime and greater responsiveness to changing workloads. As organizations build out virtual infrastructure on an ever larger scale, many of the benefits that have long been promised for Grid computing will finally reach the broader market through the path of virtualization.