The engineering behind HP’s new generation of ProLiant servers appears to be an effort to cut down the amount of time spent on diagnosing, maintaining, and repairing servers. After analyzing tens of thousands of support cases on prior-generation servers, HP has sought in ProLiant Gen8 to address the issue of time-to-repair head on. According to Jeff Carlat, director of ISS software at HP, the new capabilities built into the Gen8 servers will let HP “reduce the total time to problem resolution by 66%.”
Many of the innovations of the Gen8 server line are new features designed to cut the time spent diagnosing, maintaining, and repairing ProLiant servers. HP engineers analyzed tens of thousands of “outage reports” on prior-generation servers and identified design changes to eliminate the causes of field failures, or at least reduce the time spent fixing them.
It’s refreshingly honest for a server vendor to admit that failures and bugs are problems that need to be addressed head-on. As a design engineer, I found that working on maintenance features was far less sexy than juicing server performance. However, because admin costs now dominate the datacenter, it’s the right strategy today for a server designer. HP research shows that three dollars are spent on server administration and operations for every one dollar spent on hardware, so there’s a larger potential for impact when designers focus on systems management rather than just delivering faster gear.
Another reason for this focus on maintenance is that HP, Dell, Cisco, and other server vendors all use the same Intel processors in their x86 gear, so there’s less opportunity to differentiate on performance.
Ditch the DVDs, and Out with the Agents
One set of changes in embedded management in ProLiant Gen8 will have a big impact on end-user administrative processes. HP has embedded three ProLiant management applications – SmartStart, Smart Update Manager, and the Service Pack for ProLiant – onto flash storage embedded on each Gen8 server motherboard. That means no more DVDs or separate utilities when installing or updating drivers and firmware.
Placing management applications on the motherboard corresponds closely with features found in Dell’s Unified Server Configurator and Lifecycle Controller, which are present in Dell’s 11G servers. HP’s implementation appears a bit different, however: for example, Dell uses UEFI to embed those applications, while HP does not. But the result is similar: server admins will spend less time searching for software updates and building bootable USB keys when they need to patch firmware and drivers.
Gen8 servers will also require less patching than prior generations, since some remote management will no longer require HP agents (drivers) in the OS. For example, management tools will be able to obtain the status of individual hard drives in a RAID array without requiring a special management agent running on the host OS.
Additional aspects of the system will also be monitored. Hard drives and memory will be checked to confirm that they’re HP-approved versions, which HP hopes will help when end users call for support. Individual DIMMs and drives will also be checked to verify that they didn’t come from batches previously identified as containing failing components.
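To make the two component checks described above concrete, here is a minimal sketch in Python. It is purely illustrative and is not HP's implementation: the part numbers, batch IDs, and the `check_component` helper are all hypothetical names invented for this example. It only shows the general idea of comparing each installed part against an approved list and a list of known-bad batches.

```python
from dataclasses import dataclass

# Hypothetical data: these part numbers and batch IDs are invented for
# illustration and do not correspond to real HP parts.
APPROVED_PARTS = {"DIMM-A100", "HDD-B200"}
KNOWN_BAD_BATCHES = {"BATCH-0042"}


@dataclass(frozen=True)
class Component:
    slot: str         # where the part is installed, e.g. "DIMM 1"
    part_number: str  # vendor part number reported by the component
    batch_id: str     # manufacturing batch reported by the component


def check_component(c: Component) -> list[str]:
    """Return warnings for one component (an empty list means it passed)."""
    warnings = []
    if c.part_number not in APPROVED_PARTS:
        warnings.append(f"{c.slot}: part {c.part_number} is not on the approved list")
    if c.batch_id in KNOWN_BAD_BATCHES:
        warnings.append(f"{c.slot}: batch {c.batch_id} was flagged as failing")
    return warnings


inventory = [
    Component("DIMM 1", "DIMM-A100", "BATCH-0007"),  # approved part, good batch
    Component("DIMM 2", "DIMM-X999", "BATCH-0007"),  # unapproved part
    Component("Bay 1", "HDD-B200", "BATCH-0042"),    # approved part, bad batch
]

for item in inventory:
    for warning in check_component(item):
        print(warning)
```

In this toy inventory, the first DIMM passes both checks, the second is flagged as an unapproved part, and the drive is flagged because of its batch.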
All system health parameters will be saved into time-stamped “Active Health” log files. These can be exported and provided to HP support to help speed up problem diagnosis. “If a system does go down, we want to be able to get that information back to our experts in the service and support teams,” according to John Gromala, director of modular systems marketing at HP.
Yea, Cloud!
The boldest new feature is HP Insight Online, which could spark a new paradigm for break-fix service on servers – depending on how willing end users are to share information with HP.
Insight Online will take server health information collected from servers and provide end users with a no-cost, web-based portal that shows system status, coupled with warranty and service-event history.
HP already offers a free, opt-in service called HP Insight Remote Support that allows networked servers to send some server inventory and health data to HP support. With Insight Remote Support 7.0, HP takes remote support “into the cloud” by giving end users access to that information through the snazzy Insight Online interface.
While the Insight Online interface will show both health information and service information for ProLiant Gen8 servers, for non-Gen8 servers, it will initially show only the service information. Eventually, HP will extend health monitoring to other equipment as well, including Integrity servers and HP storage gear.
The goal of these call-home features is to reduce the time it takes HP support to resolve customer problems. However, Insight Remote Support and the new Insight Online Portal require an additional level of trust between HP and the end user, since information must be transmitted from the server directly to HP. HP’s challenge will be to convince customers to enable these features, since the benefits might not be obvious to the end user until service events actually arise.
Bottom Line
By concentrating on system patching and support services, HP is correctly directing its server development toward areas of server administration that have become time-consuming nightmares for server admins. As systems get more powerful and achieve higher levels of integration, the tools and smarts needed to diagnose and fix problems must get more powerful themselves. By tackling one of the less-sexy aspects of server ownership, HP is demonstrating a good awareness of customer pain points and overall trends in datacenter administration.
Updated on 28 February 2012 with corrections noted by HP.





