April 10, 2008

All Threads Are Not Created Equal

This week, IBM and Sun each announced multi-threaded systems: IBM's 64-thread Power 595 and Sun's 128-thread "Victoria Falls"-based T5140/T5240. Although both support large numbers of threads, these servers take very different approaches to handling workloads, stemming from the design points of their underlying processor chips. The 32-chip (64-thread) Power 595 would seem a natural fit in the database tier, and the two-chip (128-thread) T5140/T5240 would seem most appropriate for web serving. Understandably, IBM's flagship Power 595 will attack Sun's SPARC64-based SPARC Enterprise family more than it will battle Sun's "CoolThreads" systems, and Sun's T5140/T5240 will set its sights on IBM's midrange Power Systems servers rather than on the top-end 595. While these new servers may not compete directly, each vendor highlights its system's multi-chip/multi-core/multi-thread features.

But, you shouldn't equate a POWER6 thread with an UltraSPARC T2 Plus thread. To avoid glazing over the eyes of all but die-hard chip geeks (as an ex-processor designer, I actually enjoy this stuff) I'll attempt a simple synopsis of the two approaches.

Basically, POWER6 follows a more traditional design, while UltraSPARC T2 Plus focuses on the high levels of parallelism found in many web-related applications. Running at 5 GHz in the Power 595, POWER6 has two cores per chip and two threads per core. UltraSPARC T2 Plus puts eight cores on each silicon die and supports eight threads per core, for a total of 64 threads per chip. Running at up to 1.4 GHz, UltraSPARC T2 Plus (as with its "Niagara" predecessors, the T1 and T2 chips) focuses on the aggregate performance offered by many cores and threads rather than the single-thread performance of a high-clock-rate design like POWER6. POWER6 offers far more performance per thread, while UltraSPARC T2 Plus packs far more threads into a rack. A quick look at SPECjbb2005 benchmark results shows that a fully configured, 64-thread Power 595 delivers over 9 times the performance of a 128-thread T5240 (3,435,485 vs. 373,405 SPECjbb2005 bops). Of course, the 64-thread Power 595 fills a full rack (not including I/O racks), whereas Sun's T5240 is only 2U, which would allow 2,560 threads to fit into a 40U rack of T5240s. (With the 1U T5140, 5,120 threads fit in a rack.)
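The rack-density and benchmark arithmetic above can be checked with a short sketch. The figures are the ones quoted in this post; the function name, dictionary layout, and the assumption that a rack is filled with nothing but identical servers are mine, for illustration only.

```python
# Figures quoted in the post; names and structure are illustrative assumptions.
RACK_UNITS = 40  # usable height of the rack, in U

servers = {
    "T5240": {"threads": 128, "height_u": 2},
    "T5140": {"threads": 128, "height_u": 1},
}


def threads_per_rack(threads, height_u, rack_units=RACK_UNITS):
    """Hardware threads in one rack filled with identical servers."""
    return (rack_units // height_u) * threads


for name, spec in servers.items():
    print(name, threads_per_rack(spec["threads"], spec["height_u"]))

# SPECjbb2005 results quoted above, in bops.
power_595_bops = 3_435_485
t5240_bops = 373_405
print(f"Power 595 vs. T5240: {power_595_bops / t5240_bops:.1f}x")
```

This prints 2,560 threads for a rack of T5240s, 5,120 for a rack of T5140s, and a system-level ratio of about 9.2.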

The clock rate differential (roughly 3.6 to 1, at 5 GHz versus 1.4 GHz) is hardly the only difference between POWER6 and UltraSPARC T2 Plus performance. A POWER6 thread is simply not the same thing as an UltraSPARC T2 Plus thread. IBM's Simultaneous Multithreading (SMT) design fetches instructions from two threads at once for each core, and both threads can execute during the same clock cycle as long as they do not need the same execution units. Because it has multiple execution units, each POWER6 core can execute up to seven instructions at once across both threads, depending on which execution units are needed. Although the UltraSPARC T2 Plus can keep eight threads pending per core, it has only two execution units per core and thus, at most, can execute instructions from two threads at a time. Be careful with terminology: although marketing (and many analysts) use the words interchangeably, "simultaneous" and "concurrent" do not mean the same thing to processor design engineers. Simultaneous means multiple things happening in the very same clock cycle, whereas concurrent means that multiple activities are all "active" but are time-sliced and interleaved, so that not all of them are actually executing in the same clock cycle. A POWER6 core can support simultaneous execution of two threads; each UltraSPARC T2 Plus core can have eight threads concurrently active, but can execute only two of them simultaneously.
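To make the simultaneous/concurrent distinction concrete, here is a deliberately crude sketch. It assumes a plain round-robin pick of threads each cycle, which is not how either chip actually selects threads (real selection depends on stalls, cache misses, and execution-unit availability); the function and parameter names are hypothetical.

```python
def simulate(active_threads, issue_width, cycles):
    """Toy model: each cycle, up to `issue_width` of the `active_threads`
    hardware threads execute; the remaining threads wait their turn.

    Returns one list of thread ids per simulated clock cycle.
    """
    schedule = []
    next_thread = 0
    for _ in range(cycles):
        slots = min(issue_width, active_threads)
        picked = [(next_thread + i) % active_threads for i in range(slots)]
        schedule.append(picked)
        next_thread = (next_thread + slots) % active_threads
    return schedule


# Eight threads concurrently active, two executing per cycle (T2 Plus-like core).
print(simulate(active_threads=8, issue_width=2, cycles=4))
# Two threads active, both executing every cycle (POWER6-like core).
print(simulate(active_threads=2, issue_width=2, cycles=2))
```

In the first case every thread is "active" but each one executes only once every four cycles; in the second case both threads execute in every cycle.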

So, while "64/128 threads" may sound impressive, more relevant comparisons might look at performance per rack unit ("U"), per watt (although energy consumption is hard to measure accurately), and certainly price/performance. Naturally, the 32-chip Power 595 surpasses the two-chip T5240 for most workloads, but the T5240 fits in merely 2U (1U for the T5140), whereas the processor cage (CEC in IBM terminology) for the Power 595 is 20U (and that is without I/O or the bulk power). The comparisons are not straightforward, but whether measured per rack unit, per rack, or per watt, each system has its strong points. Admittedly, the Power 595 and Sun T5140/T5240 will not often face each other head-to-head. But Sun's latest "Victoria Falls"-based servers will compete against POWER6 chips in the Power 550 and Power 570.
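As a rough illustration of the per-"U" view, the sketch below divides the SPECjbb2005 results quoted earlier by the chassis heights given in this post. It is only an approximation: the 20U figure covers the Power 595 processor cage alone, without I/O or bulk power, so it flatters the 595. The function name is mine.

```python
def bops_per_u(bops, height_u):
    """SPECjbb2005 bops divided by the rack units the system occupies."""
    return bops / height_u


# Benchmark results and chassis heights as quoted in the post.
power_595 = bops_per_u(3_435_485, 20)  # CEC only; excludes I/O and bulk power
t5240 = bops_per_u(373_405, 2)

print(f"Power 595 (CEC only): {power_595:,.0f} bops per U")
print(f"T5240:                {t5240:,.0f} bops per U")
```

On these numbers the two systems land within about ten percent of each other per rack unit, which is why the choice of metric matters so much.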

Consolidation of workloads currently running on x86 is a goal of both IBM and Sun. Consolidation and virtualization go hand in hand. Both vendors offer a variety of virtualization approaches (AIX Workload Partitions / Solaris Containers, Micro-Partitions / LDOMs) to allow a single server to handle multiple workloads. POWER6's large caches (4 MB L2 per core and 32 MB L3 shared by two cores versus UltraSPARC T2's 4 MB L2 shared by 8 cores) seem better able to support more virtual images. And, IBM's dynamically re-sizable micro-partitions would seem best suited for workloads whose computing demands vary over time, compared to the statically defined LDOMs. Thus POWER6 would seem preferable for consolidating unpredictable, dynamically changing workloads. On the other hand, Sun's UltraSPARC T2 Plus offers so many inexpensive threads/cores per rack that worrying about efficient allocation of threads/cores may not be necessary.
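The cache comparison can be put on a per-thread basis with simple division, using the cache sizes and sharing arrangements stated above. This is a rough indicator only: shared caches are not actually partitioned evenly among threads, and the helper name is an assumption of mine.

```python
def cache_kb_per_thread(cache_mb, cores_sharing, threads_per_core):
    """Cache size in KB divided evenly among the threads that share it."""
    return cache_mb * 1024 / (cores_sharing * threads_per_core)


# POWER6: 4 MB L2 per core, 32 MB L3 shared by two cores, two threads per core.
print("POWER6 L2:", cache_kb_per_thread(4, 1, 2), "KB per thread")
print("POWER6 L3:", cache_kb_per_thread(32, 2, 2), "KB per thread")

# UltraSPARC T2 Plus: 4 MB L2 shared by eight cores, eight threads per core.
print("T2 Plus L2:", cache_kb_per_thread(4, 8, 8), "KB per thread")
```

That works out to 2 MB of L2 and 8 MB of L3 per POWER6 thread, against 64 KB of L2 per UltraSPARC T2 Plus thread.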

Of course, the answer is "it depends" when determining which server might work best with which workloads. Customers need to consider the dynamic nature of workloads they intend to deploy on either POWER6 or UltraSPARC T2 Plus to understand which would best suit their needs.
