Computerworld

Increase rack density to avoid buying new

The life of current gear can be extended, says Tom Yager
  • Tom Yager
  • 10 June, 2007 22:00

Despite the bait dangled by Intel and AMD (twice as much server in the same space every two years), that rule of thumb won't work for capacity planning. We can't really count on cores per socket doubling every other year, even though I believe it's likely.

Now that I’ve dumped cold water from my half-empty glass on the double-up strategies of Intel and AMD, I have a surprise for you: I have a full glass in my other hand and an optimistic gleam in my eye. I’ll reveal all, but first I have to set it up. The numbers are rough, but I’ll do my best to make them realistic.

For illustration’s sake, let’s say we run an e-commerce site that sells office supplies. We customise online catalogues so that companies can let employees order supplies from a limited catalogue chosen by management — ours is a highly dynamic site with substantial computing demands. Today, our standard rack server is a 1U, two-socket machine with two cores per socket. We squeeze 40 servers into a full-height rack. For isolation and fine-grained load balancing of a .Net e-commerce solution, we want to split our rack into as many Windows Server virtual machines as possible. Our response time tests led us to allocate three virtual servers per rack unit, yielding 120 VMs per rack.
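The arithmetic behind that baseline reduces to a simple capacity model. Here's a minimal sketch; the 40-server rack and the 3-VMs-per-rack-unit allocation come straight from the scenario above, while the variable names are mine:

```python
# Baseline rack from the scenario above: 40 1U servers,
# each dual-socket with dual-core CPUs, 3 VMs per rack unit.
RACK_UNITS = 40      # 1U servers per full-height rack
VMS_PER_RU = 3       # allocation chosen from response-time tests

cores_per_ru = 2 * 2                    # sockets x cores per socket
vms_per_rack = RACK_UNITS * VMS_PER_RU  # total VMs in the rack

print(f"{cores_per_ru} cores per RU -> {vms_per_rack} VMs per rack")
# -> 4 cores per RU -> 120 VMs per rack
```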

Next year is projected to be a big year for growth. But rather than add another rack, we opt for an increase in rack density, so we bring in a few dual-socket, quad-core Xeon servers for capacity testing. Using the same response time metrics, our tests show that we can get 252 virtual servers per rack or between six and seven VMs per rack unit (improvements in the architecture of the four-core chips account for the greater than 2X number of VMs).

But look out: four-socket, four-core machines are already upon us, giving us sixteen cores per rack unit, not eight. Holding response times to the same metrics we used for our dual-core servers, our new rack weighs in at a whopping 840 virtual servers, or 21 VMs per rack unit. Insane? Bear with me and you may conclude that 21 VMs per rack unit is actually a lowball estimate.
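Putting the three rack generations side by side makes the point plainly. This is a rough sketch using only the densities quoted above; the dictionary layout and names are illustrative, not a real capacity-planning tool:

```python
# Rough comparison of the three rack generations described above,
# using the measured VM densities from the text rather than a naive
# "multiply by cores" projection.
RACK_UNITS = 40  # 1U servers per full-height rack

generations = {
    "2-socket / 2-core": {"cores_per_ru": 4,  "vms_per_ru": 3.0},
    "2-socket / 4-core": {"cores_per_ru": 8,  "vms_per_ru": 252 / 40},
    "4-socket / 4-core": {"cores_per_ru": 16, "vms_per_ru": 21.0},
}

base = generations["2-socket / 2-core"]
for name, g in generations.items():
    vms_per_rack = g["vms_per_ru"] * RACK_UNITS
    core_factor = g["cores_per_ru"] / base["cores_per_ru"]
    vm_factor = g["vms_per_ru"] / base["vms_per_ru"]
    print(f"{name}: {vms_per_rack:.0f} VMs/rack "
          f"({core_factor:.0f}x the cores, {vm_factor:.1f}x the VMs)")
```

The last row is the punchline: over the baseline, core count merely quadruples, but measured VM density grows sevenfold.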

Some of the magic multiplication depends on the specific hardware and OS, in this case a quad-core Opteron running Windows. At least one four-socket, 1U Opteron server is already on the market, so a 16-core server by next year is assured. Moving from dual-socket/dual-core to four-socket/quad-core Opteron surpasses even a thrilling 4X density factor, because quad-core Opteron is Barcelona, a completely redesigned CPU and bus. Barcelona is considerably faster per core, and the new CPU adds a Level 3 cache (unique in x86-dom to AMD), power management sorcery, and a HyperTransport 3 system bus that's three times faster than its predecessor. Latencies for many compute and I/O operations will plummet with Barcelona servers.

Further density boosts will come from advances in host/guest virtualisation which, in racks as dense as I propose, will be the most cost-effective solution. Microsoft Windows Server Longhorn’s use of para-virtualisation, thin hypervisor and virtual symmetric multi-processing will boost best-case scaling dramatically. Performance increases in Internet Information Server (IIS) 7.0 and .Net 3.0 will translate to reduced response times, allowing us to pack more user capacity into each VM.

I'm counting on faster server-to-server interconnects as well. HyperTransport 3 and PCI Express 1.1 now have external cabling standards, and both 10Gigabit Ethernet and InfiniBand have fallen to manageable per-port prices. Their switches are still outrageously priced, but according to Mellanox, 10GigE-peered server links can be set up without switches. This lets groups of servers share resources, and the scenario I have in mind uses server-local hard drives with hardware RAID to provide scalable networked storage.

I can’t predict your reaction, but I think this is the kind of leap in x86 server density that we’ve been waiting for. You can certainly get less by planning against a simple 2X or 4X equation, but with respect, you’d be a fool not to invest the minor additional effort that pushes you higher, and quite possibly higher than my imagination has taken me.