There's a good deal that's special about AMD's new Shanghai server CPU. It's fabulous science, and fun for those of us who get dewy-eyed over the prospect of a 25% faster world switch time and immersion lithography. It makes the x86 battle interesting again because it carries AMD into the two-socket (2P) server space, territory that it must fight hard to win and where innovation is sorely needed. AMD beat Intel's next-generation Nehalem server architecture to market while closing the performance, price and power-efficiency gaps between Core 2 and Shanghai. Just as it did in the old days, AMD now claims that its best outruns Intel's best despite having a lower clock speed.
(Editor's note: While Intel's first Nehalem processors were released on November 17, specialist Nehalem server chips won't be available until next year.) Shanghai, the name given to AMD's 45-nanometer quad-core Opteron, pulled into port well ahead of schedule, affording AMD an opportunity it has rarely had. While OEMs are notoriously tight-lipped about release schedules, several will undoubtedly tap the buzz of AMD's Shanghai launch, putting an array of Shanghai systems on the market before year's end.
Intel's messaging is all about the future, but AMD takes an interesting view that's more in line with the perspective of buyers: Squeeze the longest possible life out of the gear you bought two years ago, and keep the machine you buy today upgradable to state-of-the-art performance with nothing but a CPU swap. Shanghai uses the same 1,207-pin socket (Socket F) as dual-core Opteron, and that's not incidental. You can drop dual-core Opterons into a Shanghai server, or Shanghai CPUs into a dual-core Opteron server. As long as you're using the manufacturer's newest BIOS, the chips will just work. AMD is committed to continuous support for Socket F through the lifespan of Istanbul, its planned six-core CPU. Self-sufficiency and investment protection make a nice couple, and it'll be a pleasure to see those values return to the 2P space.
Shanghai represents AMD's first major speed update in a while, with the clock ceiling raised from 2.3GHz to 2.7GHz. The average power utilisation for even the fastest Shanghai CPU remains 75 watts, the same as Barcelona's, while 55-watt and 105-watt parts will appear in 2009. A 105-watt Shanghai brings to mind a factory-overclocked CPU specially tuned for sci/tech, high-performance computing, and workstations. That's just my guess. I think AMD wants to make it clear that while it is taking a fresh run at the 2P market, it still rules the roost in high-performance x86 computing.
Shanghai took advantage of a smaller manufacturing process (thinner wires, smaller transistors) to make room for a healthy 6MB of Level 3 cache while supplying each core with an independent 512KB Level 2 cache. The precision of AMD's immersion lithography process reduces transistor power leakage, giving rise to AMD's claim of a 35% reduction in idle power utilisation relative to Barcelona. Lowering the idle power floor makes the dynamic power management capabilities first seen in Barcelona really shine. Another feature new to Shanghai is Smart Fetch, which copies a core's Level 1 and Level 2 cache contents to the shared Level 3 cache before halting that core, so cores can spend more time in a halted state. AMD says that this happens transparently and that it lowers CPU power consumption by up to 21%, but I hope to see it surface as runtime down-coring, in which unneeded cores can be powered down under user control.
In discussions about Shanghai, AMD refers to Barcelona the same way that Intel once tipped its hat to the short-lived 32-bit Core Duo (Pentium M) CPU. AMD credits Barcelona for shouldering the "heavy lifting" in Shanghai's design, which was considerably sweetened by process shrink and other enhancements. The way Barcelona went down made no one happy: not engineers, not management and certainly not OEMs. Shanghai, together with AMD's newfound commitment to stay in close touch with OEMs and major accounts, should set all of that right again.