Measuring server power use is a minefield

There's no easy way, says Tom Yager

Since it was Earth Day recently, let us examine the criteria that IT brings to its purchases and arrange to make power efficiency a top priority. All it takes is the will, a little homework, and the embrace of the delusion that there's any fit and fair way to compare the power consumption of two similar pieces of equipment.

Last December 27, SPEC (the Standard Performance Evaluation Corporation) announced the availability of its SPECpower benchmark. One would think that having the heaviest of the benchmarking heavyweights pour concrete on that most slippery of metrics would give us something to go by.

InfoWorld has been waiting for its copy of SPECpower, which SPEC acknowledges is a first step, since January, and I have a guess as to the reason we're still waiting. Perhaps some members of SPEC, which is primarily a consortium of vendors, have encountered the same issue that I have: power measurement is only 5% process. It is 110% policy. The total exceeding 100 is appropriate, because neither SPEC nor anyone else can ever call the book on fairness policy for power benchmarking closed.

It shows wisdom on SPEC's part that it refers to SPECpower as a first step. But I think that SPEC started development of SPECpower with the wrong objective in mind: to derive a result (a "figure of merit", in SPEC's words) that tries to pour cement over that elusive marketing metric I hold in lowest esteem, performance per watt.

I am pleased that SPEC has made an effort to quantify "a performance". It is essential to have a meaningful constant for accomplishment so that a formula containing watts, which is a fair and concrete measure of effort invested, approaches an expression of efficiency. SPEC's formula counts the number of cycles through a Java server workload over a period of time, while the power draw during the same period is charted (a rough sketch of the arithmetic appears below). The figure of merit, ssj_ops/watt, really isn't bad as distilled metrics go. SPEC doesn't deal in squishy numbers; there's no "13 SPECfoo_marks". So ssj_ops/watt is pretty good, until you try to use it to compare two servers.

Aside from fairness, there are some technical shortcomings in SPECpower, the granddaddy of which is reproducibility. If InfoWorld's Test Centre attempted to validate the SPECpower rating published by a vendor, which is the sort of thing the Test Centre likes to do, there is no chance that we'd derive a matching result. If our findings put the vendor in a better light than its own, it would be overjoyed. If, however, we showed after much diligence that the vendor's published SPECpower results appear to be overstated, every vendor would seek to tear apart our testing methodology and policies, and the list I've compiled of foreseeable vendor objections is daunting.

Our Test Centre will do a better job of levelling the playing field across vendors than vendors can (and want to) do themselves, because we can replace environmental variables that vary across vendor testing facilities with absolutes. Those variables are doozies. SPEC requires disclosure of everything from system configuration to compiler flags in published results, but the impact of variations in compiler flags and memory clock speed baffles buyers. Fortunately, we can make sense of these and translate them into buying advice. But with power, some variables that appear to be satisfied through disclosure will actually be mercurial. The example I'll offer is temperature. SPEC requires the use of a temperature probe.
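To make that arithmetic concrete, here is a minimal sketch, in Python, of how one measurement interval might be distilled into an ops-per-watt figure, with the probe's ambient readings logged alongside the power trace. The traces, numbers and function names are invented for illustration; the real benchmark steps through several load levels and aggregates them, and nothing here is SPEC's own harness.

# A hedged illustration, not SPEC's harness: it assumes hypothetical traces of
# (elapsed_seconds, cumulative ssj_ops) from the Java workload and of
# (elapsed_seconds, watts, ambient_celsius) from a power analyser and probe.
from statistics import mean

ops_trace = [(0, 0), (60, 14_700_000), (120, 29_600_000), (180, 44_500_000)]
power_trace = [(0, 212.0, 23.6), (60, 215.4, 23.9),
               (120, 214.8, 24.1), (180, 213.9, 24.3)]

def figure_of_merit(ops_trace, power_trace):
    """Return (throughput in ssj_ops/s, average watts, ops per watt) for one interval."""
    elapsed = ops_trace[-1][0] - ops_trace[0][0]
    completed = ops_trace[-1][1] - ops_trace[0][1]
    throughput = completed / elapsed
    avg_watts = mean(watts for _, watts, _ in power_trace)
    return throughput, avg_watts, throughput / avg_watts

throughput, avg_watts, ops_per_watt = figure_of_merit(ops_trace, power_trace)
ambient = [celsius for _, _, celsius in power_trace]
print(f"{throughput:,.0f} ssj_ops/s at {avg_watts:.1f} W -> {ops_per_watt:,.1f} ssj_ops/watt")
print(f"ambient during the run: {min(ambient):.1f}-{max(ambient):.1f} degrees C")

Note that the ambient readings get disclosed, as SPEC requires, but disclosure is exactly where the trouble starts.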
You can establish a policy of pegging all tests at an ambient temperature of, say, 24 degrees, but your 24 isn't the same as mine. Try it yourself. Take a simple infrared thermometer and walk around your datacentre. Grab a baseline by pointing the thermometer at a piece of cardboard held at arm's length to get a basic ambient temperature. Then aim the thermometer at various surfaces, of varying materials, at varying heights and at varying distances from airflow. Compare the temperature of a server's top plate inside a closed rack to that of a similar server inside an open rack. Does a server that shares a rack with a storage array measure the same temperature as the server alone?

Temperature affects every server in a different way. The design of a server's cooling system and its programmed responses to high temperatures say much, if not everything, about a server's quality of design. For example, the cooling system in a cheaply made server will have one purpose: to suck air in from the front of the chassis, and possibly the sides, and blast it out the back. The fans themselves, being electrified copper coils, generate heat, and being kept spinning at several thousand revolutions per minute in open, probably particle-laden air makes them subject to failure.

Most servers don't care how much heat they make or where it goes, be it into the air or indirectly into the intake of the server above, influencing that server's efficiency. But you have to worry about heat, because you pay to move it outside the building, and that accounts for a large percentage of your operating cost. Perhaps server efficiency has to take into account its contribution to the duty cycle of the compressors in your air conditioning. Try to measure that.

My point is, don't expect easy answers to power efficiency. InfoWorld's Test Centre is taking on power testing and considering SPECpower as part of that plan. In the meantime, I can tell you the secret to avoiding the homework on this one, and it is my constant advice: count the number of active server power supplies in your shop and commit to reducing that number over time. That'll do nicely for now.
