Stories by Tom Yager

Is Apple's push notification enough for the iPhone?

Apple's iPhone is renowned for being the sole mobile platform that runs only one application at a time. If you want to write an instant messaging client for iPhone, knock yourself out, but recipients will be reachable only while your IM software has control of the screen. As soon as an iPhone user presses Home, the running application quits, voluntarily or otherwise. It's not allowed to leave so much as a thread behind to listen for connections from the network, do periodic GPS logging or run anything else in the background. Its competitors, such as the Palm Pre and RIM BlackBerry, have no such limitation.
Apple's not likely to give ground on background apps, but it does realise the competitive and functional gap that its one-app-at-a-time policy creates. So the iPhone 3.0 OS offers the APNS (Apple Push Notification Service) to provide a workaround. Is it a limited workaround or enough of a bridge to overcome the multitasking advantage of its competitors?
APNS itself does run in the background, listening for communications from a single server, and vendors or organisations are free to use APNS as a gateway to deliver short alerts and data messages to individual iPhones. Google, Yahoo and AOL still can't run background tasks on your iPhone, but as long as they link to APNS, they can push chat invitations that pop up on the invitee's iPhone no matter what they're running. If the user taps Accept, the IM client launches and the session begins.
IM is neither the only nor the best usage scenario for APNS. Most users will experience it in a familiar way: a small badge that appears on an app's icon, such as the familiar unread-message counter in the iPhone's Mail app or the new-invitation counter in its Calendar app. (To see such indicators, however, the user has to visit the Home page where the app resides, which means leaving whatever app you're in to check whether other apps have new notifications.)
If the app is running, it gets the notification immediately. If the app isn't running, the notification is held on the phone to be consumed at the app's next launch. If the iPhone is offline when the sender attempts delivery, APNS keeps trying to deliver the notification for 28 days. (The notification itself is a small, arbitrary, encrypted payload of up to 256 bytes, sent from a server-side app to a specific application on a specific iPhone. So APNS is a general mechanism for shooting structured data to iPhone applications.)
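For the curious, here is a minimal, hypothetical Python sketch of what the sender's side boils down to: assemble the small JSON payload that carries the alert text and the badge count, and confirm it fits within the 256-byte limit before handing it to the push service. The app-specific keys and alert text are invented for illustration; this is not Apple's sample code.

import json

# Illustrative sketch (not Apple's sample code): build a notification payload
# for a hypothetical IM app and check it against the 256-byte limit mentioned
# above. The payload is JSON; the "aps" dictionary carries the alert text and
# the badge count that appears on the app's Home-screen icon.
MAX_PAYLOAD_BYTES = 256

def build_payload(alert_text, badge_count, custom=None):
    payload = {"aps": {"alert": alert_text, "badge": badge_count, "sound": "default"}}
    if custom:
        payload.update(custom)  # app-specific keys ride alongside "aps"
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(raw) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload is %d bytes; the limit is %d" % (len(raw), MAX_PAYLOAD_BYTES))
    return raw

if __name__ == "__main__":
    # A hypothetical chat invitation; tapping the resulting alert would launch the IM client.
    print(build_payload("Chat invitation from a colleague", badge_count=1,
                        custom={"room": "example-room"}))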

iPhone 3.0 betaphiles upset the Apple cart

At least I have an excuse. Running pre-release operating systems and firmware in production settings is part of my job description. I accept that "beta" items are exempt from expectations of day-to-day stability, backward compatibility, performance and feature completeness. When I took the iPhone 3.0 OS as my one and only system software for the device, I was fully prepared that existing apps would break, some software on App Store would prove incompatible, the device would freeze up, and in any imaginable way on any given day, the beta firmware would show itself as less than firm.
That's the point of a beta. It's the price that admins and developers pay for the privilege of knowing what's coming next. In the case of the iPhone, that's essential knowledge. iPhone 3.0 is a platform overhaul: new OS, new APIs, new SDK, new tools and new rules for App Store approval. Apple is dramatically changing the game. In the next three months, every iPhone owner will have a brand-new phone.
I take the frustrations of pre-release software in my stride and keep the glitches to myself. Most people do not. That's why handset manufacturers limit the distribution of pre-release system software. By nature, beta OSes and firmware destabilise the platform. For that reason, Apple wraps the iPhone beta in a barbed-wire NDA and issues the stern mandate that iPhone devices registered for development must be used only for development. Apple makes this rule while being aware that it is impractical and unenforceable.
App Store is jammed with titles from sole proprietorships, while you and I both know that the luxury of an extra, activated iPhone 3G is one that the typical iPhone developer cannot afford. In general, if you're using iPhone 3.0, it's likely all you've got.

Northern spring prompts tech wish-list

Spring, here in the US, finds me out of hibernation, with clear vision, a new mission and power tools in hand.
My new mission is a sort of prime directive: To make my short-list of technology that I use and recommend, hardware must be engineered to rise in value the longer I own it. I demand that equipment be scalable, expandable, interoperable and open, so that I don't have to wait for a vendor to improve it. With computers, I'm after five-year gear, meaning that I not only want equipment with a useful lifespan at least that long, but equipment made by vendors that make an outward commitment to protecting the continuous appreciation of my investment and have a track record that backs that up.
Apple and AMD are two companies that fit that mould. Apple constantly improves its systems through software. New major releases of Mac OS X are as hotly anticipated as they are because each remakes the Mac in some substantial way. It's as if every Mac owner gets a new computer every couple of years. You have to own a Mac to understand the phenomenon, and you need to be a professional Mac owner to appreciate the bottom-line benefit that installing Snow Leopard on an existing Mac will deliver.
But this time, another factor comes into play. On a new Nehalem Mac, Snow Leopard is no longer bound by Intel's broken, legacy PC bus. Nehalem's on-board memory controllers triple the speed of access to memory, opening possibilities to Mac owners that to date have been available only to users of RISC and AMD Opteron systems. Snow Leopard will benefit all Mac users, but Snow Leopard on a Nehalem Mac paves the way for the software X-factor upgrades.

Rich online apps can learn from old IBM 3270

Convenience, low- or no-cost bundling and the thrill of "the new" draw users and subscribers into clouds and to other varieties of online applications (Web 2.0, RIA, mobile) and services. By the same token, a shot at recurring subscription revenue, brand/platform stickiness, elimination of piracy, or a non-paying user base primed for targeted marketing is carrying businesses into the online space. It's unstoppable.
Today, all Mac desktops and all popular varieties of professional mobile devices ship with online apps and associated services so deeply wired that they're part of the platform. Over the next couple of years, commercial client software will take on split duties as self-contained applications and "cloud terminals", with appealing functionality set aside for users who subscribe as well as buy.
There's little downside to using online apps as long as you apply sensible standards. Make sure your cloud storage is locally backed up, don't ship anything to the cloud that you wouldn't want shared with the whole world (unless your service guarantees security), guard your online identity, and don't create tacit trust relationships between your LAN and a cloud by, for example, folding your company's shared calendar and contacts into local copies that are synced to Yahoo, Google or MobileMe.
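The first of those rules is easy to automate. Here is a minimal Python sketch that takes a dated local snapshot of whatever folder your cloud service syncs; the paths are hypothetical placeholders rather than any particular service's layout.

import shutil
from datetime import datetime
from pathlib import Path

# Minimal sketch of the "back up your cloud storage locally" rule.
# The paths below are hypothetical; point CLOUD_DIR at whatever folder your
# cloud service syncs to, and BACKUP_ROOT at local (or LAN) storage you control.
CLOUD_DIR = Path.home() / "CloudSync"
BACKUP_ROOT = Path.home() / "Backups" / "cloud"

def backup_cloud_folder():
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = BACKUP_ROOT / stamp
    shutil.copytree(CLOUD_DIR, target)   # dated snapshot, never overwritten
    return target

if __name__ == "__main__":
    print("Backed up to", backup_cloud_folder())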

Unix - IT should ask for it by name

What is Unix? I've known Unix long enough to know that the trademark and industry usage want the name rendered in all capital letters (UNIX), although many publications (mine included) don't like it that way. Having written about Unix for a couple of decades, I've come to take for granted that everyone knows what Unix is. Certainly, no one would ask me what Unix is. I get jabbed all the time to define Linux (kernel) whenever I use the term, and I've been asked why I don't refer to Intel Macs as PCs (proprietary platform). But nobody has ever noticed me refer to Unix and written to say, "What do you mean, Unix?"
I wish someone had. Figuring that the economy would make Unix vendors the ready prey of market analysts praying to get something right, I had in mind to write a sort of You Don't Know Unix column. The trouble with an in-your-face headline like that is that it turns embarrassing when the author has to admit he couldn't meet his own challenge. The simple question I asked myself one recent morning turned into deep thought that stretched into late the following afternoon. I had all kinds of clever things to say in defence of Unix, but none of it was relevant to IT, and IT deserves a relevant look at Unix as something other than a culture, a history, or a meaningless banner over all operating systems in the "not Windows" category.
The fact is Unix matters to IT and for a reason that may not occur even to those shops that already have it. Unix matters for a reason that escapes analysts' notice. I missed it, too. It's that little circle with the R in it. IBM, Sun, Fujitsu, HP, and Apple sell proprietary enterprise operating systems branded AIX, Solaris, HP-UX, and Mac OS X Leopard. These are very different. Mac OS X Leopard is very, very different.
IBM, Sun, HP, Fujitsu and Apple also sell Unix. All of these vendors' Unix implementations are a precise match for the others, and vendors sign a contract guaranteeing IT that applications written for one Unix can run on all Unixes, and that a network of any size and reach can mix Unixes at will with guaranteed interoperability across vendors. One (large) set of documentation covers all Unix implementations for architects, developers and administrators. Non-branded Unix docs make no mention of the brand on the server machine, and they have no need to. All Unix, every Unix, works as described in that one set of manuals.
Unix is not a core of source code common to all of the proprietary OSes I've described. If Unix were software, it would have died out during battles to own it. Unix is a registered trademark of The Open Group, which maintains the Single Unix Specification 03 (Unix 03, or just Unix). The specification — which the proprietary operating systems of IBM, Sun, Fujitsu, HP and Apple adhere to — definitively and inclusively describes Unix from the microscopic level (the C language and system data structures) to the command line. Any skills, staff, source code, infrastructure and solutions you invest in Unix are portable across IBM, Sun, Fujitsu, HP, Apple and generic 32 and 64-bit x86 hardware.
The Unix 03 spec is open, meaning fully publicly available in its final form. Unix 03 is drawn from several contributors and provides a single approach to key modernisations, including mixed execution of 32-bit and 64-bit code, and incorporating internationalised text. The spec doesn't get into the buried plumbing, only what's visible to users, admins and developers. For that matter, the spec doesn't care how a vendor implements it. IBM and HP have closed source implementations, while Sun and Apple have opened theirs.
Nothing prevents Microsoft, Red Hat, Novell, or anyone from attaining the Unix trademark. Yes, Microsoft could conceivably slap the Unix trademark on Windows, but for a few million lines of code. The reason that only five vendors ply the trademark is that the Unix validation is the easy part. Or, rather, the cheap part. (Validation is hardly easy; ask Apple.) The dotted line that a vendor signs to use the trademark is the contract with Unix customers, software vendors, and competitors — the ecosystem — guaranteeing full interoperability across vendors.
The trademark puts legal teeth in the Unix spec, though Unix's smooth interoperability — and the freedom that independent software vendors enjoy to have one database or CRM code base cover so many different platforms — is a product of cooperation among Unix vendors, IT operations, universities and professional organisations. The Open Group didn't make that happen; it's always been the case. The trademark merely assures IT organisations that need to be sure, without the need for digging, that Unix means something, and it does. It means that Unix enterprise solutions work, and work together, without regard for the brand on the hardware.

Where does Intel's Nehalem get its juice?

When I first heard "Nehalem", it called to mind the noise that Felix Unger from "The Odd Couple" made to clear his sinuses. If Intel gets its way, its Nehalem CPU and system architecture will have a similar effect on the clogged-up market. I'm hoping it will clear bogged down workloads. It's not that we're suffering mightily with the likes of Intel Core 2 Duo and AMD Opteron Shanghai and Phenom II, but it's time we were rocked by something bigger than a speed bump.
Nehalem isn't strictly new, but I hung back until I could see it in a 2P platform (meaning two CPUs, or two sockets, if that's clearer) that shows it to its best advantage. An early look at such a platform is generally supplied only by the chipmaker itself, with one exception: Apple. It's the only first-tier system maker that's willing to have its high-end machines held up as exemplars of a CPU or system architecture, knowing that, in stories like this one, the architecture is given higher billing than the system itself.
2P Nehalem came to me in the guise of Apple's eight-core Mac Pro. OS X's Activity Monitor shows a pair of Nehalems as a 16-core CPU. Hyper-Threading has returned to x86, but its role and potential are much changed since it went into rehab after the fall of the Pentium 4. With a smart OS scheduler and some smart programmers, Hyper-Threading could do some real damage this time around. You may recall that with single-core CPUs, Intel claimed that Hyper-Threading was capable of boosting performance up to 30%. Apple's published benchmarks show that an eight-core Nehalem, running at 2.9GHz, bests its prior 3GHz, eight-core Mac Pro. By my rough weighted averaging and using Apple's own numbers (not mine; that comes next), Nehalem turns in 60%-70% higher numbers.
Taking on faith that Apple's numbers are accurate — after taking heat for past sins, they tend to be — I'm left to wonder where Nehalem gets that extra performance. Perhaps it draws some from Hyper-Threading. Some of it unquestionably comes from DDR3 memory, the next step up from DDR2, which is the prevailing standard. AMD criticizes DDR3's higher latency, saying that comparing fast DDR2 to DDR3 is a wash. AMD asserts this while having DDR2 and DDR3 on its near-term roadmap. Nehalem's NUMA (Non-Uniform Memory Access) architecture, which assigns independent banks of memory to each CPU, may counterbalance latency to some extent, the way it helps counterbalance lower cycle speeds on DDR2 with Opteron. Simultaneous memory access does that. It's my view that the Nehalem CPU's on-chip memory controllers and NUMA probably make a bigger difference than the kick up to DDR3. However the magic is done, Apple is claiming a 2.4X rise in memory throughput. I'd like to see that.
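Short of running Apple's benchmarks, a crude user-space check is easy enough to sketch. The Python snippet below is a rough stand-in for a STREAM-style copy test, not Apple's methodology: time a large memory-to-memory copy, far bigger than any cache, before and after a hardware change and compare the sustained bandwidth.

import time
import numpy as np

# Rough, user-space stand-in for a STREAM-style copy test; not Apple's
# methodology, just a way to eyeball sustained memory bandwidth. The buffers
# are far larger than any cache, so the traffic really does hit main memory.
N = 100_000_000              # 100 million float64 values, about 800MB per array
a = np.ones(N)
b = np.empty_like(a)

best = float("inf")
for _ in range(5):
    t0 = time.perf_counter()
    np.copyto(b, a)          # one read stream plus one write stream
    best = min(best, time.perf_counter() - t0)

bytes_moved = 2 * a.nbytes   # read a, write b
print("~%.1f GB/s sustained copy bandwidth" % (bytes_moved / best / 1e9))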
Nehalem also marks the return of the Level 3 cache that was present on some NetBurst Xeon CPUs. Level 3 cache is shared by all cores on a chip. Intel has always been a big believer in (big) cache, but I think it had pushed Core 2 Duo's shared Level 2 cache as far as it could go. Three-level cache is the right idea, and this also cuts down substantially on the number of cache probes the system does to make sure one core doesn't have a different picture of memory than another. The last of the significant improvements is TurboBoost. That's a technology to which I need to devote more study. By Intel's pitch, TurboBoost senses when tasks ordinarily spread across cores can be handled by fewer cores, potentially running at a boosted clock speed. Apple tells me that it'll be hard to see this in action with user-level facilities like Activity Monitor. Fortunately Apple and Intel supply tools that allow a closer look.
Nehalem begs for that closer look.

AMD's Istanbul chip is a breakthrough

AMD can't say it, but Istanbul, its six-core, 45-nanometer processor, is ready. (Officially, it's set to be launched in the second half of this year).

Self-guarding storage for ultimate server security

Executable code and supporting data that doesn't change often — along with software and data that has an operational impact when inadvertently or maliciously altered — should be stored on media that is read-only or volatile (that is, reverts to a known stable image on reboot).
We place a lot of faith in operating systems to enforce user privileges, and it is possible to mount storage partitions and volumes as read-only, but these protections are far too easily overridden. The data on a "secure" server can be altered from a console by booting from a Linux or DOS CD or a USB flash or external hard drive. In clusters and farms with fail-over and load balancing, access to the console of any one server can compromise secure data shared by the entire enterprise.
I have always marvelled at the elaborate security and RAS mechanisms that IT is compelled to add to servers using inherently vulnerable, mutable software. Readily available hardware technology can make it difficult, or impossible, to alter the data that comprises a server's sacred ground: OS kernels, drivers, privileged loadable modules, and configuration parameters set at install time.
Why do these most sensitive files need to be mixed in with ordinary data when it's so easy to set them apart on media that can't be altered? And why do they need to be stored in multiple files at all? Once a configuration is solidified, a server can bootstrap from a read-only RAM image, like the ones created by laptops when they go into hibernation, that contains the kernel, drivers, and fixed configuration.
I'll tell you what brought this to mind. On a recent trip to an electronics store, I got sucked into an impulse buy: a 4GB SDHC flash card that cost $19.95. That's large enough to hold a 64-bit OS's privileged code, configuration data, and then some. The card is just about as fast as a hard drive, plenty fast enough to boot a server in a reasonable amount of time, and it's many times faster than optical. Making that storage accessible to my MacBook Pro, which doesn't have an SDHC slot, requires installing the card in a reader. The inexpensive flash reader I'm using is a simple, cheap USB device built from a single chip — a microcontroller — and a handful of discrete components. That got me thinking.
I'm experimenting now with a microcontroller from US semiconductor manufacturer Microchip called the PIC32. This is a one-chip, 32-bit, MIPS architecture (the CPU used in Silicon Graphics workstations in the 1990s) RISC computer with pipelined execution that's capable of executing over 150 million 32-bit instructions per second when running at just 80MHz. That's paltry compared to even a cellphone, but this is a device that only requires a power supply to operate.
Along with the CPU core, the 100-pin chip (six of them would fit on my fingernail) has 512KB of built-in flash memory (a 4GB address space and an external bus leave room for much more) and a pair of USB controllers. Microchip isn't Intel or AMD; there's no bragging about advanced fabrication processes. But the chip sells for about US$3 (NZ$5.89) in quantity, or US$9 if you want to buy just one.
Using an open source library supplied by the vendor, the PIC32 connects to any computer as a USB storage device, meaning that you can boot a server from a bitstream supplied by the chip. It can perform encryption, log arbitrary information, and intelligently alter boot data — for example, disabling drivers for external ports or supplying a safe or maintenance mode boot image if it suspects the system is compromised or faulty — all without high-level OS awareness or involvement and without the possibility of external tampering.
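The decision logic is simple enough to sketch. The Python below illustrates only the choice described here, serving the normal boot image when it checks out and falling back to a safe or maintenance image when it doesn't; real firmware would be C behind Microchip's USB library, and the file names and baseline hash are hypothetical.

import hashlib
from pathlib import Path

# Illustration of the decision logic only; on a real PIC32 this would be C
# firmware behind the vendor's USB mass-storage library. File names and the
# SHA-256 baseline are hypothetical stand-ins for whatever integrity check
# the controller performs on the boot data it serves.
NORMAL_IMAGE = Path("boot-normal.img")
SAFE_IMAGE = Path("boot-safe.img")        # maintenance-mode image
KNOWN_GOOD_SHA256 = "0" * 64              # placeholder baseline hash

def image_looks_intact(image: Path) -> bool:
    digest = hashlib.sha256(image.read_bytes()).hexdigest()
    return digest == KNOWN_GOOD_SHA256

def select_boot_image() -> Path:
    # Serve the normal image only if it still matches the recorded baseline;
    # otherwise fall back to the safe/maintenance image.
    if NORMAL_IMAGE.exists() and image_looks_intact(NORMAL_IMAGE):
        return NORMAL_IMAGE
    return SAFE_IMAGE

if __name__ == "__main__":
    print("Would serve:", select_boot_image())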
I have a pretty high standard for tamper-proof. You can seal a microcontroller, along with a 32GB MicroSDHC flash card, a lithium battery, and a piezo alarm, in a barnacle of plumber's putty (for example) that you can stick to the inside of your server chassis. If anyone snips the wires or tries to drill a hole in the package, or so much as opens the server case without an approved RFID, fingerprint, voice scan, key, password, or remote disarm command issued directly to the controller via Ethernet, the controller can scramble its memory, blow an on-chip fuse that renders the device useless, and blast a 130-decibel alarm in a matter of milliseconds. With very little effort, any attached external flash memory can be zapped with a voltage spike or short, or literally fried by heating a length of wire. Pack some gunpowder in there to set off the smoke detector in the server room while you're at it.
An even simpler setup with a 50-cent, 8-bit microcontroller can make a parallel ATA hard drive non-writable, or useless unless conditions that you've defined are met. Imagine equipping the controller with an accelerometer, which is also extremely cheap, so that a hard drive will burn itself out if a pattern of motion is detected indicating that the PC, or just the drive, is being moved.
The gunpowder may be a bit much, and programming your own microcontroller to secure a server isn't a project you're likely to take on. On that same trip to the electronics store, I bought a $50 wi-fi router that can be flashed to run open source firmware, making it possible to alter or block LAN traffic on the fly in a manner that's opaque, and the device can "brick" itself, completely cutting off external access in response to an in-house threat. Security shouldn't kick in after the OS boots. That requires assuming that the OS is trustworthy, and practically all exploits are based on compromising the OS. Even just booting the OS from a CD or DVD that's sealed into an optical drive affords more protection than any software you can install on your hard drive or networked storage. If you can't change it, you can't hack it.

AMD spins Moore's Law in IT's favour

In 64-bit servers, AMD and Intel will soon be on the same page, architecturally speaking. But these similar ends were reached by very different means.

Parallelisation: the next performance horizon

Server and workstation innovations like multi-socket and multi-core technology have steadily boosted performance density — they've increased the work that can be done in a given amount of space within a level power budget. This serves a fine purpose, but it stops short of the more rewarding objective of releasing all of the compute power stored in a watt. 2009 will bring landmark changes to processor and system architectures. In the x86 space, AMD and Intel have already gotten an early start with Q4 deliveries of Shanghai and Core i7. As advanced as these new architectures are, we'll see an incremental increase, not a history-making leap, in common benchmarks and production performance. The problem isn't that we lack incredible hardware. It's that software hasn't kept pace.
A history-making leap in x86 server, workstation, desktop and notebook performance is approaching. It has nothing to do with chip manufacturing process shrink, clock speed, cache, or DDR3 memory. It's about an intelligent way to harness hardware advances for something better than working around that two-ton millstone of PC design: the fact that every process, while active, expects to own the entire PC, so elaborate means are required to permit serial ownership of the physical or logical system. That arrangement foils efforts to unlock the potential of multi-socket, multi-core, multi-threading, and coprocessing hardware. That potential is defined as the system's ability to execute tasks in parallel, and on PCs, we're miles from the ideal that modern hardware makes possible.
CPUs do the best they can to keep multiple independent execution units inside the processor busy by rescheduling instructions such that integer, floating point, and memory access operations happen at the same time. Compilers, some better than others (raises a glass to Sun), do what they can to optimise code so that the compiled application operates efficiently. Then it's up to the operating system.
Operating systems deal CPU ownership to processes in round-robin fashion. Multiple cores provide a brute-force approach to improved performance by giving the OS multiple places to park processes that expect to have the system to themselves, but a closed, generic OS is a poor matchmaker between application requirements and CPU resources. Virtual machine managers play to this weakness by gating guest OSes' access to system hardware, effectively restricting the number of places an OS can park processes for the sake of parallel processing.
The problem, expressed in distilled form, is that the CPU, the compiler, the operating system, and the virtual machine manager all pull various strings toward a similar end: the efficient shared use of fixed resources. Since every actor fancies itself in charge of this goal, none is. We'll have more cores and more sockets, and live migration to put processes on idle cores regardless of location, but we should also start attacking the problem from another direction.
At a high level, a fixed, mandated, thoroughly documented, deeply instrumented framework is the top priority. A pervasive set of high-level frameworks from the OS maker serves two purposes: Its existence and documentation obviate the need to reinvent anything already functional in the framework, and low-level code within the framework can be changed without disrupting applications. Look to OS X for frameworks in action. Frameworks are system and inter-framework aware. Sweet global optimisations can be applied across the entire framework stack, and between frameworks and the OS, that aren't possible with libraries that don't dare make assumptions about their environment.
I'd like to see the relatively fixed formula for determining the amount of time a process is permitted residence on a CPU (the quantum) replaced with an adaptive approach. The longer a process remains in residence on a core, the more compile-time optimisations can come into play. Code can be optimised such that opportunities for true parallel operation are identified and exploited, but getting the greatest bang from this technique requires that the OS take some advice from the compiler about how to schedule parallelised applications. That developers must now do this by hand is a limiting factor in parallelisation's use.
Another approach to parallelisation is to set aside cores and other resources as off limits to the OS scheduler. If you know that you're operating on a 100% compute workload, you could safely cordon off an x86 core or two, a block of GPU cores and the nearest available memory for a coprocessor. Imagine the effect that setting aside a logical coprocessor just for pattern matching would have on server performance. It would accelerate databases, data compression, intrusion detection, and XML processing, and if it were wired into the framework that everyone uses for pattern matching, the same code would work identically whether the coprocessor were present or not. The bonus is that such logical coprocessors require no additional hardware.
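You can approximate the idea today, at least on Linux. The sketch below is a hypothetical example rather than anything shipping: it dedicates one core to a pattern-matching worker by pinning the process to it with the scheduler-affinity call, while keeping the rest of the system off that core is left to something like the kernel's isolcpus boot parameter. The core number and the pattern are invented for illustration.

import os
import re
from multiprocessing import Process, Queue

# A rough, Linux-only approximation of the "logical coprocessor" idea: pin a
# pattern-matching worker to one core that the rest of the system has been
# told to avoid (for instance via the kernel's isolcpus= boot parameter).
# The core number and the pattern here are illustrative, not a real workload.
RESERVED_CORE = 3

def pattern_worker(jobs: Queue, results: Queue):
    os.sched_setaffinity(0, {RESERVED_CORE})        # bind this process to the reserved core
    matcher = re.compile(rb"<item\s+id=\"(\d+)\"")  # stand-in for IDS/XML/DB pattern work
    while True:
        chunk = jobs.get()
        if chunk is None:
            break
        results.put(matcher.findall(chunk))

if __name__ == "__main__":
    jobs, results = Queue(), Queue()
    worker = Process(target=pattern_worker, args=(jobs, results))
    worker.start()
    jobs.put(b'<item id="42"/><item id="7"/>')
    jobs.put(None)
    print(results.get())                            # [b'42', b'7']
    worker.join()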
I've liked this concept so much that I've looked into making it work. Operating systems don't like being told that there are parts of a system they can't touch, and developers would have to craft coprocessor code carefully so that it touches nothing but the CPU and memory. It's effectively an embedded paradigm, one that I believe is underutilised but for which OSes and compilers are not tuned.
Fortunately, I don't have to carry my logical coprocessor beyond theory. For years, I've gotten the stinkeye from vendor reps and colleagues when I suggest that workstation and gamer-grade graphics cards belong in servers. Forget about using the display; that's a distraction. 3D graphics cards are massively parallel machines capable of remarkable feats of general computing. Their computing not only runs independently and parallel to the rest of the system, but multiple tasks are run in parallel on the card itself. The snooty dismiss as irrelevant the magic evident in a desktop PC's ability to take users through a vast, complex and realistic 3-D landscape with moving objects abiding by laws of physics and (sometimes) convincingly intelligent adversaries. If you took the 3D card out of this, you'd lose a lot more than the graphics. The PC, no matter how fast, wouldn't be able to handle a fraction of the computing that has nothing to do with pixels.
That's the ideal in parallelisation, and unlike my logical coprocessor concept, the hardware is already in place to tap GPUs for server-grade computing. AMD has the ability to gang four fire-breathing workstation graphics cards together through technology it calls CrossFireX. In this configuration, a 2U rack server could have access not only to 16 AMD Shanghai Opteron cores and 64 GB of RAM, but hundreds of additional number-crunching cores (depending on how you count them) running at 750 MHz, with 8GB of GDDR5 memory all to themselves. It takes software to unlock that, and we're finally turning the corner on industry consensus around a standard called OpenCL for putting GPUs to work for general computing.
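To give a flavour of what that looks like from the host side, here is a minimal sketch using the pyopencl binding: a trivial element-wise kernel, nothing like server-grade pattern matching, but enough to show general-purpose code running across the GPU's cores in parallel.

import numpy as np
import pyopencl as cl

# A minimal taste of OpenCL from the host side, via the pyopencl binding: a
# trivial element-wise kernel, just to show general-purpose work executing in
# parallel on whatever OpenCL device the context picks up (ideally a GPU).
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

program = cl.Program(ctx, """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);   // one work-item per element
    out[i] = a[i] + b[i];
}
""").build()

program.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)

out = np.empty_like(a)
cl.enqueue_copy(queue, out, out_buf)
assert np.allclose(out, a + b)
print("GPU result checks out for", n, "elements")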
That's heartening, but we still need a change in paradigm. There is enormous potential in the GPU, but similar potential can be extracted from resources that standing servers already have. As long as parallelisation has been around, it remains a stranger to the PC. What a waste.

Down coring: an important innovation by AMD

The 2008 MacBook Pro sitting in my lap is my favourite currently-available commercial notebook, but if you asked me what my all-time favourite is, I'd have to say it's Apple's PowerBook. I could run it for a solid six hours on one charge, something that no modern notebook with a single battery can do, even though it had an old-fashioned backlight and a GPU that, unlike MacBook Pro's, could not downshift to a low-power mode.
What made PowerBook such a marvel was that Apple brought embedded system principles to its laptop design. From the single-core, 32-bit 1.6GHz PowerPC to Apple's power-pinching custom silicon, PowerBook was made to run forever on a charge. By PC standards, a 1.6GHz clock was dog slow, but Apple believed that CPU speed was irrelevant as long as the user experience was satisfying. Apple was right. The Mac and OS X were designed to handle events very rapidly, so much so that users perceived little lag, if any, between action and reaction in the GUI where disk access was not involved.
Apple dumped the PowerPC, but slow 32-bit CPUs are more popular and pervasive than ever. They're in cars, aircraft, medical equipment, DVD players, Fibre Channel routers and satellite receivers. A 32-bit CPU running Linux powers my HDTV, and I needn't mention that slow, cool-running microcontrollers (microprocessors with integrated peripherals) are in every cellphone and smartphone in use. The prize characteristic of these little CPUs is their minuscule latency. They're relatively pathetic at general-purpose computing — they are not number crunchers. Their specialty is to react so rapidly to a great many stimuli that the system appears to have zero latency. Microcontrollers watch sensors, paint the screen, sift through the binary chatter of multiple radios, encode and decode voice data in real time, stream high-bit-rate media sources, manage storage... It seems like a lot to ask of a 400MHz CPU powered by a featherweight battery, but a microcontroller is designed to excel at two things: sprint and sleep. This characteristic distinguishes embedded and mobile systems from general-purpose computers. Your smartphone reacts. Your computer computes. If this is so, then what does a server do?
For certain workloads, I submit that a server works better based on a low-latency microcontroller model than a high-performance supercomputer model. For example, an edge server needs to filter and direct network packets at wire speed. This is not compute-intensive work, as evidenced by the fact that a black box firewall/router can run on a 32-bit microcontroller with a DC power supply. However, because servers and server OSes are poorly designed for this work, the CPU load seems to indicate that more systems, or more powerful systems, must be thrown at such duties. One IT guy I chatted up at a conference ran several racks of systems that served dynamic web pages to external users. That operation had gone fully virtual and reaped enormous successes. I asked him what criteria triggered a scale-up. "Latency," he said. Interestingly, aiming more resources at one latency-sensitive task lowers overall headroom, and I'll wager that once a relationship between latency and equipment was established, budgets for equipment started climbing precipitously.
We are almost heading the right way. AMD has introduced a concept I've long awaited: Down Coring, the powering down of cores in a multicore system. At present, this requires a BIOS settings change and a reboot, and I've only seen it implemented in client machines so far. In a big 16-way Shanghai box, how much down-coring support will be supplied? If you fall back to one core, what of the others? I submit that they should be fully halted, but at present, each socket must have at least one core running to keep memory accessible. With a sufficiently smart OS, all cores but the one to which the South Bridge I/O controller is connected could go offline. The OS would have to unmap memory assigned to powered-down cores, but the process differs little from that required to prepare migration to a dissimilarly configured server.
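There is a rough runtime analogue on Linux already: the kernel's CPU-hotplug interface in sysfs, which a sufficiently smart management script could drive without a reboot. The sketch below is illustrative only; it needs root, core numbering varies by platform, and deciding which cores can safely be parked is exactly the hard part described above.

from pathlib import Path

# Rough Linux analogue of down coring: take cores offline at runtime through
# the kernel's CPU-hotplug interface in sysfs. Requires root, and cpu0 (plus
# whichever core keeps memory and I/O reachable) stays up. Which cores you
# may safely park is workload- and platform-specific.
SYSFS_CPU = Path("/sys/devices/system/cpu")

def set_core_online(core: int, online: bool) -> None:
    (SYSFS_CPU / ("cpu%d" % core) / "online").write_text("1" if online else "0")

def park_cores(cores) -> None:
    for core in cores:
        set_core_online(core, False)

if __name__ == "__main__":
    # Illustrative only: park cores 4-7 of an eight-core box during quiet hours.
    park_cores(range(4, 8))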
This can be done on an even grander, or rather smaller, scale. As an exercise, I turned a machine with a 16MHz 8-bit CPU into a simple internet whitelist mail manager. After about 15 minutes with no whitelisted mail, the mail servers proper were powered down. It was not until a whitelisted request arrived that the rough equivalent of one server core was awakened, at which point the whitelisted mail was routed as normal. I resorted to Windows' core affinity to make this happen, so power savings weren't significant, but the drop in server CPU utilisation and heat was dramatic.
It was only a proof of concept because the server took so long to power up that subsequent messages would be missed, and delivering only whitelisted mail is not workable in production. Still, it's an extreme, yet workable example of the power of downscaling. If a battery-powered 8-bit single board computer can do this, it's easy to imagine something with the brawn of a smartphone doing far more, even running blacklist and DNS checks on e-mail and serving static "Please wait..." Web pages (see my Green Delay post) while it waits for the server. There are several server tasks that are event driven, rather than compute driven.
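The front-end logic itself is trivial, which is the point. Here is a Python sketch of just that piece: check the sender against a whitelist and, only on a match, wake the real mail server. Wake-on-LAN is assumed as the wake mechanism (the original exercise doesn't specify one), and the whitelist and MAC address are placeholders.

import socket

# Sketch of the event-driven front end only: check a sender against a
# whitelist and wake the real mail server just in time. Wake-on-LAN is an
# assumed wake mechanism; the whitelist entries and MAC address are placeholders.
WHITELIST = {"editor@example.com", "colleague@example.com"}
MAIL_SERVER_MAC = "00:11:22:33:44:55"

def wake_server(mac: str) -> None:
    payload = bytes.fromhex("ff" * 6 + mac.replace(":", "") * 16)  # WoL magic packet
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, ("255.255.255.255", 9))

def handle_incoming(sender: str) -> bool:
    if sender.lower() not in WHITELIST:
        return False             # drop silently; blacklist/DNS checks could go here
    wake_server(MAIL_SERVER_MAC)
    return True                  # hold the message until the server answers, then relay

if __name__ == "__main__":
    print(handle_incoming("editor@example.com"))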
When a current PC server spends most of its time waiting for work to do, it is falling into event-driven mode, and yet all of its CPU sockets are powered up and all memory, even unused memory, draws power. PCI Express and USB peripherals can be ordered into a power-saving state, but as a rule this is not done with servers. An AMD server can effectively be placed in low-power event mode by suspending all non-essential processes. Eventually all of these suspended processes and their RAM will be paged out to disk, providing an opportunity to defragment storage down to that which is attached to the minimal number of cores. You'd find that there are more of those useful quiet times than you expect. I do know that it takes less than a second to go from this event-only mode to one kicking on all cylinders.
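On a Unix-like server, the "suspend all non-essential processes" step can be demonstrated with ordinary job-control signals. The sketch below is a bare illustration; the PID list is hypothetical, and deciding what counts as essential is the real work.

import os
import signal

# Sketch of the "suspend all non-essential processes" step using ordinary
# POSIX job-control signals (Linux/Unix). The PID list is hypothetical; in
# practice you'd build it from /proc or a process supervisor, and the
# essential set (init, sshd, the monitoring agent, and so on) needs real care.
def set_event_mode(non_essential_pids, enter: bool) -> None:
    sig = signal.SIGSTOP if enter else signal.SIGCONT
    for pid in non_essential_pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass   # process already exited; nothing to do

# Usage (hypothetical PIDs):
#   set_event_mode([4321, 4410, 5177], enter=True)    # quiesce
#   set_event_mode([4321, 4410, 5177], enter=False)   # back to full duty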
Virtualisation's potential role in this is unclear. I know that it can do a real-time migration to a larger server, but can it squeeze into a smaller one? This concept runs contrary to the current enterprise mentality that keeps unused servers ever-ready to leap into action, and considers an idle virtual machine to be an overall energy win. No, to get that, you need to strip away real resources, not virtual ones. The embedded world offers many lessons for green server designers.

Lack of wi-fi doesn't hinder BlackBerry Storm

I had already filed my review of the BlackBerry Storm, aka BlackBerry 9530, when Galen Gruman, an InfoWorld executive editor, pointed out a glaring omission in my review: I neglected to mention that BlackBerry Storm lacks wi-fi.
My initial reaction was along the lines of "how could they?". Come on, RIM, did you not survey the touchscreen market before you tossed your hat in? Wi-fi is just there on the mobile spec sheet template, right next to 3G, GPS, MP3 and H.264. Everybody expects it. I expect it. I shook my head in disbelief that an industry leader like RIM would let this category-defining feature go unimplemented.
It became my duty to take RIM to task for hobbling the otherwise impressive BlackBerry Storm. As I rewrote, my jerked knee began to relax, and the tone changed from "how could RIM leave out wi-fi?" to a more relevant question: Would anyone but a reviewer gripe about the absence of wi-fi in a BlackBerry that is otherwise stuffed with more features than you can find in a $200 handset?
If I hadn't taken a breath and taken the time to handle BlackBerry Storm and examine its specs again before I rewrote the review from the "absence of wi-fi" perspective, I'd have missed the point. Any product can be turned inside out and scrutinised for its apparent lacks. Sometimes it is a product's veering from apparent market-defined givens that makes it a hit. The wisdom of BlackBerry Storm's wi-fi design-out is supported by two cases in point, the iPhone 3G and Nintendo Wii, that set a fine precedent.

Review: BlackBerry Storm bridges business and lifestyle

The new BlackBerry 9530, or Storm, has the familiar fingertip navigation and flick-to-scroll gesture common to most widescreen phones. Apart from that, the Storm is very much its own device, unmistakably a BlackBerry in its strong messaging, connectivity, and extensibility, but carried to a new level of usability by a touchscreen display and a redesigned GUI.

Shanghai chip gets AMD back into x86 battle

There's a good deal that's special about AMD's new Shanghai server CPU. It's fabulous science, and fun for those of us who get dewy-eyed over the prospect of a 25% faster world switch time and immersion lithography. It makes the x86 battle interesting again because it carries AMD into territory that it must fight hard to win — the two-socket (2P) server space — and where innovation is sorely needed. AMD beat Intel's next-generation Nehalem server architecture to market while closing performance, price and power-efficiency gaps between Core 2 and Shanghai. Just as it did in the old days, AMD now claims that its best outruns Intel's best despite having a lower clock speed.

Why Windows 7 will be better than Vista

In the elevator to my hotel room, a 20-something man, too tanned and relaxed to be in the tech industry, spied the massive logo on the shopping bag-like package that Microsoft doled out at its Professional Developers Conference. "Windows 7, huh?"
"There's always another one," I said.
Without missing a beat, he replied dryly, "They need another one". This gentleman is not a registered PC. Ironically, my package's straps were tied around the handle of a less ostentatious rolling bag that cradles a new unibody MacBook Pro. Some things aren't worth getting into in an elevator.
It's really difficult for a savvy user not to bring a cynical, or at least sceptical, viewpoint to Windows 7 after the foot-shooting that was pre-SP1 Vista. What many end-users will see in Windows 7 is an effort to Mac-ify Windows, right down to enabling multitouch gestures on Tablet PCs, and copying Apple is instant, certain buzzkill.
Apple claims that Microsoft is suffering a drought of original ideas. Reading between the lines, Microsoft counters that Vista, before Service Pack 1 (it's proud of SP1 and later), was a mess for many reasons, but in part because every yahoo on the internet was invited to transmit his gripes and fantasies directly to Microsoft product managers, who were then duty-bound to take them seriously.
