Like many a developer and network administrator, I honed my Perl programming chops doing the kinds of data reduction and analysis for which that language is ideally suited.
Yet no amount of Perl magic can save the day if your logs capture too little data, or the wrong data. And that’s a bit of a catch-22. To do good sleuthing you’ve got to have deployed the right kinds and levels of instrumentation. But as the data begins to tell its tale, it suggests the need for more or different instrumentation. Because the feedback loop is often attenuated, it’s a real challenge to strike the right balance.
Why not just log everything? Even today’s capacious disks fill up quickly when you turn your loggers’ dials to 10. So adaptive logging is becoming a hot research topic, especially in the field of security. The idea is to let your loggers idle until something suspicious happens, then crank them up. Of course, defining what’s suspicious is the essence of the challenge. Network forensics experts say that it takes, on average, 40 hours of analysis to unravel a half-hour of attack activity — and that’s after the fact. Will autonomic systems someday be able to generate and test hypotheses in real time, while adjusting instrumentation on the fly? I hope so, but I’ll believe it when I see it.
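The adaptive idea — idle until something looks suspicious, then crank up the detail — can be sketched in a few lines. This is a minimal illustration, not a real detection system: the trigger heuristic (repeated authentication failures) and the five-minute escalation window are assumptions I’ve chosen for the example.

```python
import logging
import time

class AdaptiveLogger:
    """Idle at WARNING; escalate to DEBUG for a window after a trigger.

    The 'suspicious' heuristic and the window length below are
    illustrative assumptions -- defining what counts as suspicious
    is, as noted, the essence of the challenge.
    """

    def __init__(self, name, window_seconds=300):
        self.logger = logging.getLogger(name)
        self.logger.setLevel(logging.WARNING)
        self.window_seconds = window_seconds
        self.escalated_until = 0.0

    def suspicious(self, event):
        # Placeholder heuristic: a burst of auth failures looks suspicious.
        return event.get("type") == "auth_failure" and event.get("count", 0) >= 3

    def observe(self, event):
        # Escalate on a trigger; fall back once the window expires.
        if self.suspicious(event):
            self.escalated_until = time.time() + self.window_seconds
            self.logger.setLevel(logging.DEBUG)
        elif time.time() > self.escalated_until:
            self.logger.setLevel(logging.WARNING)
        return self.logger.level
```

In use, every incoming event would pass through `observe`, so an attack burst quietly flips the dial to 10 and lets it drift back down afterward.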
In the field of web analytics, it’s been fairly straightforward to correlate user interaction with the clickstream recorded in a web server’s log, but the changing architecture of web software now threatens old assumptions. When I gave a talk describing how rich internet applications can converse with web services, a web developer in the audience asked, “Where are the logs?” That’s a good question. Local interaction with a Java or .NET or Flash application won’t automatically show up in the clickstream, nor will SOAP calls issued from the rich client. You have to make special provisions to capture these events. That’s eminently doable, but I worry that if logging isn’t always on by default, vital information will often go unrecorded. On the other hand, clickstreams don’t necessarily correlate well to behaviours you’d like to understand. The XML message patterns of a services-based application may enable higher-level and more meaningful analysis.
It’s fun to speculate, but meanwhile our systems keep accumulating logs. How can we deal with them more effectively? Over the years I’ve developed some simple strategies. In the security realm, for example, I like to watch the size of my logs day by day. That’s an easily obtained baseline; deviation from it tells me to look under the hood.
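The day-by-day baseline check is easy to automate. Here’s one way it might look, assuming you already have a chronological list of (day, size) pairs; the week-long warm-up and the three-sigma cutoff are arbitrary choices for the sketch, not a rule.

```python
from statistics import mean, stdev

def flag_outliers(daily_sizes, threshold=3.0):
    """Flag days whose log size deviates from the running baseline.

    daily_sizes: chronological list of (day, bytes) pairs.
    A day is flagged when it falls more than `threshold` standard
    deviations from the mean of the days before it. The 3-sigma
    cutoff and one-week warm-up are illustrative assumptions.
    """
    flagged = []
    for i, (day, size) in enumerate(daily_sizes):
        history = [s for _, s in daily_sizes[:i]]
        if len(history) < 7:              # wait for a week of baseline
            continue
        mu, sigma = mean(history), stdev(history)
        if sigma and abs(size - mu) > threshold * sigma:
            flagged.append(day)
    return flagged
```

A flagged day doesn’t tell you what happened; it tells you which day’s logs deserve a look under the hood.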
When you want to do web analytics, here’s a tip: intelligent namespace design can dramatically simplify the chore. If you consistently embed categories, dates, or other selectors into your URLs, it’s easy to view your logs along those dimensions. I steer clear of content management systems and log analysis tools that don’t offer such flexibility.
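To see why namespace design pays off, consider URLs of the (hypothetical) form `/<category>/<yyyy>/<mm>/<slug>`. Once the selectors live in the path, slicing the logs along any of those dimensions is a one-liner per query. The URL scheme and the simplified log-line format here are assumptions for the sketch.

```python
import re
from collections import Counter

# Assumed URL scheme: /<category>/<yyyy>/<mm>/<slug>
PATH = re.compile(r"^/(?P<category>\w+)/(?P<year>\d{4})/(?P<month>\d{2})/")

def tally(log_lines, dimension):
    """Count requests along one URL-embedded dimension.

    Assumes simplified log lines of the form 'METHOD /path';
    `dimension` is one of 'category', 'year', or 'month'.
    """
    counts = Counter()
    for line in log_lines:
        path = line.split()[1]
        m = PATH.match(path)
        if m:
            counts[m.group(dimension)] += 1
    return counts
```

With a scheme like this, “traffic by category” or “traffic by month” falls out of the URLs themselves — no lookup table joining opaque IDs to meaning, which is exactly the flexibility I look for in a content management system.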
Logs can flood us with information, or they can tell us compelling stories. We can influence the outcome by artful and iterative refinement of the data we collect.
Udell is lead analyst for the InfoWorld Test Center.