SAN FRANCISCO (11/17/2003) - Although BI (business intelligence) gives data-mining pros the opportunity to answer strategic questions, it's not a very interactive or iterative process. That's the domain of BA (business analytics).
But even BA tools require you to know the right questions in advance and to be able to see patterns in presented information so you can ask the right follow-up questions.
One of the factors that limits analytics software is that it tends to be aimed either at a large swath of domain experts or at a small cadre of statistics gearheads. This leaves a gap that BA vendors are trying to bridge by building tools that have sophisticated analytical capabilities wrapped into interfaces that people without professional statistical knowledge can steer skillfully.
KXEN Inc.'s Analytic Framework 3.0.2 is a valiant effort at filling that gap. It supports iterative, interactive data exploration, bringing together data mining and BA with a high-powered engine that rips through large data sets surprisingly fast.
Analytic Framework will kick some serious grass in analyzing targeted applications and big data stores, but its user interface for midrange analytics professionals has room for improvement. Nevertheless, Analytic Framework's significant data-crunching capabilities and speed can crush BA challenges that would swallow most analytics software as easily as a petit four.
Analytic Framework's mojo resides in its foundation: the statistical learning theory of mathematician Vladimir Vapnik and the senior mathematicians on KXEN's board. The product's architecture is built on a set of eight analysis modules -- ranging from data cleaning to time-series data analysis -- accessible through an API or the Modeling Assistant wizard.
The Modeling Assistant guides you through problem specification, including defining a sampling strategy, loading data from standard file formats, and specifying one or more target variables (such as "profitable customer" or "failed circuit board") and variables to omit from the run. The software then uses those settings to train and build a model and report your results.
Although it's hard to spin out of control with the wizard, the system has many prescribed constraints about how imported data must look. It's not a truly muddy interface, but users will need a serious understanding of what this powerhouse uses for fuel, and that requires rigorous training.
Pluses and minuses
The two main analysis modules are K2R (Robust Regression) and K2S (Smart Segmenter). K2R creates predictive models, devouring historical data to calculate future probabilities.
K2S creates cluster analyses for market-segmentation efforts -- critical for classic market analysis and partitioning the total universe of customers into trait-sharing groups. Many tools expect the analyst to dictate the definition of a cluster, but K2S defines a model for the analyst, saving time and effort. Instead of guessing who your prime clusters are, the software maps rows of data into clusters for fast, true analysis.
The tool has two other noteworthy capabilities. It will omit some data from any given cluster, and it will include some instances in multiple clusters, reflecting the reality outside computers better than the pure either/or arrangements of most BA tools. This makes for a powerful segmentation tool with a leading set of clustering methods.
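To make the overlapping-membership idea concrete, here is a minimal Python sketch. It is not KXEN's algorithm or API: the cluster names, fields, and thresholds are hypothetical, and the rules are written by hand rather than learned from data. It shows only the shape of the result, in which a row can land in several clusters, one cluster, or none.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical cluster definitions (illustration only, not produced by K2S):
# each cluster is a rule that a row either satisfies or does not.
CLUSTER_RULES: Dict[str, Callable[[Row], bool]] = {
    "high_spenders": lambda r: r["annual_spend"] >= 5000,
    "frequent_buyers": lambda r: r["orders_per_year"] >= 12,
}


def assign_clusters(row: Row) -> List[str]:
    """Return every cluster whose rule the row satisfies (possibly none)."""
    return [name for name, rule in CLUSTER_RULES.items() if rule(row)]


customers = [
    {"annual_spend": 8000, "orders_per_year": 20},  # satisfies both rules
    {"annual_spend": 6000, "orders_per_year": 2},   # satisfies one rule
    {"annual_spend": 100, "orders_per_year": 1},    # satisfies neither rule
]

memberships = [assign_clusters(c) for c in customers]
for customer, clusters in zip(customers, memberships):
    print(customer, "->", clusters)
```

The third customer belongs to no cluster at all, which is the behavior a strict either/or partition cannot express.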
I would like to see improvement, however, in the ability to see the definition of each cluster. Currently, you must save the model as SQL and read the statements; it's perfectly functional, but evolving toward natural-language description would let a broad range of domain experts and executives readily understand the K2S information.
Analytic Framework also simplifies the evaluation of a drafted model by focusing on a pair of numbers from statistical learning theory. The first quantity is KI, which indicates the proportion of the target variable's information that is "explained" by the other variables -- in other words, how "true" the model is.
The second quantity is KR, which measures the current model's capability of achieving the same performance when used on another set of data with the same attributes -- that is, how reusable the model is. Because there are only two key measures against which to optimize, Analytic Framework greatly simplifies the user's work, opening the information up to a wider, less sophisticated user base.
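The review does not give KXEN's formulas for KI and KR, so the Python sketch below uses stand-ins to show the intuition only. It assumes R-squared as a proxy for "proportion explained" and the gap between training and holdout R-squared as a proxy for reusability. The data, the function names, and both proxies are assumptions, not the product's actual calculations.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys accounted for by the fitted line."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up data: one set to build the model, one held back to check it.
train_x, train_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
hold_x, hold_y = [6, 7, 8, 9], [12.2, 13.8, 16.1, 18.0]

slope, intercept = fit_line(train_x, train_y)

# Stand-in for KI: how much of the target the model explains on training data.
explained = r_squared(train_x, train_y, slope, intercept)

# Stand-in for KR: how closely performance on unseen data matches training.
holdout = r_squared(hold_x, hold_y, slope, intercept)
stability = 1 - abs(explained - holdout)

print(f"explained={explained:.3f} holdout={holdout:.3f} stability={stability:.3f}")
```

A model that memorizes its training data would score high on the first number and low on the second, which is why it helps to read the pair together.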
To improve the analytics model further, Analytic Framework delivers reports on each variable's contribution, giving the user an opportunity to weigh variables and to exclude or combine them to create different scenarios. The reporting here is both graphical and numeric, and although simple in presentation, it will be best absorbed and acted on by a more BA-sophisticated user.
Export and consolidate
After building a model that meets your criteria, you can use the Modeling Assistant and KMX (Model Export) module to save it for reuse or to export the model as program code in several languages, including C, ANSI SQL, SQL for SQL Server, XML, and formats for products from SAS and SPSS. This export capability separates the analysis effort from the coder's domain while serving both in concept.
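To illustrate what exporting a model as SQL can look like in principle, the Python sketch below renders a simple linear scoring model as a SELECT statement. It does not reproduce KMX output: the function `model_to_sql`, the table, the column names, and the weights are all hypothetical.

```python
from typing import Dict


def model_to_sql(table: str, intercept: float, coefficients: Dict[str, float]) -> str:
    """Render a linear scoring model as a SQL SELECT over the given table.

    Hypothetical helper for illustration; a real exporter would also handle
    encoding of categorical fields, missing values, and identifier quoting.
    """
    terms = [repr(intercept)]
    terms += [f"{weight!r} * {column}" for column, weight in coefficients.items()]
    expression = " + ".join(terms)
    return f"SELECT *, {expression} AS score FROM {table}"


sql = model_to_sql(
    "customers",
    0.5,
    {"annual_spend": 0.002, "orders_per_year": 0.1},
)
print(sql)
```

Once the model lives in a statement like this, a database developer can run it against production tables without touching the analytics tool, which is the separation of roles the export feature aims for.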
Analytic Framework also has a module for consolidating data, called KEL (Event Log). KEL supports efforts that require preprocessing, such as summing or averaging tasks. It takes some getting used to, especially the tight prescription of date formats, but otherwise works exactly as advertised.
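The kind of preprocessing described here, rolling an event log up into per-customer sums and averages, can be sketched in a few lines of Python. This is not KEL's interface: the record layout, the function `aggregate_events`, and the single accepted date format (`YYYY-MM-DD`) are assumptions chosen to show why a strict date format matters.

```python
from collections import defaultdict
from datetime import datetime

# Assumed date format for this sketch; the review does not say what KEL expects.
DATE_FORMAT = "%Y-%m-%d"

# Hypothetical event log: (customer id, event date, purchase amount).
events = [
    ("c1", "2003-01-05", 20.0),
    ("c1", "2003-02-11", 40.0),
    ("c2", "2003-01-20", 15.0),
]


def aggregate_events(rows):
    """Sum and average the amounts per customer, rejecting malformed dates."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer, date_text, amount in rows:
        # strptime raises ValueError when the text does not match DATE_FORMAT.
        datetime.strptime(date_text, DATE_FORMAT)
        totals[customer] += amount
        counts[customer] += 1
    return {
        customer: {"sum": totals[customer], "mean": totals[customer] / counts[customer]}
        for customer in totals
    }


summary = aggregate_events(events)
print(summary)
```

Passing a date such as `01/05/2003` to this sketch raises a `ValueError` instead of being silently misread, which is the trade-off a tightly prescribed format makes.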
Overall, KXEN Analytic Framework 3.0.2 is a powerful package with mathematical muscle unequaled by most BA tools. It is most valuable for processing databases with more fields than most tools can handle, but the real benefits come when it's used as one piece of an overall solution: Integrated with existing BI tools or other data mining systems, this system can significantly magnify strategic insights.