Computerworld

Teradata brings MapR Hadoop into the data warehouse

Enterprises benefit from extending their data warehouse architectures to handle big data workloads, Teradata says

Many businesses find that their enterprise data sets have grown too voluminous to fit into even the largest data warehouses. Now, companies running these overstuffed data stores have an on-ramp to big-data-style processing through a combined effort between analytics systems supplier Teradata and Apache Hadoop distribution provider MapR.

The partnership is aimed at letting users of systems based on the Teradata Unified Data Architecture seamlessly use MapR's distribution of the Hadoop open-source software framework for the distributed processing of big data.

Combining the control tools and support of Teradata -- a long-familiar name to enterprises -- with a commercially refined Hadoop distribution such as MapR's offers organizations a potentially easy way to incorporate big data analysis into their operations, without the administrative headaches of setting up and running Hadoop from scratch, Teradata says.

The partnership will also benefit the dozens of enterprises already using both MapR and Teradata.

Teradata software such as Teradata QueryGrid and Teradata Loom, designed to orchestrate work processes, will work with the MapR software. To handle the integration, Teradata has prepared a connector to MapR. The connector allows organizations that have QueryGrid to use MapR to process data from Teradata databases and other sources.

The two companies have also reconciled their roadmaps so their respective products can continue to be integrated with each other.

A leader in the field of high-performance data warehouse products, Teradata has been expanding the scope of its technology to include sources outside of data warehouses. Data warehouses collect data from databases so that it can be scrutinized with complex analysis. The Apache Hadoop data processing platform, often packaged in commercial releases by companies such as MapR, can hold vast reams of data, typically more than can be stored in a data warehouse.

Extending data warehouse and other database tools to incorporate the relatively new Hadoop technologies is becoming an increasingly common strategy for introducing Hadoop to the enterprise. Earlier this week, Hewlett-Packard announced that it had integrated its Vertica column-oriented database with Hadoop, allowing users to query data stored in Hadoop with the widely used SQL (Structured Query Language).

Although not a Teradata customer, security services company Solutionary has used MapR to expand its analysis capabilities beyond traditional database analysis tools, while cutting hardware and software costs.

The company has two sets of customer data: It stores all the security and event logs from its customers' networks on a set of file servers, and keeps another set of metadata about these events in a data mart running on Oracle Real Application Clusters. The company used MapR to merge these two sets of data. MapR also provided a way to use commodity storage to keep hardware and software licensing costs down, while giving the company more computational power to do predictive modeling. Using Hadoop also allows Solutionary to offer additional features to its customers, such as a log search.
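As a rough illustration only (the article does not describe Solutionary's actual implementation), merging raw log records with their metadata amounts to a join on a shared key. The Python sketch below assumes a hypothetical event_id field and uses made-up sample records; a Hadoop job would perform the same kind of join in parallel across a cluster rather than in a single process.

```python
def merge_logs_with_metadata(log_records, metadata_rows):
    """Attach metadata to each raw log record that shares its event_id.

    Field names such as "event_id" are hypothetical, chosen for illustration.
    """
    # Index the metadata by key so each log record needs only one lookup.
    meta_by_event = {row["event_id"]: row for row in metadata_rows}

    merged = []
    for record in log_records:
        # Records with no matching metadata are kept unchanged.
        meta = meta_by_event.get(record["event_id"], {})
        merged.append({**meta, **record})
    return merged


# Made-up sample data standing in for the two data sets.
raw_logs = [
    {"event_id": 1, "message": "failed login"},
    {"event_id": 2, "message": "port scan"},
]
event_metadata = [
    {"event_id": 1, "severity": "high"},
]

merged = merge_logs_with_metadata(raw_logs, event_metadata)
for row in merged:
    print(row)
```

Keeping log records that have no matching metadata (a left join) is a design choice made here so that no raw events are silently dropped.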

"It was a real natural fit for where we were at and where we wanted to go," said Scott Russmann, Solutionary's director of software engineering.

Teradata will provide full technical support for its customers using MapR. The company will also offer consulting services to help customers set up their Hadoop distributions, prepare the data for Hadoop analysis, and develop a set of analysis tools.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com