IBM Simplifies the Enterprise Data Stack for Gen AI Era

- As organizations scale agents and other advanced AI applications, IBM is connecting them to essential unstructured data

- New watsonx.data integration and watsonx.data intelligence enable data fabric for enterprise Gen AI and complement watsonx.data, IBM’s hybrid, open data lakehouse

- Achieve 40% more accurate AI than conventional RAG with watsonx.data, according to IBM internal testing

This month IBM is radically simplifying the enterprise data stack, introducing software that unifies, governs, and activates the unstructured enterprise data necessary to power AI agents and other advanced AI applications.

The two new products include IBM watsonx.data integration and IBM watsonx.data intelligence. Select capabilities from both products will also be available through watsonx.data, IBM’s hybrid, open data lakehouse for managing the entire data-for-AI lifecycle in a single experience.

The new software is hybrid and open, connecting with third-party data stacks to provide flexibility and interoperability and to drive innovation from across the ecosystem. The new products can enable 40% more accurate AI than conventional RAG, according to testing with watsonx.data.[1]

IBM client Lockheed Martin recently leveraged the transformed watsonx.data, enabling 70,000 engineers, scientists, and technicians to retrieve answers and information from millions of documents using natural language. “We are rapidly accelerating our innovation and efficiency, to get solutions out of the lab and into the field, helping create a safer, more secure world,” says John Clark, senior vice president of Technology and Strategic Innovation at Lockheed.

The context

Businesses need generative AI – and increasingly, agentic AI – to drive innovation, unlock productivity, and remain competitive. And generative AI needs enterprise-specific data to be accurate and performant. According to IBM’s new CEO study, 72% of business leaders view their organization’s proprietary data as key to unlocking the value of generative AI.

But this coveted data is often unstructured and difficult to harness, trapped inside emails, PDFs, presentations, and videos. Conventional RAG cannot handle the scale and complexity of unstructured data or properly combine it with structured data. Meanwhile, a range of fragmented tools makes the data stack complex and cumbersome.

As a result, enterprises’ unstructured data – which can constitute up to 90% of their total data, according to IDC – is largely underutilized and not represented in their AI agents and other generative AI applications.

The details

Watsonx.data integration introduces a new, unified data integration control plane designed to scale the delivery of AI-ready data. Data engineers can bridge across low-code, code-first, and agentic tools, allowing different authoring entry points. The software orchestrates data movement across diverse integration styles and features bulk and batch ETL/ELT, real-time streaming, data replication, and data observability capabilities across structured and unstructured data. Because the software is built for flexibility and adaptability, data teams no longer need to navigate fragmented tooling or incur new technical debt with every shift in data storage paradigm, which helps future-proof their data infrastructure.

Watsonx.data integration is available as a standalone product beginning June 11. Its unstructured data integration and observability capabilities can also be leveraged through watsonx.data.

Watsonx.data intelligence transforms how organizations curate, manage, and utilize data, leveraging the power of AI to simplify data delivery across hybrid ecosystems. The software unifies data governance, quality, lineage, and sharing, empowering organizations to discover, trust, and access meaningful data.

Watsonx.data intelligence is available as a standalone product beginning June 11. Its capabilities can also be leveraged through watsonx.data for data under management in the lakehouse.

Additional data innovations

Following its acquisition of DataStax, IBM will continue to integrate DataStax’s tools and technologies into watsonx.data. These include Astra DB and Hyper-Converged Database, which provide NoSQL and vector database capabilities powered by the open-source Apache Cassandra® and will be available June 11.

In June, IBM will debut watsonx BI, an AI analytics agent that rewires how teams engage with data and harnesses its power to deliver exceptional business intelligence — using natural language. The agent can answer marketing, sales, operations, finance, and other domain questions in seconds and provides step-by-step explanations of its reasoning. Watsonx BI will be available as a standalone product and will also be available through watsonx.data.

IBM is integrating Gluten Accelerated Spark into watsonx.data to enhance performance of compute-intensive Spark SQL workloads. The technology can help accelerate query processing and enhance resource efficiency for large-scale data analytics, at a time when enterprise data volume is growing fast.

IBM also recently announced the addition of watsonx as an API provider within Meta’s Llama Stack, enhancing enterprises’ ability to deploy generative AI at scale and with openness at the core. Watsonx.data’s Milvus database is already part of the Llama Stack framework, and this integration enables additional unstructured and structured data management and agentic retrieval.

IBM recently unveiled new agentic AI tools for the enterprise, including pre-built domain-specific agents, watsonx Orchestrate Agent Builder, and agentic AI governance capabilities. Paired with today’s robust data offerings, organizations will have the data and tools necessary to successfully deploy agentic AI at scale.

[1] Based on internal testing comparing the accuracy of AI model outputs using watsonx.data retrieval layer to vector-only RAG on three common document sets, using the same open source commodity inferencing, judging and embedding models and additional variables.

https://newsroom.ibm.com/blog-ibm-simplifies-the-enterprise-data-stack-for-gen-ai-era