The excitement around generative AI (gen AI) and its massive potential value has energized organizations to rethink their approaches to business itself. Organizations are looking to seize a range of opportunities, from creating new medicines to enabling intelligent agents that run entire processes to increasing productivity for all workers. A raft of new risks and considerations, of course, go hand in hand with these developments. At the center of it all is data. Without access to good and relevant data, this new world of possibilities and value will remain out of reach.
Building on our interactive “The data-driven enterprise of 2025,” this article is intended to help executives think through seven essential priorities that reflect the most important shifts under way, the main complexities those shifts create, and where leaders can focus their energy to realize the data-driven enterprise of 2030.
Everything, everywhere, all at once
By 2030, many companies will be approaching “data ubiquity.” Not only will employees have the latest data at their fingertips, as we highlighted in “The data-driven enterprise of 2025,” but data will also be embedded in systems, processes, channels, interactions, and decision points that drive automated actions (with sufficient human oversight).
Quantum-sensing technologies, for example, will generate more precise, real-time data on the performance of products from cars to medical devices, which applied-AI capabilities will be able to analyze to then recommend and make targeted software updates. Gen AI agents informed by detailed historical customer data will interact with digital twins of those same customers to test personalized products, services, and offers before they are rolled out to the real world. Clusters of large language models (LLMs) working together will analyze individual health data to derive, develop, and deploy personalized medicines.
Some companies are already embracing this vision, but in many organizations, few people understand what data they really need to make better decisions or how data can enable better outcomes.
Essential actions for data leaders
Enabling these visions of advanced technologies requires the data leader to activate the organization so it thinks and acts “data and AI first” when making any decision. That means making data easy to use (by creating standards and tools for users and systems to easily access the right data), easy to track (by providing transparency into models so users can check answers and automated outcomes), and easy to trust (by protecting data with advanced cyber measures and continually testing it to maintain high accuracy).
Data leaders will need to adopt an “everything, everywhere, all at once” mindset to ensure that data across the enterprise can be appropriately shared and used. That includes, for example, clearly defining and communicating data structures (that is, data hierarchies and fields) so teams understand the standards needed for a given data set and establishing clear business rules (such as naming conventions or types of data that are acceptable to collect), which will need to be revisited frequently as models, regulations, and business goals evolve.
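As a simple illustration of what making data easy to use can look like in practice, a data standard might be codified as a machine-readable contract that any team or system can check records against. The sketch below assumes Python; the field names, naming convention, and rules are hypothetical placeholders rather than a prescribed schema.

```python
# Minimal sketch of a machine-readable "data contract" encoding the kinds of
# standards described above: field definitions, a naming convention, and basic
# business rules. All field names and rules here are hypothetical illustrations.
from dataclasses import dataclass
import re

@dataclass
class FieldSpec:
    name: str        # must follow the agreed naming convention (snake_case)
    dtype: type      # expected Python type after ingestion
    nullable: bool   # whether missing values are acceptable

CUSTOMER_CONTRACT = [
    FieldSpec("customer_id", str, nullable=False),
    FieldSpec("signup_date", str, nullable=False),    # ISO 8601 date string
    FieldSpec("lifetime_value", float, nullable=True),
]

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9_]*$")

def validate_record(record: dict) -> list[str]:
    """Return a list of contract violations for one record (empty list means clean)."""
    issues = []
    for spec in CUSTOMER_CONTRACT:
        if not SNAKE_CASE.match(spec.name):
            issues.append(f"field name '{spec.name}' breaks the naming convention")
        value = record.get(spec.name)
        if value is None:
            if not spec.nullable:
                issues.append(f"missing required field '{spec.name}'")
        elif not isinstance(value, spec.dtype):
            issues.append(f"'{spec.name}' should be of type {spec.dtype.__name__}")
    return issues

print(validate_record({"customer_id": "C-123", "signup_date": "2030-01-15"}))
```

In practice, contracts like this would live alongside the data products themselves and be revisited as models, regulations, and business goals evolve.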

Unlocking ‘alpha’
Two core characteristics of many recent technologies—for example, gen AI, low code and no code, and small language models (SLMs)—are how easy they are to use and how rapidly they have proliferated. Vendors, for instance, are integrating gen AI into their offerings; start-ups are quickly rolling out new tools and models; and large swaths of people are using gen AI to help with their work. Sixty-five percent of respondents to a recent McKinsey survey say their organizations are regularly using gen AI in at least one business function, up from a third last year.1
The problem with this mass adoption is that many organizations are using the same tools or developing similar capabilities, which means they’re not creating much competitive advantage. It’s as if everyone chose to use the same bricks to build a house that looks just like the one next door. The value, however, comes not just from the bricks themselves but also from how they are put together—the vision and design for assembling those bricks into a home that people will want to buy.
Essential actions for data leaders
To unlock “alpha” (a term investors use for obtaining returns above benchmark levels) with gen AI and other technologies, data leaders need to have a clear focus on data strategies that can deliver competitive advantage, such as the following:
- Customizing models using proprietary data. The power of LLMs and SLMs comes from a company’s ability to train them on its own proprietary data sets and tailor them through targeted prompt engineering.
- Integrating data, AI, and systems. Value is increasingly coming from how well companies combine and integrate data and technologies. Integrating gen AI and applied-AI use cases, for example, can create differentiating capabilities, such as using AI to develop predictive models from user behavior data and feeding those insights to gen AI models to generate personalized content (see the sketch after this list).
- Doubling down on high-value data products. The lion’s share of the value a company can derive from data will come from about five to 15 data products—treated and packaged data that systems and users can easily consume.
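A minimal sketch of that second strategy, combining an applied-AI propensity model with a gen AI prompt, might look like the following in Python; the features, model choice, and prompt wording are illustrative assumptions rather than a recommended design.

```python
# Minimal sketch: a conventional propensity model scores a customer, and its output
# is injected into a gen AI prompt that drafts a personalized offer. The features,
# training data, and prompt wording are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# 1. Applied AI: a propensity-to-churn model trained on proprietary behavioral data.
X_train = np.array([[3, 120.0], [1, 40.0], [8, 310.0], [0, 15.0]])  # [visits, spend]
y_train = np.array([0, 1, 0, 1])                                    # 1 = churned
propensity_model = LogisticRegression().fit(X_train, y_train)

def churn_risk(visits: int, spend: float) -> float:
    """Probability that this customer churns, according to the applied-AI model."""
    return float(propensity_model.predict_proba([[visits, spend]])[0, 1])

# 2. Gen AI: turn the prediction into a prompt for whichever LLM the stack uses.
def build_offer_prompt(customer_id: str, visits: int, spend: float) -> str:
    risk = churn_risk(visits, spend)
    return (
        f"Customer {customer_id} has an estimated churn risk of {risk:.0%}. "
        f"They made {visits} visits and spent ${spend:.2f} last quarter. "
        "Draft a short, friendly retention offer tailored to this profile."
    )

print(build_offer_prompt("C-123", visits=2, spend=55.0))
```

The differentiation here comes less from either model on its own than from the proprietary behavioral data and the way the two are wired together.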
Capability pathways: From reacting to scaling
The ease of use of many basic tools and their increasing availability have generated a proliferation of often-disconnected use cases, pilots, and features. The enthusiasm around gen AI in particular means that data leaders no longer have to push the value of data on their colleagues. Instead, they’re struggling to manage the “pull.” This results in two issues: first, teams across the enterprise launch proof-of-concept models and AI-based applications that have no chance of scaling (“pilot purgatory”), and second, various stakeholders invest in heterogeneous use cases that each draw on wide-ranging modules of the data and AI stack, requiring entire architectures to be built at once before value can be realized.
To enable the scale required to operate data-driven businesses in 2030, data leaders will need an approach that accelerates how use cases provide impact while solving for scale through an architecture that can support the enterprise. To achieve this, data leaders need to build “capability pathways,” which are clustered technology components that enable capabilities that can be used for multiple use cases (Exhibit 1).
How to develop and sustain capability pathways depends in part on thinking through critical data-architecture choices. The choices generally break down between a centralized approach, with a carefully managed data lakehouse, for instance; a decentralized approach, whereby local business units have full ownership over their data; and a federated approach that might use a data mesh.
A decentralized approach will make it difficult to create capability pathways that can be used across the enterprise, while a more centralized approach requires additional investment in governance and oversight capabilities. The choice of hyperscaler (that is, a cloud service provider), with its set of embedded tools and capabilities, will also influence how to develop capability pathways.
An automotive company wanted to create capabilities to offer a range of personalized services and communications with its customers. To meet this need, it decided to develop two capability pathways.
The first one was an AI and machine learning capability pathway to perform deep analysis and segmentation of the company’s customers. To build this pathway, the company pulled together a number of elements, including a PySpark machine learning library (for clustering and propensity analysis), Databricks for file storage, and Futurescope for model management using MLflow. The other capability pathway, for personalized communication, was made up of LLMs, a sales data warehouse, marketing technologies to send and track email performance, a customer-360 data set, and external data from Experian on customer interests and demographics, among other technical elements.
With these capability pathways, the company was able to segment customers into highly refined archetypes, send them personal offers, provide personalized prompts to service operations to follow up with customers, and deliver personalized behavioral information for sales staff.
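A minimal sketch of the segmentation step in the first pathway, assuming a Spark ML clustering model tracked with MLflow, might look like the following; the features, cluster count, and model choice are illustrative, not the company’s actual configuration.

```python
# Minimal sketch: cluster customers into archetypes with Spark ML and log the model
# with MLflow. Feature names and the cluster count are hypothetical illustrations.
import mlflow
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("customer-segmentation").getOrCreate()

# Hypothetical behavioral features per customer.
df = spark.createDataFrame(
    [("C-1", 3, 120.0), ("C-2", 1, 40.0), ("C-3", 8, 310.0), ("C-4", 0, 15.0)],
    ["customer_id", "visits", "spend"],
)

# Assemble the raw columns into the single feature vector Spark ML expects.
assembler = VectorAssembler(inputCols=["visits", "spend"], outputCol="features")
features = assembler.transform(df)

with mlflow.start_run(run_name="customer-archetypes"):
    kmeans = KMeans(k=3, seed=42, featuresCol="features", predictionCol="archetype")
    model = kmeans.fit(features)
    mlflow.log_param("k", 3)
    mlflow.spark.log_model(model, "segmentation-model")  # needs the MLflow Spark flavor installed
    model.transform(features).select("customer_id", "archetype").show()
```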
Living in an unstructured world
For decades now, companies have been working with structured data (for instance, SKUs, product specifications, transactions and balances organized by master and reference data). That’s just 10 percent of the data available, however. Gen AI has opened up the other 90 percent of data, which is unstructured (for example, videos, pictures, chats, emails, and product reviews).
This windfall of data can greatly enrich companies’ capabilities, especially when combined or integrated with other data sources. Examples might include using reviews, social media posts, and purchase history to enable gen AI agents to create highly personalized customer offers or analyzing contracts and terms from past business dealings so gen AI agents can manage vendor negotiations, onboarding, fulfillment, and contract updates.
But the scale and variety of unstructured data make it a geometrically more complex challenge. By definition, unstructured data is less consistent, less available, and harder to prepare and cleanse, and those difficulties only grow with the sheer volume involved. As an analogy, it’s like putting in the effort to develop and manage the pipelines and systems for drinking water and suddenly being tasked with managing an ocean. And with data volumes expected to increase by more than ten times from 2020 to 2030, this issue is not going to get easier anytime soon.2
Essential actions for data leaders
Creating value from unstructured data is a much bigger and more time-intensive effort than many realize. Significant challenges include cleansing and tagging requirements, privacy and bias concerns, skyrocketing cloud storage and networking costs, and often expensive conversion processes. Data leaders will need to invest in building new capabilities such as natural-language processing to help convert the unstructured data so that LLMs can “understand” and use it, as well as in testing and recalibrating LLMs continually as models and corresponding data sources are updated.
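A minimal sketch of one such conversion step, chunking raw text and embedding it so that the most relevant passages can be retrieved for an LLM, appears below; the embedding model and chunk size are assumptions, and a production pipeline would add cleansing, tagging, and access controls.

```python
# Minimal sketch: chunk unstructured text, embed the chunks, and retrieve the most
# relevant ones for an LLM. The embedding model and chunk size are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, max_words: int = 50) -> list[str]:
    """Naive fixed-size chunking; real pipelines usually split on document structure."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

documents = [
    "The warranty covers battery defects for eight years or 100,000 miles.",
    "Customers reported that the latest software update fixed the navigation lag.",
]
chunks = [c for doc in documents for c in chunk(doc)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vectors = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Return the chunks most similar to the query (cosine similarity via dot product)."""
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vectors @ query_vector
    return [chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

print(retrieve("Does the warranty include the battery?"))
```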
Crucially, data leaders will need to stay focused on “unlocking alpha” in managing the flood of unstructured data. That means investing the time to map which parts of the unstructured data are needed to advance business priorities and power critical data products.
Data leadership: It takes a village
The ability of companies to achieve their data and AI vision by 2030 will rely substantially on leadership. To date, the story on this score has been a bit of a mixed bag. Only half of chief data and analytics officers, for example, feel they are able to drive innovation using data.3 Even high-performing companies struggle.4 Seventy percent of these organizations report difficulties, for instance, in developing processes for data governance and integrating data into AI models quickly.5
This issue often comes down to unclear responsibilities, narrow skill sets, or disconnected governance. In some cases, data leaders are focused on risk but are disconnected from the business leaders who need to use data to generate revenue. In others, leaders have a clear mandate to accelerate value creation within specific business areas but with limited enterprise perspective, resulting in siloed capabilities and subscale solutions.
Essential actions for data leaders
To get on the right track, companies need to find leaders who are skilled in three major areas:
- governance and compliance, with a heavy focus on defensive activities (driven primarily by regulation and cyber risk); these types of leaders are found primarily in high-compliance industries or those with high information value
- engineering and architecture, with a focus on technical design and looking at every problem as an engineering opportunity to automate, reuse, and scale
- business value, with a focus on generating revenue, growth, and efficiency from data; these leaders often work closely with the business
It is rare to find a single person with the skills, mindset, and experience to cover all three roles. Empowered data leaders, however, can fill out their teams with people who have the right mix of skills, or organizations can create an operating committee representing each capability area. Whichever model is chosen, it will require explicit sponsorship from the top, discussions with broader leadership on roles and responsibilities, shared accountability, and common incentives to solve for all three disciplines.
The new talent life cycle
The talent profiles of organizations will likely look very different in 2030. Gen AI and automation technologies are already starting to take over basic analytical and process tasks, such as code generation, document creation, and data classification and synthesis. Over time, we can expect gen AI and other technologies to handle more sophisticated tasks, such as lineage production and data product development, while the supply of talent shifts and new jobs emerge.
Essential actions for data leaders
These shifts in the way work is done require data and AI leaders to develop a clear view of what new skills are needed. Some of these new skills will be absorbed into existing roles, while others will require completely new roles (Exhibit 2). Data engineers, for example, will need to develop a new range of skills, such as database performance tuning, data design, DataOps (which combines DevOps, data engineering, and data science), and vector database development. New roles might include prompt engineers, AI ethics stewards, and unstructured-data specialists.
This skills shift will require data leaders to work with HR leadership to rethink how to find and train people for the skills they need. Companies, for instance, will need to develop both apprenticeship programs, in which senior data experts dedicate time to training talent, and learning programs built around discrete skills modules.
In the drive to upskill talent, data leaders must not forget culture. McKinsey analysis shows that gen AI developers and heavy users care most about reliable and supportive people, as well as caring and inspiring leaders: roughly two in five say that meaningful work and an inclusive community are core motivators, even above flexibility.6
Guardians of digital trust
Risk has become a far greater concern with the rise of advanced technologies, most notably AI and gen AI. Governments are moving quickly to roll out new regulations, and companies are evaluating new policies.
Some of the issues have been well known, such as hallucinations (that is, gen AI models providing inaccurate answers), bias, intellectual property rights, and data privacy. But since these technologies are so new and evolving quickly, the broader risk landscape is often not well understood. Three types of risk stand out:
- New types of attacks. The power of gen AI to learn and evolve quickly is opening the door to completely new types of attacks, including self-evolving malware that learns internal systems and evolves to breach defenses, intelligent bots that can increasingly mimic humans, and infected data that is inserted into model training.
- Broadening landscape for risk. The broad interconnections between AI and data systems—both within and outside of enterprises—have created a significantly greater area for damage to be done.
- New ‘unknowns.’ As interacting with AI becomes more conversational and less about just searching for facts, companies will enter a much more ambiguous zone defined by varying value systems. And with the proliferation of gen AI agents essentially “talking” with each other, completely new categories of risk will likely emerge.
Essential actions for data leaders
In addition to keeping abreast of these emerging risk types, data leaders will need to rethink their approaches to risk. Many still rely too much on traditional data quality and compliance approaches, while few have started to implement advanced coding and ethics testing. This reevaluation should be underpinned by the understanding that risk management is a competitive advantage, achieved either by building a brand that is a safe custodian of customer livelihoods or by simply avoiding the failures that competitors might face. That view should drive a more proactive posture toward addressing risks than simply hitting compliance benchmarks.
Data leaders (and tech leaders more broadly) can keep up with the scale of cyber issues by implementing AI (and eventually quantum) capabilities, such as “adversarial” LLMs to test LLM-generated emails for inappropriate or illegal content, and fairness tool kits to test for bias.
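A minimal sketch of the adversarial-review idea, assuming a hosted LLM API (here, the OpenAI client) plays the reviewer, might look like the following; the model name and policy wording are placeholders, and any LLM endpoint could fill this role.

```python
# Minimal sketch: a second, "adversarial" model reviews an LLM-generated email before
# it is sent. The model name and policy wording are placeholder assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REVIEW_POLICY = (
    "You are a compliance reviewer. Flag content that is misleading, discriminatory, "
    "or makes promises the company cannot legally keep. Answer PASS or FAIL with a reason."
)

def adversarial_review(email_draft: str) -> str:
    """Ask a separate model to challenge the draft before it reaches a customer."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": REVIEW_POLICY},
            {"role": "user", "content": email_draft},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

draft = "Act now! This investment is guaranteed to double your money in 30 days."
print(adversarial_review(draft))
```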
While tools developed by third parties can be helpful, advanced AI security shouldn’t be farmed out. Data leaders need to be mindful about building up their own capabilities to keep up with the pace of the market.