Neural Processing Units (NPUs) and the Evolving AI Hardware Market: Will NVIDIA Stock Follow Cisco History?

Prepared by OpenAI Deep Research under the direction of Dr. Biplab Pal (bpal1@umbc.edu)

Introduction
The surge of artificial intelligence (AI) has sparked a race for faster and more efficient
hardware. Graphics Processing Units (GPUs) – particularly those from NVIDIA – have so far
dominated AI workloads, fueling NVIDIA’s rapid growth and lofty market valuation. Now, a
new class of specialized chips called Neural Processing Units (NPUs) is emerging,
promising superior efficiency for AI tasks. This analysis examines whether NVIDIA’s current
dominance might follow the same trajectory as Cisco’s rise and decline during the internet
boom. We compare Cisco’s historical arc to NVIDIA’s situation today, explore the
technological differences between GPUs and NPUs, survey the NPU market and key
players, and assess the financial implications for NVIDIA and its competitors.

[First, GPUs were never originally designed for AI; they dominated the AI market only because
there was no viable alternative like NPUs. However, building AI cloud infrastructure with GPUs is
highly inefficient due to their massive water cooling requirements and enormous electricity
costs. NPUs are set to revolutionize this space by reducing electricity consumption by over 90%.
Already, several major companies are pausing their Blackwell orders due to overheating issues.

Second, while NVIDIA has thrived in the PC and gaming markets, NPUs are poised to outperform
GPUs in both power efficiency and computational capability. As NPUs gain traction, they will
likely replace GPUs in these domains as well.

Third, the pursuit of generalized large language models (LLMs) is becoming increasingly
impractical. The industry is shifting toward Mixture of Experts (MoE) models with specialized
vertical expertise and enhanced reinforcement learning. These architectures require far fewer
processors than traditional LLMs, reducing the demand for massive computational
infrastructure.
Given these trends, NVIDIA's current growth trajectory may closely mirror Cisco's historical
decline.]
Historical Market Comparison: Cisco’s Rise and
NVIDIA’s Dominance
Cisco's Rise and Fall: In the 1990s, Cisco Systems was at the forefront of the internet
revolution. As businesses and consumers rushed online, demand for Cisco's networking
equipment exploded. Between 1995 and 2000, Cisco's revenue surged by 850% (from about
$2 billion to $19 billion), and its stock price grew an astonishing 3,800% over the same period,
from roughly $2 to $79 per share (hardingloevner.com). By March 2000, at the peak of the
dot-com bubble, Cisco was the world's most valuable company, with a market capitalization
over $500 billion. The dot-com crash brought a swift reversal: Cisco's shares plunged 88% from
their peak, falling to about $9.50 by 2002. Notably, this stock collapse was not caused by an
immediate deterioration in Cisco's business (revenues stayed around $19–22 billion during
2000–2002) but rather by a deflation of hype and valuation multiples (hardingloevner.com). In
the two decades that followed, Cisco continued to grow its top line (annual sales reached
nearly $52 billion by 2022, more than double its 2000 revenue), yet the stock never again
reached its 2000 highs and took 20 years to fully recover on a total-return basis
(hardingloevner.com). Cisco transitioned into a mature, steady company but lost the
"unstoppable" aura it once had. Analysts have pointed to factors like market saturation,
increased competition, and a shift in strategy post-2001 (with Cisco spending massive cash on
stock buybacks instead of aggressively innovating in emerging areas like 4G/5G wireless) as
reasons for its slowed momentum. In short, Cisco's story became a cautionary tale of how even
a market leader can stagnate once the tech landscape shifts and exuberant valuations fade.
Parallels with NVIDIA: NVIDIA in the late 2010s and early 2020s has drawn comparisons to
Cisco's 1990s dominance. NVIDIA is riding the AI wave much as Cisco rode the internet wave.
Over the last several years, NVIDIA's growth has been dramatic – annual revenue more than
doubled from 2020 to 2023, fueled by demand for its GPUs in AI data centers. In its fiscal third
quarter reported in late 2023, NVIDIA posted an eye-popping $18.1 billion in revenue (roughly
tripling year-over-year) with about $10 billion in profit, reflecting the voracious demand for AI
hardware. Its stock price soared in parallel, making NVIDIA one of the rare trillion-dollar
companies; the stock climbed over 700% in the five years leading up to late 2023. Like Cisco
at its peak, NVIDIA has enjoyed "nose-bleed" valuation multiples – at one point trading around
118× earnings – based on investors' high hopes for AI. NVIDIA's CEO Jensen Huang has been
as bullish about the future as Cisco's CEO John Chambers once was; Huang recently
proclaimed that nothing in the last 40 years has been as big as the current AI boom, calling it
"bigger than PC, …bigger than mobile, and …bigger than the internet, by far". This exuberance
echoes Chambers' late-90s rhetoric about the transformative power of networking. The
parallels are clear: both Cisco and NVIDIA achieved dominant market positions in
paradigm-shifting tech trends, saw their stock valuations skyrocket, and faced a chorus of
"this will change everything" optimism.

Market Dynamics and Strategic Shifts: In both cases, emerging competitors and technological
shifts lurked behind the scenes. For Cisco, the early 2000s brought competition from firms like
Juniper in core routers and a commoditization of some hardware (with industry-standard
"white box" switches eroding Cisco's premium). Likewise, NVIDIA now faces a wave of
competition from alternative AI chips (NPUs/ASICs) and rival GPU makers. Furthermore, both
companies had to contend with changes in how their technologies were deployed: Cisco had
to adapt to a world moving from purely enterprise hardware to cloud-scale networking and
software-defined solutions, while NVIDIA must navigate a future where cloud providers and
device makers consider designing their own AI chips in-house. A key lesson from Cisco is that
sustaining dominance requires continuous innovation and adaptation even after the initial
boom. Cisco's heavy focus on its existing business and (as some analysts argue) on financial
engineering like stock buybacks left it less agile when new opportunities (e.g., mobile
networking or cloud infrastructure) arose. The question is whether NVIDIA can avoid a similar
plateau by pivoting and expanding its technology as the market evolves.
Key Differences: Despite the parallels, there are important differences between Cisco’s
scenario and NVIDIA’s. Cisco’s dot-com era valuation was largely a speculative bubble –
when it burst, the overall demand for networking gear slowed, and Cisco’s core market
matured. In contrast, AI demand driving NVIDIA’s growth is seen by many as more
sustained and substantive (AI adoption is still in early stages across many industries).
Moreover, NVIDIA has a critical asset Cisco lacked to the same degree: a sticky software
ecosystem. NVIDIA’s CUDA platform and libraries have become the de facto standard for AI
development, creating high switching costs. This software moat means customers are
deeply invested in NVIDIA’s platform, whereas networking hardware in the 2000s was more
interchangeable once standards equalized. Additionally, NVIDIA has been aggressively
investing in new areas (automotive AI, edge AI, networking with its Mellanox acquisition,
etc.) rather than resting on its GPU laurels. These differences could mean NVIDIA’s
trajectory might diverge from Cisco’s, even if short-term market expectations are similarly
high. Still, the cautionary tale remains: Cisco’s experience shows that even a market leader
can fall behind if a disruptive technology shift occurs and if competitors (or customers
themselves) find a better solution. In NVIDIA’s case, that potential disruptor is the rise of
NPUs and custom AI chips.
Technological Analysis: GPUs vs. NPUs in AI
Workloads
To understand the shifting hardware landscape, it’s crucial to examine how Neural
Processing Units (NPUs) differ from GPUs and why NPUs are gaining traction for AI
workloads. Both GPUs and NPUs are processors optimized for parallel computation, but
they have divergent design philosophies tuned to different needs.
Architectural Design and Specialization: GPUs (Graphics Processing Units) were
originally designed to accelerate image rendering and graphics. They work by breaking
down complex image-processing tasks into many smaller operations that can be executed in
parallel (ibm.com). Modern GPUs consist of hundreds or thousands of cores that perform
mathematical operations simultaneously, which turned out to be extremely useful for AI
computations (like the matrix multiplications in neural networks). However, GPUs are still
relatively general-purpose – they retain a lot of circuitry and features (like texture mapping,
shading, and floating-point versatility) that support a broad range of tasks beyond AI. NPUs,
on the other hand, are purpose-built solely for neural network computations. An NPU's design
is inspired by the structure of the brain's neural networks, with hardware modules dedicated
to speeding up the fundamental operations of AI (namely the multiplication and addition of
matrices/vectors) and to maximizing data-flow efficiency via on-chip memory. In practice,
NPUs often consist of massive arrays of simplified processing elements (e.g., MAC units for
multiply-accumulate) accompanied by specialized memory hierarchies that keep data on-chip
as much as possible. By shedding the "excess" functionality that a GPU carries, NPUs can
devote more silicon area and power budget to the few operations that dominate AI workloads.
This single-minded specialization is what gives NPUs an edge in certain performance
metrics.
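
To make this concrete, the following is a minimal Python/NumPy sketch (purely illustrative, not vendor code) of the multiply-accumulate (MAC) pattern that NPU hardware bakes into silicon: low-precision (int8) operands multiplied and summed into wider (int32) accumulators, one step at a time.

```python
import numpy as np

def mac_matmul_int8(activations: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Matrix multiply expressed as explicit multiply-accumulate (MAC) steps.

    Illustrative only: real NPUs implement this pattern in fixed-function
    hardware (e.g., arrays of MAC units), typically on low-precision inputs
    (int8) accumulated into wider registers (int32) to avoid overflow.
    """
    m, k = activations.shape
    k2, n = weights.shape
    assert k == k2, "inner dimensions must match"
    acc = np.zeros((m, n), dtype=np.int32)           # wide accumulators
    for step in range(k):                             # one slice of MACs per step
        a_col = activations[:, step].astype(np.int32)
        w_row = weights[step, :].astype(np.int32)
        acc += np.outer(a_col, w_row)                 # multiply, then accumulate
    return acc

# Toy usage with int8 tensors, as an NPU inference engine might see them
rng = np.random.default_rng(0)
a = rng.integers(-128, 127, size=(4, 8), dtype=np.int8)
w = rng.integers(-128, 127, size=(8, 3), dtype=np.int8)
print(mac_matmul_int8(a, w))
```

An NPU effectively replaces this software loop with thousands of MAC units operating in lockstep, which is where the efficiency advantage described above comes from.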
Parallelism and Performance Efficiency: While both GPUs and NPUs excel at parallel
processing, NPUs take it a step further in a more domain-specific way. GPUs offer excellent
parallel compute capability, but with the trade-off of high power consumption and
general-purpose overhead. NPUs, by focusing on the repetitive, structured computations of
neural networks, can achieve equal or even greater parallel throughput with significantly less
power. In other words, NPUs are designed to squeeze the maximum number of tera-operations
per second out of each watt of power for AI tasks. For example, in deep learning inference
(running a trained model), an NPU may use only a fraction of the energy a GPU would need to
achieve the same throughput, because the NPU wastes no energy on unnecessary functions.
This efficiency advantage is especially pronounced for the short, repetitive calculations
common in neural network layers. In fact, NPUs often outperform GPUs on metrics like
inferences per second per watt. A practical illustration is in mobile devices: smartphone NPUs
can run tasks like image recognition or voice processing in real time without draining the
battery, something that would be impractical with a GPU – as one analysis notes, NPUs can
meet similar performance benchmarks to GPUs while using far less power.
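
The figure of merit behind these comparisons is simply throughput divided by power. The sketch below shows the calculation with hypothetical, illustrative numbers (not measured benchmarks) for a high-power GPU and a low-power NPU.

```python
def inferences_per_second_per_watt(inferences_per_second: float, power_watts: float) -> float:
    """Efficiency metric commonly used to compare AI accelerators."""
    return inferences_per_second / power_watts

# Hypothetical, illustrative figures -- not measured benchmarks.
gpu_eff = inferences_per_second_per_watt(inferences_per_second=20_000, power_watts=400)
npu_eff = inferences_per_second_per_watt(inferences_per_second=15_000, power_watts=60)

print(f"GPU: {gpu_eff:.0f} inf/s/W")              # 50 inf/s/W
print(f"NPU: {npu_eff:.0f} inf/s/W")              # 250 inf/s/W
print(f"NPU advantage: {npu_eff / gpu_eff:.1f}x") # 5.0x in this made-up example
```

Note that the NPU in this example delivers less raw throughput yet wins decisively on efficiency, which is exactly the trade battery-powered and large-scale deployments care about.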
Scalability and Use Cases: GPUs have proven versatility – they are used everywhere from
desktops to large data-center servers. NPUs, being newer, are often used as complements to
GPUs or in specialized contexts rather than as outright replacements (at least so far). In many
AI systems today, a hybrid approach is employed: a GPU might handle general processing and
model training, while an NPU accelerates specific inference workloads or offloads parts of the
computation. For instance, an NPU can be embedded alongside a GPU, shouldering the
repetitive matrix multiplications in a neural net and freeing the GPU to handle other parts of
the application. This is seen in some edge-computing setups where a GPU and NPU work in
tandem – the NPU provides low-latency processing for sensor data (say, identifying objects in
a camera feed), while the GPU is still used for rendering or other parallel tasks. NPUs shine in
low-latency inference and real-time AI processing. By doing AI computations locally
(on-device or at the network edge) instead of sending data to a distant data center, NPUs
enable features like instant face recognition in smartphones or collision avoidance in cars
with minimal delay (ibm.com). This localization also enhances privacy and autonomy, as
sensitive data need not leave the device. In data centers, NPUs (or their cousins, AI ASICs) are
being used to accelerate large-scale workloads like search engine queries or translation – for
example, Google's Tensor Processing Unit (TPU) is an NPU designed to handle massive neural
network computations for services like Google Search and Photos (globenewswire.com).
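
The hybrid deployment described above can be thought of as a simple dispatch policy: send neural-network inference to the NPU when one is present, and fall back to the GPU (or CPU) otherwise. The sketch below is hypothetical; the class and function names are invented for illustration and do not correspond to any vendor SDK, which would supply its own delegate or execution-provider mechanism.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Frame:
    """A camera frame (placeholder for real sensor data)."""
    pixels: bytes

class InferencePipeline:
    """Routes neural-net inference to an NPU when available, else to a GPU/CPU fallback."""

    def __init__(self,
                 gpu_infer: Callable[[Frame], List[str]],
                 npu_infer: Optional[Callable[[Frame], List[str]]] = None):
        self.gpu_infer = gpu_infer
        self.npu_infer = npu_infer

    def run_inference(self, frame: Frame) -> List[str]:
        # Prefer the low-power NPU path for steady-state inference.
        if self.npu_infer is not None:
            return self.npu_infer(frame)
        # Fall back to the general-purpose GPU (or CPU) path.
        return self.gpu_infer(frame)

# Usage: the GPU stays free for rendering or other parallel work
# while the NPU handles object detection on each camera frame.
pipeline = InferencePipeline(
    gpu_infer=lambda frame: ["pedestrian", "car"],   # stand-in for a GPU-run model
    npu_infer=lambda frame: ["pedestrian", "car"],   # stand-in for an NPU-run model
)
print(pipeline.run_inference(Frame(pixels=b"\x00" * 16)))
```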
GPUs' Versatility vs. NPUs' Efficiency: The fundamental trade-off between GPUs and
NPUs is versatility versus efficiency. A GPU is a flexible workhorse; developers have
decades of tooling and experience (especially with NVIDIA's CUDA software stack) for
programming GPUs for any kind of parallel task. Indeed, NVIDIA's CUDA ecosystem is a major
asset – it allows relatively easy programming of GPUs for AI and is widely adopted in industry
and academia (ibm.com). NPUs, by contrast, often come with proprietary or less mature
software support, making them harder to program and less accessible outside of specific
partners or products. For example, Google's TPU chips are powerful but are available mainly
through Google Cloud, not as off-the-shelf products (ibm.com). This means that, today, GPUs
still enjoy a lead in developer mindshare and ease of use. However, when it comes to raw
performance on neural network tasks, NPUs can surpass GPUs because they eliminate
general-purpose baggage. As one industry source puts it, NPUs are so tailored to AI
computations that they "surpass GPUs in handling the most complex [AI] workloads like deep
learning inference and training" (blog.purestorage.com) – an assertion supported by the
success of chips like Google's TPU in training large language models and the use of NPUs in
top-performing AI supercomputers. Another analysis notes that GPUs, due to their
general-purpose nature, can struggle to compete with NPUs in specific areas such as
processing very large language models or performing AI tasks in tiny edge devices
(blog.purestorage.com). In summary, GPUs offer broad applicability and a mature ecosystem,
whereas NPUs offer superior efficiency and potentially higher performance per dollar for
targeted AI workloads. This is not an either/or choice in many cases – the two technologies are
often used together. But as AI usage scales up, the incentive to use more NPUs (to save on
power and cost) grows, which is exactly why NPUs are gaining ground.
Market Landscape: The Rise of NPUs and Key Players
The ecosystem for AI hardware is rapidly expanding beyond GPUs. A few years ago, if you
needed to train a neural network or deploy AI at scale, NVIDIA GPUs were almost the only
game in town. Today, dozens of companies and research groups are developing Neural
Processing Units or similar AI accelerators. This shift is driven by the pursuit of better
efficiency, lower cost at scale, and independence from a single supplier. Below we outline
the key players in the NPU/AI accelerator space and the trends in adoption:
● Google (Alphabet) – Tensor Processing Unit (TPU): Google was one of the first big
movers to design a custom AI chip. Starting in 2016, Google's TPUs have been used
internally to accelerate search ranking, language translation, image recognition, and
more. Google's latest Cloud TPU v5p pods contain 8,960 TPU chips each,
engineered to train enormous models like GPT-style large language models
(techtarget.com). Google has reported that its TPU-based infrastructure can handle
over 1 billion photo analyses per day for services like Google Photos
(globenewswire.com). By using TPUs in its data centers, Google reduced its need for
NVIDIA GPUs for certain tasks, illustrating how a well-designed NPU can take over
large AI workloads efficiently. Google does not sell TPU chips directly but offers their
capability through Google Cloud, which points to a broader trend: these custom NPUs
tend to be kept as in-house advantages or offered as cloud services rather than sold
as commercial products.
● Amazon Web Services (AWS) – Trainium and Inferentia: Amazon, via its
Annapurna Labs acquisition, developed the Trainium chip for AI training and the
Inferentia chip for AI inference. AWS's motive is to reduce dependence on NVIDIA
for its cloud offerings. AWS now offers EC2 Trn1 instances built around Trainium
accelerators, specifically to train deep learning and generative AI models
(techtarget.com). A single AWS Trn1 instance can include 16 Trainium NPU chips
working in parallel. For deploying models, AWS uses Inferentia, which provides
high-throughput inference at low cost (techtarget.com). By offering these to cloud
customers, Amazon not only saves cost internally but is actively enticing AI
workloads to run on its custom silicon. This reflects a broader adoption trend: cloud
providers are increasingly offering GPU alternatives to customers. Amazon's move is
paying off in certain segments where customers optimize for price-performance; for
example, AWS claims substantial cost savings for inference using Inferentia versus
GPU instances.
● Meta (Facebook) – Meta has also embarked on custom AI silicon projects to support
its massive AI deployments (such as content filtering, recommendation algorithms,
and metaverse initiatives). While details are less public, reports indicate Meta is
developing in-house chips for both inference and training. Notably, Meta, along with
other hyperscalers, is motivated by the "massive financial burden" of acquiring
thousands of high-end NVIDIA GPUs (lightreading.com). By investing in proprietary
NPUs, Meta hopes to tailor hardware to its needs and avoid being bottlenecked by
NVIDIA's supply or pricing. In 2024, it was reported that Meta, Google, Amazon, and
Microsoft all have strategic programs to forge ahead with in-house semiconductor
solutions to reduce their NVIDIA dependency (lightreading.com). This concerted
effort by multiple giants underscores a major market shift: the biggest buyers of AI
hardware are becoming competitors in chip design.
● Microsoft – Microsoft's Azure cloud has historically relied on a mix of NVIDIA GPUs
and even FPGA accelerators (Project Brainwave, using Intel FPGAs) for AI. Microsoft
is reportedly working on its own AI chip as well (often codenamed "Athena" in press
reports). Additionally, Microsoft has invested in OpenAI and is likely motivated to have
more control over the hardware that powers services like Azure OpenAI and Bing's
AI features. Like others, Microsoft aims to curb costs and ensure supply by
developing custom AI accelerators (lightreading.com). It has also collaborated with
AMD on potential alternative AI chip solutions and invested in partnerships (e.g., with
the Abu Dhabi AI group G42) to diversify its sources (lightreading.com). All these
moves indicate Microsoft's desire not to rely solely on NVIDIA over the long term.
● Apple – Apple pioneered on-device NPUs for mobile with its Apple Neural Engine
in iPhones and Macs. Now Apple appears to be setting its sights on data-center AI
hardware as well. Apple does not use NVIDIA in its products (it uses its own GPUs in
devices and rents NVIDIA GPUs via the cloud for its AI research) (appleinsider.com).
However, reports show Apple is ramping up R&D on server-class AI chips, even
partnering with Broadcom to design a custom AI processor for Siri and other cloud
services (appleinsider.com). This would be a significant development – Apple's entry
would add another powerful player in the NPU arena, and it underscores that even
the world's largest company sees value in owning AI chip design. Apple's motivation
comes, in part, from control and efficiency: it succeeded in replacing Intel CPUs with
in-house chips for Macs, and it could aim to do something analogous to replace its
reliance on third-party GPUs for AI tasks (appleinsider.com).
● Specialized Startups (Graphcore, Cerebras, SambaNova, Tenstorrent, etc.): A
vibrant startup scene is tackling AI acceleration with novel approaches:
○ Graphcore (UK) offers the Intelligence Processing Unit (IPU), an NPU
designed for high parallelism and memory-on-chip to excel at neural nets.
Graphcore has attracted large investments and had early partnership talks
with Microsoft.
○ Cerebras Systems took a unique route by building wafer-scale NPUs – its
latest Wafer-Scale Engine (WSE-3) is effectively an entire silicon wafer acting
as one giant chip with roughly 900,000 cores. It boasts extraordinary specs:
compared to NVIDIA's flagship H100 GPU, the Cerebras WSE-3 has 52× more
cores, 880× more on-chip memory, and 7,000× greater memory bandwidth –
albeit on a single huge chip that is 57× larger in area than a GPU
(techtarget.com). Cerebras is carving out a niche for ultra-large model training
that runs on one chip instead of thousands of GPUs, which can simplify
scaling.
○ SambaNova focuses on reconfigurable dataflow architecture, effectively
allowing the chip to adapt its circuits to the structure of a neural network
graph for efficiency.
○ Tenstorrent (led by famed chip architect Jim Keller) is designing modular AI
chips and licensing IP cores, aiming to challenge incumbents both in the data
center and at the edge with RISC-V-based designs (techtarget.com).
○ Ambient Scientific (https://www.ambientscientific.ai/) deserves mention as a
unique NPU startup – to the author's knowledge the only one in the world
whose NPU performs matrix convolution in analog circuitry, yielding a large
energy saving over all other NPU architectures. The company is currently
raising funds, and interested parties can contact it directly.

● These startups often target specific advantages – be it speed, energy efficiency, or
ease of scaling – to compete with NVIDIA. Some have reported impressive gains on
certain benchmarks. For example, Habana Labs, acquired by Intel, produced the
Gaudi series of AI chips; the latest Gaudi 3 is claimed to train AI models 1.5× faster
than NVIDIA's high-end H100 GPU while consuming less power (techtarget.com). If
such claims hold in real-world use, they put serious competitive pressure on NVIDIA
in the cloud training market.
● Edge and IoT Chipmakers (Qualcomm, MediaTek, Huawei, etc.): Outside of big
data centers, NPUs are already commonplace in smaller devices. Qualcomm's
Snapdragon mobile processors include a Hexagon "AI engine" (an NPU/DSP) to
accelerate camera AI features, voice assistants, and AR applications on phones.
MediaTek and Samsung have similar NPU modules in their smartphone chips. By
2023, over 90% of smartphones offered AI features, and virtually all premium
smartphones had dedicated NPUs to handle tasks like facial recognition, image
enhancement, and augmented reality (globenewswire.com). This statistic highlights
that NPUs have essentially become standard for on-device AI due to their efficiency.
In the automotive sector, Tesla famously developed its own FSD (Full Self-Driving)
computer chip – essentially an NPU – that performs 72 trillion operations per second
to power Autopilot and FSD features (globenewswire.com). Tesla's in-house chip
replaced an NVIDIA GPU-based solution in its cars, achieving better performance
per watt for neural network inference in driving scenarios. Similarly, many new cars
from other automakers use NPUs (whether NVIDIA's Drive chips or competitors like
Mobileye's EyeQ) for ADAS (advanced driver-assistance systems). The edge NPU
trend shows how specialized hardware is preferred when energy and cost are
constrained.

Adoption Trends: The market is clearly moving toward diversification of AI hardware.
Hyperscale data-center owners (the Googles and Amazons) are vertically integrating,
building their own chips to optimize AI workloads and control costs. A GlobalData analysis in
mid-2024 noted that Meta, Microsoft, Google, and Amazon were all investing heavily in
proprietary semiconductor development to reduce "soaring" spending on NVIDIA GPUs for
generative AI – a strategic move to cut dependency on NVIDIA (lightreading.com). The
financial logic is straightforward: NVIDIA's top-tier AI GPUs (like the A100 and H100) are very
expensive, often thousands to tens of thousands of dollars each, and guzzle significant power.
Companies running tens of thousands of such units have a huge incentive to develop
alternatives that are more cost-efficient over the long run. We are already seeing results:
Amazon's latest AI cloud instances using Trainium reportedly offer up to 50% better
price-performance for certain training jobs compared to GPU-based instances, attracting
cost-conscious customers.
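
As a rough illustration of what a claim like "50% better price-performance" means in dollar terms, the calculation below uses hypothetical per-hour prices and training times; these are illustrative numbers, not actual AWS or NVIDIA pricing.

```python
def training_cost(instance_price_per_hour: float, hours_to_train: float) -> float:
    """Total cost of one training run on a given instance type."""
    return instance_price_per_hour * hours_to_train

# Hypothetical figures for one training job -- not real AWS/NVIDIA prices.
gpu_cost = training_cost(instance_price_per_hour=32.0, hours_to_train=100)   # $3,200
npu_cost = training_cost(instance_price_per_hour=21.0, hours_to_train=100)   # $2,100

savings = 1 - npu_cost / gpu_cost
print(f"GPU-based run: ${gpu_cost:,.0f}")
print(f"NPU-based run: ${npu_cost:,.0f}")
print(f"Savings: {savings:.0%}")  # ~34% cheaper, i.e. ~1.5x better price-performance
```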
Additionally, the geopolitical landscape is playing a role. U.S. export restrictions on high-end
NVIDIA GPUs to China have spurred Chinese tech giants (Alibaba, Baidu, Huawei, etc.) to
accelerate their own AI chip programs. For example, Alibaba's T-Head unit and Baidu's Kunlun
chip are homegrown NPUs aimed at competing with NVIDIA's offerings in the Chinese market,
and Huawei's Ascend AI processors similarly target cloud and edge AI workloads without
relying on foreign tech. As these domestic NPUs improve, Chinese companies will increasingly
use them over NVIDIA, altering the global market-share mix.
It’s also notable that NPUs are penetrating new markets that GPUs never dominated. Tiny
NPUs on microcontrollers now enable AI in wearables and IoT sensors (where a GPU could
never fit). At the high end, exotic solutions like analog NPUs and optical AI chips are being
researched to push beyond the limitations of digital GPUs. All this suggests that NPUs and
other specialized accelerators will capture a growing slice of the AI silicon pie in coming
years.
Financial Impact: Implications of NPU Rise on NVIDIA
and Competitors
The rise of NPUs and custom AI chips carries significant financial implications for NVIDIA
and the broader semiconductor industry. NVIDIA’s current dominance of the AI chip market
translates to substantial revenue and profit, but any erosion of its market share or taming of
growth could alter its trajectory in the coming years. Here we analyze the impact in terms of
market share shifts, revenue projections, and strategic responses:

NVIDIA's Market Share and Potential Erosion: NVIDIA today enjoys a near-monopoly in
cutting-edge AI accelerators. Estimates show that NVIDIA's AI GPUs and accelerators
command between 70% and 95% of the market for AI chips (appleinsider.com). This
remarkable share means NVIDIA not only captures the bulk of AI hardware sales but also
wields pricing power, contributing to its high margins. However, industry forecasts suggest
that this dominance will be chipped away as specialized ASICs and NPUs gain adoption. One
projection is that by 2027, AI-specific ASICs (which include NPUs) will make up about 13% of
AI accelerator sales, and by 2030 they could reach roughly 15% of the market (opentools.ai).
While GPUs (mostly NVIDIA's) would still hold about 85% share in that scenario, it does
indicate a material shift of billions of dollars of revenue toward other players. Even a 10–15%
loss in market share for NVIDIA could translate into significant opportunity cost, given the
rapidly growing AI hardware market. To put numbers in perspective, the AI chip market is
expected to more than double from 2024 to 2027, reaching well over $100 billion in annual
size. If NVIDIA were on track to capture, say, 80% of that but ends up with 60%, the difference
would be tens of billions in revenue by the end of the decade.
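
A quick back-of-the-envelope calculation, restating that scenario with the text's own round numbers (a roughly $100 billion annual market, 80% versus 60% share), shows why the gap amounts to tens of billions of dollars per year; the figures are scenario inputs, not forecasts.

```python
def revenue_at_share(market_size_billion: float, share: float) -> float:
    """Revenue (in $B) captured at a given market share."""
    return market_size_billion * share

# Scenario from the text: a >$100B annual AI chip market by ~2027.
market_size = 100.0  # $B, conservative end of the projection

rev_80 = revenue_at_share(market_size, 0.80)   # if NVIDIA held 80% share
rev_60 = revenue_at_share(market_size, 0.60)   # if share erodes to 60%

print(f"At 80% share: ${rev_80:.0f}B/yr")
print(f"At 60% share: ${rev_60:.0f}B/yr")
print(f"Forgone revenue: ${rev_80 - rev_60:.0f}B/yr")  # ~$20B per year
```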
One immediate pressure point is the hyperscalers' behavior. The large cloud providers
collectively account for a substantial portion of NVIDIA's data-center GPU sales. If each of
the major ones (AWS, Google, Meta, Microsoft, and potentially Apple for its own use) shifts
even a fraction of their AI workloads to in-house chips, NVIDIA's growth could slow. For
example, every AWS customer that opts to train on a Trainium instance instead of NVIDIA
A100s means fewer NVIDIA chips sold to Amazon. Google's extensive TPU deployment has
likely already reduced how many GPUs Google buys (GPUs are still used by Google for
some tasks, but TPUs handle much of the inference for Google services). Meta in 2023 was
reportedly buying large volumes of NVIDIA GPUs for its AI expansion, but by 2025–2026
Meta's own chips could start handling some workloads, reducing future orders. One tech
industry analysis noted that while custom ASICs from hyperscalers have struggled to attract
external buyers, the real threat to NVIDIA is these hyperscalers shifting their internal
demand away from NVIDIA over time (techinvestments.io). In other words, even if Google
never sells TPUs to others, every TPU that Google deploys is one less GPU that NVIDIA can
sell to Google. This trend is not catastrophic for NVIDIA in the near term (since demand far
outstrips supply at the moment), but over a horizon of several years it could cap the upside
and lead to a plateau in revenue from top cloud customers. Financial analysts have started to
factor this in: Bank of America noted that as custom chips gain more traction at hyperscalers,
it could become challenging for competitors like AMD and even NVIDIA to meet aggressive
growth expectations (reuters.com).
Impact on Competitors (AMD, Intel, others): NVIDIA's main traditional competitor in GPUs
is AMD. AMD is attempting to gain share in the AI accelerator market with its MI300-series
GPU accelerators (and by leveraging its acquisition of Xilinx for adaptive AI solutions).
However, the same NPU trend threatens AMD's prospects too. If customers move to in-house
ASICs, AMD doesn't benefit – it loses potential share just as NVIDIA does. AMD recently had a
stock dip partly because its own AI chip sales were slow, and analysts pointed out that
hyperscalers favor either NVIDIA or their own chips, leaving AMD in a tough spot
(reuters.com). Intel, meanwhile, has invested in alternative architectures (like the Gaudi NPUs
and its upcoming GPU lines). Should the NPU/ASIC trend accelerate, Intel could find a foothold
by manufacturing or designing those custom chips (Intel, for instance, might win business by
producing someone else's NPU, given its foundry aspirations). But Intel's direct AI chip
products would face the same headwinds as NVIDIA's if customers go custom. The broader
chip industry could see a fragmentation of the AI accelerator market: instead of two or three
big vendors sharing the pie, we might have a landscape where each major consumer of AI
compute has its own silicon. This is reminiscent of large tech firms designing their own CPUs
(like Apple's M1/M2 or Amazon's Graviton) – the suppliers (Intel and AMD, in that case) lost
some volume but adapted by focusing on other markets or becoming manufacturing partners.
Revenue and Margin Pressure: In the short term, NVIDIA is actually enjoying increased
demand and can sell every AI GPU it makes; the competitive threats are more about the
next 3-5 years. As NPUs and competitor GPUs (like AMD’s) become available, we expect
price competition to emerge. NVIDIA’s hefty margins (its data center GPUs often carry
70%+ gross margin) could face pressure if buyers have viable alternatives. Already, some
startups and researchers opt for cloud TPUs because they can be more cost-effective for
certain workloads compared to renting NVIDIA GPUs. If AMD’s MI300X or Intel’s Gaudi3 can
undercut NVIDIA’s price for similar performance, NVIDIA might be forced to adjust pricing or
offer added value (such as software, support, or bundle deals including its networking gear).
Moreover, the proliferation of AI chip startups increases the risk of technology leaps. If one
of these companies delivers a truly breakthrough NPU that is widely superior in efficiency, it
could force NVIDIA to respond (possibly by incorporating similar techniques or even by
acquiring the threat). NVIDIA has huge R&D resources – its annual R&D spending is in the
billions – and it’s likely already exploring NPU-like designs (for example, NVIDIA’s own
Tensor Cores added in its GPUs are essentially NPU-style matrix units inside the GPU). So
far, NVIDIA’s strategy has been to absorb the benefits of specialization into its GPU line,
thus keeping customers in-house. This strategy will likely continue (we might see future
NVIDIA products that blur the line between GPU and NPU, or specialized NVIDIA chips for
inference). Financially, that means NVIDIA could sustain revenue by selling new types of
chips, but the mix of products might shift (perhaps fewer giant expensive GPUs and more
lower-cost accelerators, depending on market demand).
Strategic Responses from NVIDIA: NVIDIA is not standing still in face of the NPU trend.
Recognizing the importance of a full-stack solution, NVIDIA has been expanding beyond just
chips – into software, services, and even systems. For example, NVIDIA offers AI software
frameworks (for healthcare, robotics, etc.) that run best on its hardware, increasing customer
lock-in. It also launched cloud services like NVIDIA GPU Cloud (NGC) and partnerships to
provide GPU rental as a service. If NPUs from others start making inroads, NVIDIA could
respond in a few ways:
1. Offering its own specialized chips – NVIDIA might introduce ASIC-like
accelerators for specific clients. (Notably, NVIDIA has a programmable NPU called
the Deep Learning Accelerator (DLA) embedded in some of its system-on-chips for
automotive and robotics, showing it is capable of NPU design.)
2. Emphasizing a heterogeneous platform – Much like how NVIDIA now sells the
DGX platform (which includes GPUs, CPUs, networking), they could integrate NPUs
or FPGAs into their platforms if that’s what customers need, essentially becoming a
one-stop AI systems provider.
3. Maintaining software supremacy – By ensuring that developing, optimizing, and
deploying AI models is always easiest on NVIDIA’s platform, they can blunt the
appeal of alternative hardware. The difficulty of programming new NPUs is a barrier
for competitors. NVIDIA’s CUDA and library ecosystem still gives it a strong
defensive position (ibm.com). NVIDIA will likely continue investing in software (e.g., CUDA advancements, AI
model optimization toolkits, etc.) so that even if a competitor’s chip is theoretically
faster, many customers might stick with NVIDIA for the productivity and support
benefits.
4. Exploring new markets – Just as Cisco shifted towards software, security, and
services after its hardware growth leveled off, NVIDIA could leverage its AI expertise
to move into new revenue streams. This could include AI cloud services (NVIDIA
might one day offer its own AI-as-a-service directly), or deeper moves into industries
like automotive (where it provides not just chips but full self-driving software stacks to
carmakers). Such diversification could mitigate the impact if pure chip sales slow.
Outlook – NVIDIA and "the Next Cisco" Question: Financially, could NVIDIA's trajectory
mirror Cisco's decline? In terms of stock dynamics, some parallels are evident – a meteoric
rise followed by the risk of overvaluation. If the AI fervor cools or hits an adoption wall,
NVIDIA's stock could correct much as Cisco's did in the early 2000s. Already, market
historians point out that Cisco's P/E ratio and hype in 2000 were unsustainably high, and
they draw comparisons to NVIDIA's valuation now (hardingloevner.com). However, the actual
business prospects may be more resilient for NVIDIA. AI demand is poised to keep growing
for many years, whereas the dot-com infrastructure build-out peaked rapidly and then leveled
off for a time. Even if NVIDIA's share of the pie becomes smaller, the pie itself (the AI
hardware market) is exploding in size – from roughly $73 billion in 2024 toward an estimated
several hundred billion within a decade (coolest-gadgets.com). So NVIDIA could still grow its
revenues significantly even if NPUs claim a portion of the market, just as Cisco continued to
grow sales post-2000, albeit at a slower pace (hardingloevner.com). The key difference will be
whether NVIDIA can retain a leadership position (technologically and financially) or becomes
just one of many players in a more commodity-like market. If AI accelerators become
commoditized (as networking gear eventually did), margins will shrink and NVIDIA's growth
will look much more Cisco-like. That scenario is plausible in the long term: as AI matures,
more standardized or open solutions could emerge, reducing differentiation. Conversely, if
NVIDIA stays at the cutting edge – analogous to how Apple has managed to remain a leader
in successive tech waves – it may avoid Cisco's fate and instead continue as a cornerstone of
the AI era.
Conclusion
NVIDIA’s current dominance in AI hardware is unmistakable – it enjoys a Cisco-in-1999 level
of market leadership and acclaim. The rise of Neural Processing Units and custom AI chips,
however, presents both a challenge and an evolutionary force in the market. Historically,
Cisco’s climb and subsequent decline underscore how quickly technology leadership can be
upended when market conditions shift and competitors (or customers) seize on new
paradigms. There are clear parallels in NVIDIA’s situation: a booming market drawing new
entrants, potential overreliance on one product line, and customers seeking cheaper, tailored
solutions. NPUs are at the center of this potential shift – offering specialized efficiency that
could peel away portions of NVIDIA’s empire.
In technological terms, NPUs represent the next stage of optimization for AI workloads, and
their advantages in power and performance for neural networks are driving broad adoption
from edge devices to cloud data centers. The market landscape now features a growing
roster of NPU and AI accelerator players, from industry giants like Google, Amazon, and
Apple building in-house chips to startups pursuing radical new designs. The fact that nearly
every major tech firm is developing its own AI processor is a testament to how strategic AI
hardware has become – much as networking hardware was strategic during the internet
boom.
For NVIDIA and its competitors, the financial and strategic implications are profound. NVIDIA
could indeed see its growth temper as NPUs capture emerging demand, similar to how
Cisco saw its stratospheric rise cool off in the 2000s. Market share gains by NPUs and other
ASICs imply a more fragmented future, and NVIDIA will have to innovate relentlessly – not
just in chip performance, but in ecosystem and strategy – to retain its dominance. Unlike
Cisco, which arguably pivoted too slowly after its peak, NVIDIA has the opportunity to learn
from that history: to balance its short-term success with long-term adaptation. We may well
see NVIDIA itself introduce more specialized processors or double down on software-driven
differentiation to avoid the pitfalls of commoditization.
In summary, the NPU market is rapidly evolving and is set to reshape the AI hardware
landscape in the coming years. NVIDIA’s trajectory could mirror Cisco’s historical decline if
the company is caught flat-footed by this shift or if exuberant expectations outpace reality.
However, the story is not predestined. By understanding the parallels with Cisco but also
leveraging key differences (such as software ecosystems and a still-growing TAM for AI),
NVIDIA might navigate this transition and remain a central player. What is certain is that the
competition in AI hardware is intensifying: GPUs will no longer be the only option, and NPUs
are moving from niche to mainstream. For users and the industry, this competition is likely to
spur faster innovation and better efficiency, ensuring that the next chapter of AI technology –
much like the post-Cisco chapter of networking – will be even more impactful than the last.
Sources:
● Harding Loevner (2023) – "NVIDIA and the Cautionary Tale of Cisco Systems" (hardingloevner.com)
● IBM – "NPU vs. GPU: What's the Difference?" (ibm.com)
● Pure Storage Blog – "NPU vs. GPU" (blog.purestorage.com)
● SNS Insider (Globe Newswire, 2024) – "Neural Processor Market…Driven by AI" (globenewswire.com)
● LightReading (GlobalData, 2024) – "Hyperscalers forge ahead with in-house semiconductor solutions…" (lightreading.com)
● TechTarget (2024) – "10 top AI hardware and chip-making companies" (techtarget.com)
● AppleInsider (2024) – "Apple's bad blood with Nvidia…" (appleinsider.com)
● Broadcom AI News (2023) – "ASICs expected to constitute 15% of AI accelerator sales by 2030" (opentools.ai)
● Reuters (2023) – Analyst comment on custom chips' traction and impact on AMD/NVIDIA (reuters.com)
● Harding Loevner (2023) – Cisco vs. NVIDIA stock and revenue history (hardingloevner.com)

Dr Biplab Pal, PhD

CEO, Robolytics LLC
Robolytics – Breakthrough Solutions for High-Risk Environments

Adjunct Professor, CARDS & IS, UMBC
AI/IoT/Sensor Product & IP Consultant
