Nvidia’s Developer Conference has been a major event in the GPU industry for more than a decade. With the company’s GPUs having become the leading AI chips over the past two years, the 2024 Nvidia GPU Technology Conference (GTC) attracted not just the traditional, engineering-centric attendees but many Wall Street analysts as well.
The AI GPU trend accelerated Nvidia’s market value and revenue tremendously in the past two years, and it is now one of the three most valuable companies in the world based on market capitalization. Nvidia’s revenue grew from US$10 billion in 2018 to US$61 billion in 2023, a spike that explains the influx of Wall Street analysts at GTC 2024.
The growth of generative AI since the launch of ChatGPT has accelerated the importance of GTC presentations and product announcements. This article summarizes Nvidia’s key announcements at GTC 2024, offering a perspective that considers Nvidia’s history, notably the evolution of its GPU chip architecture and technology; the importance of Nvidia’s CUDA software technology; and why and how Nvidia’s GPUs became a key technology for the AI industry.
History of Nvidia’s GPU architecture
The microarchitectures that Nvidia designs for each GPU generation offer significant improvements with each new version. Multiple GPU chips are designed based on each microarchitecture over a period of a few years. Each GPU chip has functionality that improves performance for specific application segments—initially for PC graphics and game consoles and recently for autonomous-vehicle functions and AI-centric software and systems.
Thus far, the company has introduced a total of 16 GPU microarchitectures, all named after famous inventors and scientists across multiple disciplines. The first was the Fahrenheit architecture, released in 1995. Nvidia’s latest GPU microarchitecture, Blackwell, debuted in March 2024.
Table 1 shows how Nvidia’s GPUs have improved since 1995. The table includes all 16 GPU microarchitectures, with a focus on the latest versions. Maxwell, Nvidia’s ninth GPU architecture, released in 2014, was the first to be used in the automotive industry, in 2015.
The first Fahrenheit-based GPU had 1 million transistors and was based on a 500-nm manufacturing process technology. By 1999, the technology enabled GPU chips with 15 million transistors.
Tesla, the sixth Nvidia GPU microarchitecture, debuted with 210 million transistors in 2006 and reached 1.4 billion transistors by 2010. During its four-year manufacturing run, the Tesla GPU’s process node shrank from 90 nm to 40 nm, even as the transistor count grew 6.7×.
The first Nvidia GPU microarchitecture to offer more than 1 billion transistors at introduction was Maxwell, which provided 2.9 billion transistors in 2014. Maxwell was also the first Nvidia GPU to be used in automotive applications. Each new generation has added computing power through accelerators that perform specific calculations in parallel, increasing the performance of GPU-based chips.
As AI became the driving force behind GPU improvements, the transistor count rose to more than 28 billion for the Ampere GPU in 2020 and to 80 billion for the Hopper GPU in 2022. More parallel and specialized computing functions are key to improving performance for AI software. The Hopper GPU microarchitecture had especially extensive improvements for accelerating AI computing. Hopper was the first GPU with a transformer engine for acceleration of AI training models.
In March 2024, Nvidia announced Blackwell as the latest GPU generation, with 208 billion transistors. Product shipment is due later this year. The 2024 Blackwell GPU chip will more than double Hopper’s performance and transistor count. However, much of the transistor increase is due to the use of two chip dies connected by a 10-TB/s chip-to-chip interconnect and operating as a unified single GPU. Each die has 104 billion transistors, a 30% increase over Hopper’s 80 billion transistors. The Blackwell architecture can link up to 576 GPUs in a single NVLink domain, and it includes a second-generation transformer engine.
TSMC has manufactured the vast majority of Nvidia’s chips, though IBM and Samsung have manufactured some chips for Nvidia in the past 20 years. Samsung was a fab partner from 2016 to 2022. SGS-Thomson Microelectronics was an early Nvidia fab partner.
CUDA platform
GPUs were originally designed for image manipulation and the calculation of local image properties. The mathematical foundations of neural networks and image manipulation are similar, which created a big opportunity for GPUs as AI chips. In the mid-2010s, GPUs evolved to enable deep learning for training and inference in many applications, including autonomous vehicles.
Nvidia defined a GPU programming interface called CUDA (which stands for Compute Unified Device Architecture, though Nvidia uses the acronym only) and released the first version in mid-2007. CUDA version 12.4 was released in March 2024. CUDA is a parallel computing platform and application programming interface that allows software to use GPUs for accelerated general-purpose processing. Nvidia has used “accelerated computing” to define its product strategy for about a decade.
The CUDA API is an extension of the C programming language that adds the ability to specify software-level parallelism in C and to specify GPU-device–specific operations. CUDA is also a software layer that gives direct access to the GPU’s virtual instruction set and parallel computational elements. CUDA is designed to work with programming languages like C, C++, Fortran and Python. This accessibility makes it easier to do parallel programming via the many processors and accelerators provided in GPU chips. Many corporations program their AI applications in CUDA to get maximum performance from Nvidia GPU systems. This has built a broad CUDA software base and a large population of programmers with expertise in developing software for Nvidia’s GPUs.
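A minimal sketch illustrates the C extensions described above. The kernel qualifier, the launch syntax and the memory-management calls are standard CUDA; the array size and values here are arbitrary, chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __global__ marks a function that runs on the GPU. Each thread
// handles one array element, expressing the parallelism directly in C.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified (managed) memory is accessible from both CPU and GPU.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // The <<<blocks, threads>>> launch syntax is CUDA's core extension
    // to C: it maps the work onto thousands of GPU threads at once.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The same pattern, a device kernel plus a host-side launch, underlies the far larger kernels in AI libraries, which is why expertise in this model transfers across the entire CUDA software base.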
Intel’s continued leadership in processor chips for the PC market is attributable to the software base that was built around IBM’s PC standard in the 1980s and 1990s. Intel has leveraged that software to maintain dominance in PC chips for more than 40 years, but it did not transfer this software advantage to later computer segments like smartphones and tablets.
Nvidia has a similar software base, built around CUDA, that has given the company a dominant share of the GPU market for over a decade. Nvidia retained its leadership during the graphics card era and through the video processing era as well. Now, the CUDA software base and parallel processing capabilities are the leaders for AI model development and AI application deployments. This AI software base advantage will persist for some time, even with increased competition in the future.
The question is whether Nvidia can extend its software base advantages to new AI applications and other new application segments as they emerge. Nvidia has embraced and transitioned to new CUDA opportunities before and is likely to do so again. Another question is whether the Unified Acceleration (UXL) Foundation will have an impact. The UXL Foundation was launched in September 2023 to develop a CUDA alternative and has an impressive membership roster, including Arm, Fujitsu, Google Cloud, Intel, Qualcomm, Samsung and others.
The formation of UXL looks like an acknowledgement that CUDA is the clear leader and that no single company can compete with CUDA’s software base and momentum. Only a foundation with broad industry participation may be able to build a CUDA competitor.
How Nvidia became the leader in AI
We have already discussed two reasons for Nvidia’s rise to AI dominance: GPU technology advances for AI calculations and the CUDA software platform for writing code for parallel execution of many streams of information. The technology advances of GPUs have been very rapid—from 1 million transistors on a GPU in 1995 to 208 billion transistors in 2024. This functionality and performance growth was key to enabling the rapid expansion and complexity of AI models. There are at least two more factors: building momentum in graphics- and video-related GPU applications, and understanding and reacting to the potential of AI as a major future GPU market.
On all counts, Nvidia’s management gets credit for crafting and implementing a successful strategy.
Nvidia made a foresighted investment in the late 2000s to set up CUDA programming classes at most of the major universities worldwide—at Nvidia’s cost. The bold gambit paid off; it was part of the creation of a market concept and enabled the later growth of the CUDA software base.
Nvidia’s early realization that AI had great potential was critical. The company first focused on autonomous vehicles (AVs) as an AI opportunity and included GPU functionality for speeding up AI calculations. The AV market was delayed, but generative AI came along to take its place, and Nvidia was ready to ride this market growth. The rest is history.
Another key factor is that Nvidia became as much a software company as a chip design company. Today, Nvidia is more of a software company with highly skilled AI GPU chip designers. Nvidia has developed a large AI software base, including software development tools, AI libraries and AI foundational models for a large spectrum of AI applications.
Through its AI software development activity, Nvidia gained an understanding of what GPU hardware architecture and processing accelerators were needed for AI performance, especially for the AI training phase.
This knowledge base was reinforced and amplified by building AI systems for cloud system operators and similar customers. All of this has given Nvidia the expertise to rapidly improve its AI-focused GPUs and the systems for AI training and inferencing. In the last two GPU architecture upgrades, the focus was on improving AI performance for training and inferencing. AI application improvements will be the focus of future GPU chip architectures.
Apple is known for gaining major advantages from designing both its hardware and software into a better system experience for its users. Nvidia is reaping similar advantages from its activities in chip and hardware design and the CUDA software platform with additional AI software development.
Key announcements at GTC 2024
GTC 2024 was primarily a technical conference for GPU developers, as evidenced by its more than 900 technical sessions and over 20 workshops. More than 300 exhibitors showcased their hardware, software and services focused on GPU-based market opportunities.
The many announcements at GTC 2024 covered all of Nvidia’s product lines. Table 2 summarizes most of the announcements.
Blackwell GPU
Topping the list of announcements was the Blackwell GPU, which Nvidia claims is the most powerful processor chip available. Nvidia is positioning Blackwell as the next generation of accelerated computing and as the processor for the generative AI era. Blackwell has a second-generation transformer engine with support for FP4–FP6 data types in the Tensor cores. Most generative AI models need primarily low-precision calculations, and Blackwell performance is greatly improved under such conditions.
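To see why low precision matters, consider a rough sketch of 4-bit floating point. An FP4 (e2m1) format can represent only 16 values, so quantizing a weight means snapping it to the nearest of them, in exchange for a 4× reduction in memory and bandwidth versus FP16. The value table below follows the e2m1 definition used in industry microscaling formats; the code is an illustrative host-side sketch, not Nvidia’s implementation.

```cuda
#include <cstdio>
#include <cmath>

// The 8 non-negative values representable in FP4 (e2m1);
// with a sign bit, that is 16 values in total.
static const float FP4_VALUES[8] = {0.0f, 0.5f, 1.0f, 1.5f,
                                    2.0f, 3.0f, 4.0f, 6.0f};

// Snap x to the nearest representable FP4 value (illustrative only).
float quantize_fp4(float x) {
    float mag = fabsf(x), best = FP4_VALUES[0];
    for (int i = 1; i < 8; i++)
        if (fabsf(mag - FP4_VALUES[i]) < fabsf(mag - best))
            best = FP4_VALUES[i];
    return x < 0.0f ? -best : best;
}

int main() {
    // A weight of 2.7 snaps to 3.0; -0.6 snaps to -0.5.
    printf("%g -> %g\n", 2.7, quantize_fp4(2.7f));
    printf("%g -> %g\n", -0.6, quantize_fp4(-0.6f));
    return 0;
}
```

In practice a per-block scale factor maps tensor values into this small range before rounding; managing that scaling automatically is the kind of work the transformer engine performs so that most generative AI calculations can run at low precision without loss of model quality.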
The new GPU architecture is named after David Harold Blackwell. A University of California, Berkeley mathematician specializing in game theory and statistics, he was the first Black scholar inducted into the National Academy of Sciences.
Blackwell GPUs include a dedicated engine for reliability, availability and serviceability (RAS). The RAS feature is especially important for automotive and other systems in which a failure can lead to a loss of life. Blackwell also adds chip-level capabilities for AI-based preventative maintenance that runs diagnostics and forecasts reliability issues. This improves uptime and resiliency, allowing massive-scale AI deployments to run uninterrupted for weeks or months at a time, and reduces operating costs.
Nvidia said Blackwell is being adopted by every major global cloud services provider, pioneering AI companies, system and server vendors and regional cloud service providers. Nvidia believes Blackwell will be the most successful product launch in its history. The cloud players need high-performance systems and connections based on Blackwell; some of these are summarized below.
DGX SuperPOD
Nvidia announced its next-generation AI supercomputer: the Nvidia DGX SuperPOD. Powered by Nvidia GB200 Grace Blackwell Superchips, it can process trillion-parameter models for large-scale generative AI training and inference workloads. It provides 11.5 exaFLOPS using FP4 data types and 240 TB of fast memory, and it can scale to higher performance with additional racks of DGX systems.
NVLink Switch chip
The NVLink Switch chip is Nvidia’s high-speed network link for connecting multiple GPU processors and systems. It is a complex chip with 50 billion transistors, manufactured by TSMC using 4-nm design rules. Each NVLink Switch can connect four NVLink interconnects at 1.8 TB/s each.
NVLink Switch and GB200 are key components for creating giant GPUs. The Nvidia GB200 NVL72 is a multi-node, liquid-cooled, rack-scale system that harnesses Blackwell to offer supercharged compute for trillion-parameter models with 720 petaFLOPS of AI training performance and 1.4 exaFLOPS of AI inference performance in a single rack.
NIM runtime software
Nvidia Inference Microservices (NIM) are secure software packages built from Nvidia’s accelerated computing libraries and generative AI models. The microservices support standard APIs to connect and work across Nvidia’s CUDA installed base. They are re-optimized for new GPUs and are scanned for security vulnerabilities and exposures.
Nvidia is launching a new type of NIM-based biosystem software. The company rolled out more than two dozen microservices that will allow healthcare enterprises to leverage the latest advances in generative AI.
Omniverse Cloud API
The main goal of Omniverse is to bring AI to the physical world. Nvidia announced that Omniverse Cloud will be available as APIs to extend the reach of the Omniverse platform for creating industrial digital twin applications and workflows across many software ecosystems.
The five new Omniverse Cloud APIs enable developers to integrate core Omniverse technologies into existing design and automation software applications for digital twins. This provides simulation workflows for testing and validating autonomous machines like robots and self-driving vehicles.
6G Research Cloud
In telecom, Nvidia announced the Nvidia 6G Research Cloud, a generative AI and Omniverse-powered platform to advance the next communications era. It’s built with Nvidia’s Sionna neural radio framework, the Nvidia Aerial CUDA-accelerated radio access network and the Nvidia Aerial Omniverse Digital Twin for 6G. This offering should help telecom developers test and simulate the many options for defining the technologies and features of 6G.
Weather prediction
Nvidia announced the availability of its Earth Climate Digital Twin, a cloud platform that enables interactive, high-resolution simulation to accelerate climate and weather predictions at a 2-km scale.
Nvidia also announced new Earth-2 cloud APIs on Nvidia DGX Cloud. These will allow users to create AI-powered emulations to speed the delivery of interactive, high-resolution simulations of phenomena ranging from global atmospheric conditions and local cloud cover to storms and other events. Earth-2’s APIs offer AI models and a new Nvidia generative AI model, called CorrDiff, using state-of-the-art diffusion modeling to generate 12.5× higher-resolution images than are possible with current numerical models. Compared with the current models, the Earth-2 API models are 1,000× faster and 3,000× more energy-efficient.
Blackwell Thor
For the automotive industry, Drive Thor was the key announcement at GTC 2024. Drive Thor incorporates the Blackwell GPU architecture, which is designed for transformer and generative AI applications. These capabilities will be important for future AV and ADAS functionality.
There is also a Jetson Thor version, which targets robotics applications and is also likely to be used in some automotive segments.

Summary
Nvidia made more announcements at GTC 2024 than at any of its previous conferences. The Blackwell GPU is the most important of those introductions, as it will spawn many new products, some announced at the conference and others likely to appear next year.
Nvidia’s GTC 2024 was an important event that shows Nvidia’s strategy and product direction for the next couple of years. It looks like Nvidia will continue its domination in GPU-based hardware and CUDA-based software for existing AI software segments and use cases over that time period. Nvidia will face more competition and will lose some market share, albeit of a much larger pie. Expect Nvidia to retain its AI leadership for a long time to come as it leverages its synergistic chip-hardware-software products and services.
https://www.eetimes.eu/nvidia-gtc-2024-why-nvidia-dominates-ai/



