Up To 50 PFLOPs With HBM4, Vera CPU With 88 Olympus Cores, And Delivers 5x Uplift Vs Blackwell
NVIDIA is formally announcing its Rubin AI platform today, which will be the heart of next-gen Data Centers, with a 5x upgrade over Blackwell.
NVIDIA To Dominate The AI Markets With Its Rubin Platform: Six Chips, One 50 PFLOPs & HBM4-Powered GPU, Vera CPU With 88 Olympus Cores & Impressive Uplifts Versus Blackwell, Now In Full Production
Today, NVIDIA is officially announcing its Rubin platform, which comes as a bit of a surprise since most of us were expecting the update at the company's already-announced GTC event. With the exciting developments in the AI segment and all the AI talk going around at CES, NVIDIA decided to unveil its grand AI platform a little early.

NVIDIA’s Rubin platform is going to be made up of a total of six chips, all of which are back from fabs and in NVIDIA’s labs for testing. These chips include:
- Rubin GPU (with 336 Billion Transistors)
- Vera CPU (with 227 Billion Transistors)
- NVLINK 6 Switch for Interconnect
- CX9 & BF4 for Networking
- Spectrum-X 102.4T CPO for silicon photonics

Together, these chips bring the Rubin platform to life inside a range of DGX, HGX, and MGX systems. At the heart of each data center is the NVIDIA Vera Rubin Superchip, featuring two Rubin GPUs, one Vera CPU, and massive amounts of memory in HBM4 and LPDDR5X configurations. The highlights of the NVIDIA Rubin technology include:
- 6th Gen NVLink (3.6 TB/s Scale-Up)
- Vera CPU (Custom Olympus Core)
- Rubin GPU (50 PF NVFP4 Transformer Engine)
- 3rd Gen Confidential Computing (First Rack-Scale TEE)
- 2nd Gen RAS Engine (Zero Downtime Health Checks)

Starting with the Rubin GPU, this chip features two reticle-sized dies, each packed with compute and tensor cores. The chip is designed purely for AI-intensive workloads, offering 50 PFLOPs of NVFP4 inference and 35 PFLOPs of NVFP4 training performance, 5x and 3.5x increases over Blackwell, respectively. The chip is also equipped with HBM4 memory, offering up to 22 TB/s of bandwidth per chip, a 2.8x increase versus Blackwell, along with 3.6 TB/s of NVLink bandwidth per GPU, a 2x increase versus Blackwell.
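The stated uplift ratios also tell us what per-GPU Blackwell baselines NVIDIA is comparing against. A quick back-of-the-envelope check (the Blackwell numbers below are derived from the article's own ratios, not official figures):

```python
# Per-GPU Rubin figures as stated in the announcement.
RUBIN_INFERENCE_PF = 50    # NVFP4 inference, PFLOPs
RUBIN_TRAINING_PF = 35     # NVFP4 training, PFLOPs
RUBIN_HBM4_TBPS = 22       # HBM4 bandwidth, TB/s

# Dividing each figure by its claimed uplift factor yields the
# implied Blackwell per-GPU baseline.
implied_blackwell = {
    "inference_pflops": RUBIN_INFERENCE_PF / 5.0,   # 5x uplift
    "training_pflops":  RUBIN_TRAINING_PF / 3.5,    # 3.5x uplift
    "hbm_tbps":         RUBIN_HBM4_TBPS / 2.8,      # 2.8x uplift
}

for name, value in implied_blackwell.items():
    print(f"{name}: {value:.1f}")
```

Both compute ratios point back to a consistent 10 PFLOPs NVFP4 baseline per Blackwell GPU, which suggests the comparison is like-for-like rather than cherry-picked.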

For the Vera CPU, NVIDIA has designed a next-gen custom Arm architecture codenamed Olympus. The chip packs 88 cores and 176 threads (with NVIDIA Spatial Multi-Threading), a 1.8 TB/s NVLink-C2C coherent memory interconnect, 1.5 TB of system memory (3x Grace), 1.2 TB/s of memory bandwidth with SOCAMM LPDDR5X, and rack-scale confidential compute. These combine to offer 2x the data processing, compression, and CI/CD performance of Grace.

NVLink 6 switches provide the scale-up fabric of the Rubin platform with 400G SerDes, 3.6 TB/s of per-GPU all-to-all bandwidth, 28.8 TB/s of total bandwidth, 14.4 TFLOPS of in-network FP8 compute, and a 100% liquid-cooled design.

Networking is powered by the latest ConnectX-9 and BlueField-4 modules. The ConnectX-9 SuperNIC offers 1.6 Tb/s of bandwidth with 200G PAM4 SerDes, programmable RDMA and a data path accelerator, top-level security, and is optimized for massive-scale AI.

The BlueField-4 is an 800G DPU that serves as a SmartNIC and storage processor. It integrates a 64-core Grace CPU with ConnectX-9 and offers 2x the networking capabilities of BlueField-3, along with 6x the compute and 3x the memory bandwidth.

All of these come together in the NVIDIA Vera Rubin NVL72 rack, which offers some impressive uplifts versus Blackwell as detailed below:
- 5x NVFP4 Inference (3.6 EFLOPS)
- 3.5x NVFP4 Training (2.5 EFLOPS)
- 2.5x LPDDR5x Capacity (54 TB)
- 1.5x HBM4 Capacity (20.7 TB)
- 2.8x HBM4 Bandwidth (1.6 PB/s)
- 2x Scale-Up Bandwidth (260 TB/s)
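These rack-level figures fall straight out of the per-GPU numbers quoted earlier, assuming the 72 GPUs implied by the NVL72 name. A quick multiplication reproduces each one (values are rounded in NVIDIA's marketing):

```python
# 72 Rubin GPUs per NVL72 rack (the "72" in the product name).
GPUS_PER_RACK = 72

# Per-GPU figures from the Rubin GPU section above.
per_gpu = {
    "nvfp4_inference_pflops": 50,
    "nvfp4_training_pflops": 35,
    "hbm4_bandwidth_tbps": 22,
    "nvlink_scaleup_tbps": 3.6,
}

# Scale each figure to the full rack.
rack = {name: value * GPUS_PER_RACK for name, value in per_gpu.items()}

print(f"Inference: {rack['nvfp4_inference_pflops'] / 1000:.1f} EFLOPS")  # 3.6
print(f"Training:  {rack['nvfp4_training_pflops'] / 1000:.2f} EFLOPS")   # 2.52
print(f"HBM4 BW:   {rack['hbm4_bandwidth_tbps'] / 1000:.2f} PB/s")       # 1.58
print(f"Scale-up:  {rack['nvlink_scaleup_tbps']:.0f} TB/s")              # 259
```

Training lands at 2.52 EFLOPS and scale-up at 259.2 TB/s, which NVIDIA rounds to 2.5 EFLOPS and 260 TB/s in the list above.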
NVIDIA is also announcing its Spectrum-X Ethernet Co-Packaged Optics solution, which offers a 102.4 Tb/s scale-out switch infrastructure with co-packaged 200G silicon photonics and delivers 95% effective bandwidth at scale. NVIDIA claims the system is 5x more efficient and 10x more reliable, and offers 5x higher application runtime.

For its Rubin SuperPOD, NVIDIA is also unveiling the Inference Context Memory Storage platform, which is built for gigascale inference and is fully integrated with NVIDIA software solutions such as Dynamo, NIXL & DOCA.

To wrap it all up, NVIDIA will be putting its Rubin platform in its bleeding-edge DGX SuperPOD with 8 Vera Rubin NVL72 racks. That's not all: there's also the NVIDIA DGX Rubin NVL8 for mainstream data centers.

With all of these advancements, NVIDIA Rubin offers a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train MoE models versus Blackwell GB200. The Rubin ecosystem is backed by a diverse range of partners and is in full production, with customers getting the first chips later this year.
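To put those two headline ratios in concrete terms, here is an illustrative sketch that applies them to a hypothetical GB200 deployment (the input numbers are made up for illustration; only the 10x and 4x factors come from NVIDIA's claims):

```python
def rubin_equivalent(blackwell_gpus, blackwell_cost_per_mtok):
    """Scale a hypothetical GB200 deployment by NVIDIA's claimed ratios."""
    return {
        # 4x fewer GPUs to train the same MoE model.
        "train_gpus": blackwell_gpus / 4,
        # 10x lower cost per million inference tokens.
        "cost_per_mtok": blackwell_cost_per_mtok / 10,
    }

# Hypothetical: a 1,024-GPU GB200 cluster at $2.00 per million tokens.
print(rubin_equivalent(1024, 2.00))
# -> {'train_gpus': 256.0, 'cost_per_mtok': 0.2}
```

In other words, if the claims hold, a workload that needed a 1,024-GPU Blackwell cluster would train on 256 Rubin GPUs, and each served token would cost a tenth as much.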

