NVIDIA’s GB200 NVL72 Racks Deliver a Staggering 28× Performance Uplift Over AMD’s MI355X

NVIDIA’s Blackwell GB200 NVL72 AI racks have been tested in a Mixture of Experts (MoE) environment, and according to a new report, they outperform AMD’s Instinct MI355X by a huge margin.

NVIDIA’s “Extreme Co-Design” Approach Gives the Company an Upper Hand in MoE Architectures, Widening the Gap with AMD

AI models are shifting rapidly toward an MoE-focused landscape, mainly because the architecture allows much more efficient use of compute resources; however, scaling MoE models up introduces a massive communication bottleneck compared to dense models. Because an MoE model routes each token through separate sub-networks labeled as ‘experts’, it requires tremendous all-to-all communication and data transfer between nodes, which introduces latency issues and bandwidth pressure. Hyperscalers are seeking the best performance-per-dollar solution available, and according to an analysis by Signal65, NVIDIA’s GB200 NVL72 is the go-to option for MoE architectures.
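To see where that all-to-all traffic comes from, here is a minimal, illustrative Python sketch of top-k expert routing. Every size in it (token count, expert count, hidden dimension) is a made-up assumption for demonstration, not a figure from the report.

```python
import numpy as np

# Minimal sketch of MoE top-k routing, showing why expert parallelism
# generates all-to-all traffic. All sizes here are illustrative
# assumptions, not figures from the benchmark report.

NUM_TOKENS = 8    # tokens in a batch
NUM_EXPERTS = 4   # expert sub-networks, one per (hypothetical) GPU
TOP_K = 2         # each token is routed to its top-2 experts
HIDDEN = 16       # model hidden dimension

rng = np.random.default_rng(0)
tokens = rng.standard_normal((NUM_TOKENS, HIDDEN))
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS))

# Router: score every token against every expert, keep the top-k.
scores = tokens @ router_w
topk = np.argsort(scores, axis=1)[:, -TOP_K:]

# Dispatch: group tokens by destination expert. On a real cluster each
# group is shipped to the GPU hosting that expert, and the results are
# gathered back afterwards; that exchange is the all-to-all step that
# stresses interconnect bandwidth and latency.
for expert in range(NUM_EXPERTS):
    routed = [t for t in range(NUM_TOKENS) if expert in topk[t]]
    print(f"expert {expert}: receives tokens {routed}")
```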

Quoting benchmarks from SemiAnalysis’s InferenceMAX, the report notes that NVIDIA’s Blackwell AI servers delivered 28 times higher throughput per GPU (75 tokens/sec) than AMD’s MI355X in a similar cluster configuration. If you are curious why the performance difference is so significant, NVIDIA has answered this before: to address the bottlenecks involved in scaling MoE models, the company employs an ‘extreme co-design’ approach, pairing the GB200’s 72-GPU configuration with 30 TB of fast shared memory. This lets NVIDIA take expert parallelism to a whole new level.
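To make that scale argument concrete, here is a hedged back-of-the-envelope sketch of expert parallelism across a 72-GPU NVLink domain. Only the 72-GPU domain size comes from the article; the 256-expert count is a hypothetical stand-in for a modern MoE model, and the round-robin placement is one simple strategy, not NVIDIA’s actual scheduler.

```python
# Back-of-the-envelope sketch of expert parallelism in a 72-GPU domain.
# The 256-expert count and round-robin placement are assumptions for
# illustration; only the 72-GPU domain size comes from the article.

GPUS_PER_DOMAIN = 72
NUM_EXPERTS = 256  # hypothetical MoE expert count

# Spread experts round-robin so each GPU stores only a small slice of
# the expert weights instead of the whole set.
placement = {expert: expert % GPUS_PER_DOMAIN for expert in range(NUM_EXPERTS)}

experts_per_gpu: dict[int, list[int]] = {}
for expert, gpu in placement.items():
    experts_per_gpu.setdefault(gpu, []).append(expert)

max_load = max(len(hosted) for hosted in experts_per_gpu.values())
print(f"experts hosted per GPU: at most {max_load} of {NUM_EXPERTS}")

# Because all 72 GPUs share one NVLink fabric with a large pooled
# memory, a token routed to any expert stays on the fast interconnect
# rather than crossing slower inter-node networking.
```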

Interestingly, AI economics ultimately come down to which architecture delivers the better TCO figures. According to Signal65, citing data from Oracle’s Cloud pricing, NVIDIA’s GB200 NVL72 racks offer a whopping 1/15th the relative cost per token at a higher interactivity rate, which is one reason NVIDIA’s hardware stack is among the most widely adopted out there. And since NVIDIA operates on an annual product cadence, it manages to dominate every new AI frontier that opens up (inference, prefill, decode, and more), allowing it to maintain its lead.
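The cost-per-token claim is straightforward arithmetic: divide the hourly price of the hardware by the tokens it generates per hour. The sketch below shows that relationship with placeholder numbers; they are not the Oracle Cloud prices or InferenceMAX results that Signal65 used.

```python
# Hedged sketch of cost-per-token arithmetic. The dollar figures and
# throughputs are placeholders, not Signal65's Oracle Cloud data.

def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_sec: float) -> float:
    """USD to generate one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_sec * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

# Illustrative comparison: a pricier GPU that sustains far higher
# throughput can still come out well ahead on cost per token.
cheap_slow = cost_per_million_tokens(gpu_hourly_usd=4.0, tokens_per_sec=10)
pricey_fast = cost_per_million_tokens(gpu_hourly_usd=8.0, tokens_per_sec=280)
print(f"slow GPU: ${cheap_slow:.2f} per 1M tokens")
print(f"fast GPU: ${pricey_fast:.2f} per 1M tokens")
```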

Of course, these figures aren’t a comprehensive representation of the AMD vs. NVIDIA debate in the AI space, given that Team Red has yet to introduce a newer generation of rack-scale offerings. The Instinct MI355X is known to be an aggressive option in memory-dense environments, thanks to its high HBM3E capacity. However, when it comes to MoE workloads alone, NVIDIA currently dominates, and with future rack-scale solutions (AMD’s Helios vs. NVIDIA’s Vera Rubin), the competition will only intensify.

https://wccftech.com/nvidia-gb200-nvl72-racks-deliver-a-28x-performance-uplift-over-amd-mi355x/amp/