An incomplete list of the compute, storage, and networking stories everyone’s been talking about in 2024
As the AI bubble continues to inflate, there’s been no shortage of news to report on in the world of compute, storage, and networking in 2024.
From startup funding announcements to quantum computing deployments, government sanctions, new hardware releases, and AI compute clusters on a scale we’ve never seen before, trying to round up all the headlines from the last 12 months is a feat that may challenge even the most powerful supercomputer.
That being said, there have been a few stand-out stories from the year that we can expect to continue well into 2025 and perhaps beyond. So, in case you missed them the first time around, here are some selected highlights (or lowlights) from the year.
Intel’s annus horribilis
Of the many companies that were covered by DCD this year, arguably none have had a drama-filled 12 months quite like Intel.
For the chipmaker, 2024 started with the announcement of an ambitious plan to expand its global chip fabrication footprint and carve out its Product and Foundry lines into two separate businesses, and ended with the chipmaker laying off 15,000 employees, stalling construction on its proposed fabs, and announcing the ‘retirement’ of its CEO, Pat Gelsinger.
Following successive quarters where the company reported billions of dollars in losses, Gelsinger’s departure was perhaps an inevitability but has left many in the industry wondering what’s next for Intel.
Gelsinger first joined the company as a teenage technician back in 1979, leaving in 2009 for two executive stints at EMC and VMware, before rejoining Intel as CEO in 2021.
Upon his appointment as Intel’s chief executive, he promised to return the company to its former glories with a five-year strategy he dubbed IDM (integrated device manufacturing), which would see the company manufacturing its own cutting-edge chips as well as supplying components to third parties.
However, while the ambition was admirable, the execution did not match. Intel has been criticized for missing the boat regarding the AI boom, while its Gaudi 3 accelerator has been described as difficult to use. Software problems also meant the company failed to reach its target of $500 million in Gaudi 3 sales for the year.
For now, Intel’s future remains uncertain. Although construction at many of its fabs outside the US remains stalled, the company did successfully negotiate a funding agreement, albeit a reduced one, under the US CHIPS and Science Act.
Speaking at the Barclays technology conference in San Francisco on December 12, the company’s interim co-CEO Dave Zinsner said no decision had yet been made regarding a formal separation of the company’s factory and product development divisions. His fellow interim co-CEO, MJ Holthaus, told attendees that, moving forward, Intel would focus on developing more generic accelerator offerings that would make the company more competitive.
Whatever happens, Holthaus said Intel will remain focused on “…building world-class products and a world-class foundry, we’re still highly invested in doing that and those two things together will help differentiate us in the marketplace.”
She said: “At the end of the day, if we build products that allow our customers to win, we’ll win.”
Will anyone be able to usurp Nvidia?
Where Intel has struggled, Nvidia has continued to soar, unveiling Blackwell, its most powerful GPU offering to date, in March before reaching a $3 trillion market cap in June and twice overtaking Apple to become the world’s most valuable company.
However, while Nvidia continues to have a stranglehold on the market – the waiting list for Blackwell hardware is now pushing 12 months, with new orders unlikely to be filled until late 2025 – one might start to wonder if the company that has driven much of the AI boom might have started to fly a bit too close to the sun.
At launch, the news that its B100, B200, and GB200 offerings would be liquid-cooled and operate at between 700W and 1,200W certainly got people talking.
However, Blackwell has been plagued with issues. First, an unexpected production error reported in August saw Nvidia announce it would be pushing deliveries back to early 2025. Then reports started to emerge that the AI processors were overheating when linked together in 72-chip data center racks. The GB200 NVL72 configuration houses 72 Blackwell GPUs, 36 Grace CPUs, and nine NVLink switch trays, each of which has two NVLink switches.
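For readers who want to sanity-check the rack composition described above, the following is a minimal Python sketch that tallies the component counts reported in this article. The class and field names (RackConfig, switch_trays, and so on) are illustrative assumptions and do not correspond to any Nvidia tooling. Only the four input figures come from the text; everything else is simple arithmetic on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """Hypothetical container for the rack figures quoted in the article."""
    gpus: int               # GPUs per rack
    cpus: int               # Grace CPUs per rack
    switch_trays: int       # NVLink switch trays per rack
    switches_per_tray: int  # NVLink switches in each tray

    @property
    def total_switches(self) -> int:
        # Trays multiplied by switches per tray.
        return self.switch_trays * self.switches_per_tray

    @property
    def gpus_per_cpu(self) -> float:
        # Ratio of GPUs to CPUs in the rack.
        return self.gpus / self.cpus

    @property
    def gpus_per_switch(self) -> float:
        # Ratio of GPUs to NVLink switches; a count ratio only,
        # not a statement about how the links are actually wired.
        return self.gpus / self.total_switches


# Figures as reported above: 72 GPUs, 36 CPUs, nine trays, two switches each.
NVL72 = RackConfig(gpus=72, cpus=36, switch_trays=9, switches_per_tray=2)

if __name__ == "__main__":
    print(NVL72.total_switches)   # 18
    print(NVL72.gpus_per_cpu)     # 2.0
    print(NVL72.gpus_per_switch)  # 4.0
```

The ratios here are descriptive only. They say nothing about power draw, cooling, or the overheating reports, which depend on details not covered in this article.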
Despite the challenges with Blackwell, Nvidia has already announced the product’s successor – Rubin – and earlier this year, Nvidia CEO Jensen Huang said that the company’s updated roadmap will see it launch a new product family every year.
While a shiny new product launch might be what’s keeping Nvidia occupied right now, it should also be noted that hot GB200s might not be the only fires the company is fighting.
In August, the US Department of Justice (DOJ) launched two separate antitrust probes into the GPU giant, evaluating whether the company has abused its market dominance and forced companies to buy additional products to receive GPUs while penalizing those that buy rival chips.
The DOJ is also reportedly looking into the company’s $700 million acquisition of Run:ai in April 2024 and its 2022 purchase of software firm Bright Computing.
Nvidia’s French offices were raided in 2023 and it’s believed the French Autorité de la concurrence (Competition Authority) is considering raising anti-competition charges against the company. The UK and EU are looking into AI competition risks more broadly. It’s unlikely Nvidia will be toppled any time soon. However, attention should be paid to its competitors in 2025.
AMD has had a very successful 2024, posting record data center revenues of $3.5 billion for Q3 2024. CEO Dr. Lisa Su told analysts on a revenue call that the company’s MI325X will “compete very well with H200 and the MI350 series will compete very well with Blackwell,” adding that it should also be assumed that the company was “working with all the large customers out there.”
And who knows, maybe Intel will make that comeback after all…
xAI brings whole new meaning to supercomputer
This year saw Elon Musk, the world’s richest man, become the keeper of the world’s largest GPU cluster – the 100,000-strong Colossus supercomputer that has made its home amongst many a disgruntled local in Memphis, Tennessee.
And the good news for all those unhappy campers is that in October, Musk announced plans to double the cluster’s compute capacity, a figure that was then superseded in December when the city’s Chamber of Commerce claimed xAI intends to expand Colossus to one million GPUs.
Currently, the 750,000 sq ft (69,677 sqm) former Electrolux plant houses 100,000 liquid-cooled Nvidia H100 GPUs and relies on the Nvidia Spectrum-X Ethernet networking platform for its Remote Direct Memory Access (RDMA) network. Servers have been provided by Supermicro and Dell.
xAI claims the Colossus supercomputer, which is being used to train and run the company’s AI chatbot Grok, was assembled in 122 days, with Musk announcing the cluster had gone live on July 22, 2024.
While the arrival of xAI has been hailed by business leaders in Memphis as the largest multi-billion dollar investment in the city’s history, campaigners have voiced concerns about the amount of power granted to the facility by grid operator Tennessee Valley Authority, as well as its impact on air quality in the city.
Will the CHIPS Act survive a Trump presidency?
President Biden turned Oprah Winfrey this year, handing out preliminary agreements to numerous semiconductor companies in the hopes of shoring up the US’ domestic semiconductor industry.
The $280 billion CHIPS and Science Act was approved by Congress in July 2022, with $52 billion of the overall funding package designated as subsidies for US semiconductor manufacturers. Funding from the act has also been earmarked for semiconductor R&D, growing a skilled semiconductor workforce, and incentives for the manufacturing of semiconductors and specialized tooling equipment.
Billions of dollars under the CHIPS and Science Act have been preliminarily allocated to companies this year, with recipients including GlobalFoundries, Intel, TSMC, Samsung Electronics, Micron, SK Hynix, and GlobalWafers.
However, despite Biden’s attempts to secure these agreements in the dying days of his administration, the future of the CHIPS Act is now uncertain as the US, and the world, gears up for a second Trump presidency.
In the run-up to the election, President-elect Trump criticized the CHIPS and Science Act, saying that the government should have levied tariffs on the semiconductor industry instead of handing out grants and loans to chip companies.
Prior to the November election, House Speaker Mike Johnson also said the Republican party “probably will” try to repeal the US CHIPS Act, a statement he later walked back, claiming he meant to say the party would instead “further streamline and improve the primary purpose of the bill.”
Trump has also previously said that Taiwan should pay the US to protect it from China and accused the country of taking “all [its] chip business.”
While the CHIPS Act was designed to future-proof the United States semiconductor industry, the US currently buys 92 percent of its leading-edge chips from TSMC in Taiwan, meaning any disruption to that supply chain would have a significant impact on the US economy and data center market.
https://www.datacenterdynamics.com/en/analysis/nvidia-wins-intel-loses-and-xai-becomes-a-colossus-a-year-in-compute/