AMD announces 288 GB Instinct MI325X GPU in challenge to Nvidia

Will enable servers to handle a one-trillion-parameter AI model in its entirety

AMD plans to release a new Instinct datacentre GPU later this year with significantly more high-bandwidth memory than either its MI300X chip or Nvidia's H200, enabling servers to handle larger generative AI models than before.

At Computex 2024 in Taiwan on Monday, AMD was expected to reveal the Instinct MI325X GPU. Set to arrive in the fourth quarter, it will provide a substantial upgrade in memory capacity and bandwidth over the MI300X, which became one of AMD's "fastest-ramping" products to date after launching in December.

Whereas the MI300X sports 192 GB of HBM3 high-bandwidth memory and a memory bandwidth of 5.3 TBps, the MI325X features up to 288 GB of HBM3e and 6 TBps of bandwidth, according to AMD. Eight of these GPUs will fit into what's called the Instinct MI325X Platform, which has the same architecture as the MI300X platform that goes into servers designed by OEMs.

The chip designer said the MI325X has multiple advantages over Nvidia's H200, which was expected to start shipping in the second quarter as the successor to the H100.

For one, the MI325X's 288-GB capacity is more than double the H200's 141 GB of HBM3e, and its memory bandwidth is 30% faster than the H200's 4.8 TBps, according to AMD.

The company said the MI325X's peak theoretical throughput is 2.6 petaflops for 8-bit floating point (FP8) and 1.3 petaflops for 16-bit floating point (FP16). These figures are 30% higher than what the H200 can accomplish, AMD said.

In addition, the MI325X enables servers to handle a one-trillion-parameter model in its entirety, double the size of what's possible with the H200, according to the company.
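As a rough sanity check on that claim, the arithmetic can be sketched in a few lines of Python. The sketch rests on assumptions that do not come from AMD: eight GPUs per server for both vendors, model weights stored in FP16 (2 bytes per parameter), decimal gigabytes, and no memory set aside for activations or inference caches. The function name `max_params` is purely illustrative.

```python
# Back-of-the-envelope estimate: how many parameters fit in a server's total HBM?
# Assumptions (not from AMD): 8 GPUs per server, FP16 weights (2 bytes each),
# decimal gigabytes, and no memory reserved for activations or caches.

GPUS_PER_SERVER = 8
BYTES_PER_PARAM_FP16 = 2
GB = 10**9


def max_params(hbm_gb_per_gpu, gpus=GPUS_PER_SERVER,
               bytes_per_param=BYTES_PER_PARAM_FP16):
    """Upper bound on the parameter count whose weights fit in total HBM."""
    total_bytes = hbm_gb_per_gpu * GB * gpus
    return total_bytes // bytes_per_param


mi325x = max_params(288)  # 288 GB of HBM3e per GPU
h200 = max_params(141)    # 141 GB of HBM3e per GPU

print(f"MI325X x8: {mi325x / 1e12:.2f} trillion parameters")
print(f"H200 x8:   {h200 / 1e12:.2f} trillion parameters")
print(f"Ratio:     {mi325x / h200:.2f}x")
```

Under those assumptions, an eight-GPU MI325X server has room for the weights of a model slightly above one trillion parameters, while an eight-GPU H200 server tops out at a little over half that. Real deployments need headroom beyond the weights themselves, so these are upper bounds rather than practical limits.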

AMD's road map: new datacentre GPU every year

AMD announced the details as part of a newly disclosed plan to release a new datacentre GPU every year starting with the MI325X, which, like the MI300X, uses the company's CDNA 3 architecture that is expressly designed for datacentre applications.

In an extended road map, AMD said it will release the MI325X later this year. In 2025, it will release a next-generation Instinct GPU that will use its CDNA 4 architecture to provide increased compute performance and "memory leadership," according to AMD. A successor GPU using a next-generation CDNA architecture is due in 2026.

Like the MI325X, next year's CDNA 4-based Instinct GPU will come with 288 GB of HBM3e. The chip will be manufactured using a 3-nanometer process, a substantial shrink from the 5nm and 6nm nodes used for MI300 chips, and will add support for 6-bit and 4-bit floating point data formats.

Andrew Dieckmann, head of AMD's datacentre GPU business, said the chip designer's datacentre GPU efforts have already gained support from multiple OEMs and cloud service providers, including Dell Technologies, Lenovo, Hewlett Packard Enterprise, Microsoft and Oracle. Another significant supporter is Facebook parent company Meta.

He also pointed out that, with the MI300X, AMD has built a solid foundation of support for popular generative AI models like OpenAI's GPT-4, Meta's Llama 3 and those from Mistral AI. The company has also demonstrated its commitment to open source innovation with its ROCm software, which supports more than 700,000 models hosted on Hugging Face, frameworks and libraries like PyTorch and TensorFlow, and OpenAI's Triton programming language.

"We're not resting on our laurels with the MI300X, and [we're] continuing to push the innovation forward at what we believe will be a very competitive pace and allow us to keep a leadership position in some of the key metrics that we've been able to establish with the MI300X product," Dieckmann said in a briefing with journalists.

AMD going against Nvidia's Blackwell GPUs

While AMD focused on the H200 for its competitive comparisons with the MI325X, the company will have to contend with the fact that Nvidia plans to release a more powerful generation of datacentre GPUs using the new Blackwell architecture later this year.

Nvidia plans to release the Blackwell-based GPUs as part of a new strategy announced last year to release accelerator chips every year instead of every two years.

Despite Nvidia's accelerated road map plans, Dieckmann said AMD feels good about "having a strong competitive position" against those products with the MI325X and the CDNA 4-based Instinct GPU that will follow in 2025.

"There's a bit of an interplay between the timing of our road map and their road map, but CDNA 4, it's a significant move forward in all dimensions of our competitiveness," he said.

This article first appeared on CRN.