AMD (AMD) officially launched its Instinct MI355X GPU accelerator on Wednesday, showcasing a large jump in both compute power and energy demand as it competes with Nvidia's Blackwell Ultra B300.
The MI355X is built on AMD's new CDNA 4 architecture and is optimized for AI inference. With support for FP4, FP6, FP8, and FP16 precision, the MI355X delivers up to 20.1 PFLOPS in FP4/FP6 workloads and 10.1 PFLOPS in FP8. On the quoted FP4 figures, that puts it roughly a third ahead of Nvidia's B300, which is rated at 15 PFLOPS.
To support this performance, the MI355X consumes 1,400W peak, nearly doubling the 750W required by its predecessor, the MI300X. AMD expects some users may still air-cool the chip, but liquid cooling is the standard.
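For readers who want to check the comparisons, the ratios can be recomputed from the figures quoted in the article. This is a minimal sketch; the variable and function names are illustrative, and the inputs are the vendor-quoted peak numbers, not measured results.

```python
# Figures as quoted in the article (vendor peak numbers, not benchmarks).
MI355X_FP4_PFLOPS = 20.1
B300_FP4_PFLOPS = 15.0
MI355X_PEAK_W = 1400
MI300X_PEAK_W = 750


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


fp4_lead = ratio(MI355X_FP4_PFLOPS, B300_FP4_PFLOPS)
power_increase = ratio(MI355X_PEAK_W, MI300X_PEAK_W)

print(f"MI355X FP4 throughput vs B300: {fp4_lead:.2f}x")
print(f"MI355X peak power vs MI300X: {power_increase:.2f}x")
```

On these inputs the FP4 advantage works out to about 1.34x and the power increase to about 1.87x, which is consistent with "nearly doubling."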
The GPU includes 288 GB of HBM3E memory with bandwidth reaching 8 TB/s. A scaled 8-way configuration brings system-level performance to 161 PFLOPS (FP4) and 80.5 PFLOPS (FP8).
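The 8-way figures follow from multiplying the per-GPU numbers by eight. The sketch below assumes ideal linear scaling and uses illustrative names; it also assumes the second system-level figure refers to FP8, which matches the per-GPU FP8 number quoted earlier.

```python
# Per-GPU figures as quoted in the article.
PER_GPU_PFLOPS = {"FP4": 20.1, "FP8": 10.1}
PER_GPU_HBM_GB = 288
GPU_COUNT = 8  # the 8-way configuration


def aggregate(per_gpu: dict, count: int) -> dict:
    """Scale each per-GPU figure by the GPU count (assumes linear scaling)."""
    return {precision: value * count for precision, value in per_gpu.items()}


system_pflops = aggregate(PER_GPU_PFLOPS, GPU_COUNT)
system_hbm_gb = PER_GPU_HBM_GB * GPU_COUNT

for precision, value in system_pflops.items():
    print(f"{precision}: {value:.1f} PFLOPS across {GPU_COUNT} GPUs")
print(f"HBM3E: {system_hbm_gb} GB across {GPU_COUNT} GPUs")
```

This gives 160.8 PFLOPS for FP4, in line with the quoted 161. The FP8 product is 80.8 rather than the quoted 80.5, a gap that is presumably down to rounding in the per-GPU figure. Total memory comes to 2,304 GB.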
While raw compute marks a win on paper, AMD still trails Nvidia in deployment scale and software ecosystem. Pegatron is reportedly preparing a 128-way MI350X system, but Nvidia remains dominant in large-scale AI training clusters.
AMD Chief Technology Officer Mark Papermaster said zettascale supercomputing by 2035 will require processors consuming up to 2,000W each. He projected that future AI systems may need nuclear-scale power, up to 500 MW per machine.