Written by: Chris Raymond, AI GPU Product Marketing Manager, AMD
Contributions by: Miro Hodak, Principal Member of Technical Staff, AI Performance Engineering, AMD
This is a sponsored article brought to you by AMD.
The latest MLPerf™ 5.1 Training results mark an important milestone: the first MLPerf Training submission using AMD Instinct™ MI350 Series graphics processing units (GPUs). These new benchmarks demonstrate significant performance gains and broad ecosystem participation across some of today’s most demanding AI training workloads.
This submission also represents the first time the AMD Instinct MI350 Series, including both the MI355X and MI350X GPUs, has been publicly benchmarked for AI training. The results show clear progress in scalability, efficiency and compute performance, underscoring how the AMD Instinct MI350 Series is accelerating the development of next-generation AI models.
Up to 2.8X performance gain for AI training.
The new AMD Instinct™ MI350 Series GPUs deliver breakthrough performance, achieving up to 2.8X faster time-to-train than the AMD Instinct™ MI300X and 2.1X faster than the AMD Instinct MI325X platform.
These gains reflect architectural improvements, HBM3E memory bandwidth leadership and AMD ROCm™ 7.1 software optimizations that enhance kernel performance and communication efficiency. Together, they enable faster model fine-tuning and improved energy efficiency across large-scale generative AI workloads.
Competitive performance across industry benchmarks.
The AMD Instinct™ MI355X platform also delivers competitive training performance across leading generative AI workloads when compared to the average of competitor partner submissions in the MLPerf™ 5.1 Training round. On the Llama 2-70B LoRA (FP8) benchmark, the MI355X completed training in 10.18 minutes, closely matching the performance of competing systems.
AMD continues to focus optimization efforts on FP8 training, the datatype most widely adopted by customers today and best suited for large-scale, high-accuracy model training, while also advancing FP4 algorithmic development to make the format usable in real-world scenarios in the near future.
Record-setting ecosystem participation.
The latest MLPerf™ 5.1 Training round also demonstrated record-level ecosystem participation across the AMD Instinct™ platform family — including the MI300X, MI325X, MI350X and MI355X GPUs.
In total, nine key partners submitted training results on AMD Instinct hardware, marking the broadest industry engagement to date for AMD in MLPerf Training. What makes this achievement especially noteworthy is that every partner submission was that partner’s first on the new MI355X platform, yet all results landed within just one percent of AMD’s own submissions on the same benchmarks.
AMD ROCm™ 7.1 software: high-performance, scalable and efficient AI training.
AMD ROCm™ 7.1 is the software engine behind all MLPerf™ 5.1 training submissions on AMD Instinct™ GPUs, enabling the high performance, scalability and efficiency seen across every AMD-based result. This latest ROCm release delivers end-to-end advancements across the stack — from kernel and compiler optimizations to communication efficiency and framework integration — designed to accelerate real-world workloads and improve scalability across multi-node systems.
The consistency and performance achieved across AMD Instinct™ MI355X partner submissions all trace back to AMD ROCm software. This release demonstrates that software innovation is every bit as critical as silicon performance. Together, they form the foundation of efficient, scalable and production-ready AI training.
Final takeaway: advancing AI training leadership on an annual cadence.
The MLPerf™ 5.1 Training results mark a defining moment for the AMD Instinct™ MI350 Series, showcasing breakthrough generational performance, strong competitive positioning and record ecosystem participation — all powered by the open and rapidly evolving ROCm™ 7.1 software platform.
Behind these results is a deliberate and steady innovation rhythm. The AMD Instinct roadmap continues to advance on an annual cadence — from the MI300X in 2023 to the MI325X in 2024, and now to the MI350 Series in 2025 — delivering new levels of compute density, memory bandwidth and software optimization with each generation. Looking ahead, the MI450 Series and next-generation CDNA™ architecture are already positioned to extend this momentum into 2026 and beyond.
To learn more about AMD’s training results and performance, read the full blog post on the AMD blog or contact the TD SYNNEX AMD team at AMDBranded@tdsynnex.com.
1. https://mlcommons.org/benchmarks/training/.
2. Calculations by AMD Performance Labs in May 2025, based on the published memory capacity specifications of the AMD Instinct™ MI350X/MI355X OAM 8xGPU platform versus an NVIDIA Blackwell B200 8xGPU platform. Server manufacturers may vary configurations, yielding different results. MI350-021.
3. Based on calculations by AMD as of April 17, 2025, using the published memory specifications of the AMD Instinct MI350X/MI355X GPUs (288GB) versus the MI300X (192GB) and MI325X (256GB). Calculations performed with the FP16 precision datatype at two bytes per parameter to determine the minimum number of GPUs (based on memory size) required to run the following large language models (LLMs): OPT (130B parameters), GPT-3 (175B parameters), BLOOM (176B parameters), Gopher (280B parameters), PaLM 1 (340B parameters), Generic LM (420B, 500B, 520B and 1.047T parameters), Megatron-LM (530B parameters), LLaMA (405B parameters) and Samba (1T parameters). Results based on GPU memory size versus the memory required by the model at the defined parameter counts, plus 10% overhead. Server manufacturers may vary configurations, yielding different results. Results may vary based on GPU memory configuration, LLM size and potential variance in GPU memory access or the server operating environment. All data based on the FP16 datatype; for FP8, multiply by 2; for FP4, multiply by 4. MI350-012. (A minimal sketch of this sizing arithmetic appears after these notes.)
4. The MLPerf name and logo are registered and unregistered trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use is strictly prohibited. For more information, contact AMDBranded@tdsynnex.com.
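To make the sizing method in note 3 concrete, here is a minimal Python sketch of that arithmetic: model weights at roughly two bytes per parameter in FP16, plus the note's 10% overhead, divided by per-GPU memory. The names here (GPU_MEMORY_GB, min_gpus) are illustrative only, not an AMD tool or API.

import math

# Illustrative sketch of the GPU-count arithmetic described in note 3 (MI350-012).
# Hypothetical helper, not an AMD tool; per-GPU HBM capacities taken from the note.
GPU_MEMORY_GB = {"MI300X": 192, "MI325X": 256, "MI350X/MI355X": 288}

def min_gpus(params_billions: float, gpu_mem_gb: int, bytes_per_param: float = 2.0) -> int:
    """Minimum GPUs (by memory size alone) to hold a model's weights plus 10% overhead."""
    model_gb = params_billions * bytes_per_param    # FP16 is ~two bytes per parameter
    return math.ceil(model_gb * 1.10 / gpu_mem_gb)  # add the note's 10% overhead

# Example: a GPT-3-class model (175B parameters) at FP16
for gpu, mem_gb in GPU_MEMORY_GB.items():
    print(f"{gpu}: {min_gpus(175, mem_gb)} GPU(s) minimum")
# Per the note, FP8 halves bytes_per_param (1.0) and FP4 quarters it (0.5),
# doubling or quadrupling the model size a given memory footprint can hold.

As the note's own caveats suggest, this counts memory capacity only; real deployments also depend on activation memory, parallelism strategy and the server operating environment.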