At Computex 2024, AMD Chair and CEO Dr Lisa Su outlined an ambitious expansion of the AMD Instinct accelerator roadmap, featuring a series of new products aimed at bolstering the company’s leadership in AI innovation within data centres. Highlighting a commitment to an annual update cycle, AMD envisions significant advances in AI performance and memory capabilities with each new generation.
The roadmap commences with the AMD Instinct MI325X accelerator, slated for availability in the fourth quarter of 2024. This model is expected to feature up to 288GB of HBM3E memory, offering substantial memory capacity and bandwidth advantages over its competitors. Subsequent developments include the AMD Instinct MI350 series, powered by the new AMD CDNA 4 architecture, which is anticipated to become available in 2025. According to AMD, this series will bring a 35-fold increase in AI inference performance compared to the previous MI300 series.
Looking further ahead, the AMD Instinct MI400 series, based on the AMD CDNA Next architecture, is expected to be released in 2026. The new architectures promise to deliver significant performance and efficiency improvements for large-scale AI training and inference tasks.
Brad McCredie, AMD’s corporate vice president for Data Center Accelerated Compute, noted that the MI300X accelerators have already seen substantial adoption from partners and customers such as Microsoft Azure, Meta, Dell Technologies, HPE, Lenovo, and others. McCredie highlighted the exceptional performance and value proposition offered by these accelerators. “With our updated annual cadence of products, we are relentless in our pace of innovation, providing the leadership capabilities and performance the AI industry and our customers expect,” he stated.
The company also addressed the maturation of its AI software ecosystem. The AMD ROCm 6 open software stack is integral to driving the MI300X accelerators’ performance, particularly for popular large language models (LLMs). For instance, a server equipped with eight AMD Instinct MI300X accelerators running ROCm 6 and Meta Llama-3 70B demonstrated 1.3x better inference performance and token-generation throughput than competing products. Similarly, a single AMD Instinct MI300X accelerator running ROCm 6 achieved 1.2x better inference performance and token-generation throughput on the Mistral-7B model compared to its rivals.
AMD further emphasised their collaboration with Hugging Face, a major repository for AI models, which is now testing 700,000 of its popular models nightly to ensure compatibility with AMD Instinct MI300X accelerators. In addition, AMD continues to contribute to popular AI frameworks such as PyTorch, TensorFlow, and JAX, aiming to enhance the user experience and performance of its accelerators.
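For readers working with these frameworks, ROCm support in PyTorch is exposed through the same `torch.cuda` interface as CUDA builds, with `torch.version.hip` set only on ROCm builds. As a minimal sketch (assuming PyTorch may or may not be installed in the environment), one can check which backend a given PyTorch installation targets:

```python
# Minimal sketch: identify whether an installed PyTorch build targets ROCm.
# On ROCm builds, torch.version.hip is a version string; CUDA builds instead
# set torch.version.cuda. Neither attribute implies a device is present.
def detect_pytorch_backend() -> str:
    try:
        import torch
    except ImportError:
        return "pytorch-not-installed"
    if getattr(torch.version, "hip", None):
        return "rocm"
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu-only"

print(detect_pytorch_backend())
```

On a ROCm build, device code still uses the familiar `torch.cuda` calls (for example `torch.cuda.is_available()`), since the HIP runtime is mapped onto the same API surface.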
During the Computex keynote, AMD revealed the new products as part of its annual cadence roadmap designed to meet increasing AI compute demands. Notably, the forthcoming MI325X accelerator will include 288GB of HBM3E memory and a peak memory bandwidth of 6 terabytes per second, employing the industry-standard Universal Baseboard server design shared with the MI300 series. The device is projected to outperform its competitors in both memory capacity and bandwidth, as well as overall compute performance.
The MI350 series, expected in 2025, will leverage the new AMD CDNA 4 architecture and utilise advanced 3nm process technology while supporting the FP4 and FP6 AI datatypes. With up to 288GB of HBM3E memory, these accelerators are expected to continue AMD’s tradition of pushing the envelope in AI compute capabilities.
In 2026, AMD plans to introduce the MI400 series, built upon the AMD CDNA Next architecture. These accelerators will include the latest features to unlock new levels of performance and efficiency essential for large-scale AI training and inference.