AMD today announced a new addition to its Instinct MI200 family of accelerators. With the new model, officially named the Instinct MI210, AMD aims to bring exascale-class technologies to mainstream HPC and AI users.
Based on the CDNA2 compute architecture, which is built for heavy HPC and AI workloads, the card has 104 Compute Units (CUs) for a total of 6656 Stream Processors (SPs). At a peak clock speed of 1700 MHz, the card can deliver 181 TFLOPS of peak half-precision (FP16) compute, 22.6 TFLOPS of peak single-precision (FP32) compute, and 22.6 TFLOPS of peak double-precision (FP64) compute. For single-precision (FP32) matrix operations, the card can reach a peak of 45.3 TFLOPS. The INT4 and INT8 formats peak at 181 TOPS, and the bfloat16 format peaks at 181 TFLOPS.
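The 22.6 TFLOPS vector figure can be reproduced from the CU count and clock speed quoted above. The following is a minimal sketch, not AMD's published method: it assumes 64 SPs per CU (which is what 6656 / 104 works out to) and one fused multiply-add, counted as two floating-point operations, per SP per clock. The constant and function names are illustrative.

```python
# Figures quoted in the article.
COMPUTE_UNITS = 104
PEAK_CLOCK_HZ = 1700e6  # 1700 MHz

# Assumptions, not stated in the article:
# 64 stream processors per CU (consistent with 6656 / 104), and
# one fused multiply-add (2 floating-point operations) per SP per clock.
SPS_PER_CU = 64
FLOPS_PER_SP_PER_CLOCK = 2


def peak_vector_tflops(compute_units, clock_hz):
    """Return the theoretical peak vector throughput in TFLOPS."""
    stream_processors = compute_units * SPS_PER_CU
    return stream_processors * FLOPS_PER_SP_PER_CLOCK * clock_hz / 1e12


stream_processors = COMPUTE_UNITS * SPS_PER_CU
vector_tflops = peak_vector_tflops(COMPUTE_UNITS, PEAK_CLOCK_HZ)

print(stream_processors)        # total SPs on the card
print(round(vector_tflops, 1))  # peak vector TFLOPS, rounded to one decimal
```

Under these assumptions the result rounds to the quoted 22.6 TFLOPS. The quoted 45.3 TFLOPS matrix figure is about twice that value, and the 181 TFLOPS FP16 and bfloat16 figures are about eight times it.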
The card uses a 4096-bit memory interface to connect 64 GB of HBM2e memory to the compute silicon. Total memory bandwidth is 1638.4 GB/s, with the memory modules running at 1.6 GHz. ECC is supported throughout the chip. AMD offers the Instinct MI210 as a PCIe 4.0 card. It is rated at a 300 W TDP and is passively cooled.
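The quoted memory bandwidth follows from the bus width and memory clock. Here is a rough sketch of that arithmetic, assuming the memory transfers data twice per clock cycle (double data rate), which the article does not state explicitly. The names are illustrative.

```python
# Figures quoted in the article.
BUS_WIDTH_BITS = 4096
MEMORY_CLOCK_HZ = 1.6e9  # 1.6 GHz

# Assumption, not stated in the article: two transfers per clock (double data rate).
TRANSFERS_PER_CLOCK = 2


def peak_bandwidth_gb_per_s(bus_width_bits, clock_hz, transfers_per_clock):
    """Return the theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / 1e9


bandwidth = peak_bandwidth_gb_per_s(
    BUS_WIDTH_BITS, MEMORY_CLOCK_HZ, TRANSFERS_PER_CLOCK
)
print(round(bandwidth, 1))  # peak bandwidth in GB/s, rounded to one decimal
```

With those inputs, 512 bytes per transfer at an effective 3.2 billion transfers per second gives the 1638.4 GB/s stated above.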