Technology Directions for Future AI Inference Systems
AI inference systems are undergoing continuous change at every level: the model, the accelerator, the interconnect, and the overall system including software. Taken together, these changes are significant at the AI inference system level. We focus on technology directions at each level, covering the accelerator, the scale-up/scale-out network, and the rack implementation. We discuss accelerator design choices, particularly opportunities in co-design; interconnect technologies, including the transition at 224G copper signaling, future 448G, and optical interconnects; and rack infrastructure for LLM deployments.
