On Monday, Nvidia announced the HGX H200 Tensor Core GPU, which utilizes the Hopper architecture to accelerate AI applications. It’s a follow-up of the H100 GPU, released last year and previously Nvidia’s most powerful AI GPU chip. If widely deployed, it could lead to far more powerful AI models—and faster response times for existing ones like ChatGPT—in the near future.
According to experts, lack of computing power (often called “compute”) has been a major bottleneck of AI progress this past year, hindering deployments of existing AI models and slowing the development of new ones. Shortages of powerful GPUs that accelerate AI models are largely to blame. One way to alleviate the compute bottleneck is to make more chips, but you can also make AI chips more powerful. That second approach may make the H200 an attractive product for cloud providers.
What’s the H200 good for? Despite the “G” in the “GPU” name, data center GPUs like this typically aren’t for graphics. GPUs are ideal for AI applications because they perform vast numbers of parallel matrix multiplications, which are necessary for neural networks to function. They are essential in the training portion of building an AI model and the “inference” portion, where people feed inputs into an AI model and it returns results.
Read 7 remaining paragraphs | Comments