NVIDIA H200 Tensor Core GPU

Supercharge Generative AI and HPC Workloads

The NVIDIA H200 Tensor Core GPU supercharges generative AI and high-performance computing (HPC) workloads with game-changing performance and memory capabilities.

1.9X Faster Llama2 70B (High-Performance LLM Inference)
110X Faster HPC (Supercharge High-Performance Computing)
2X Faster Inference (Reduce Energy and TCO)
141GB GPU Memory (HBM3e Memory Technology)

Key Highlights

Discover the revolutionary capabilities that make H200 the ultimate choice for generative AI and HPC workloads.

High-Performance LLM Inference

In the ever-evolving landscape of AI, businesses rely on LLMs to address a diverse range of inference needs. An AI inference accelerator must deliver the highest throughput at the lowest TCO when deployed at scale for a massive user base. The H200 boosts inference speed by up to 2X compared to H100 GPUs when handling LLMs like Llama2.
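To see why memory plays such a large role in LLM inference cost, note that autoregressive decoding must stream every model weight from memory for each generated token, so a bandwidth-bound lower bound on per-token latency is simply weight bytes divided by memory bandwidth. A minimal sketch (the 4.8TB/s figure comes from the spec table below; batch size 1 and the FP16/FP8 byte counts are illustrative assumptions, and real serving overlaps batches and adds KV-cache traffic):

```python
# Rough, bandwidth-bound lower bound on per-token decode latency:
# every weight is read once per generated token, so
# time_per_token >= weight_bytes / memory_bandwidth.
# Assumes batch size 1; ignores KV-cache reads and compute time.

H200_BANDWIDTH_TBPS = 4.8  # TB/s, from the H200 spec table

def decode_floor_ms(params_billions: float, bytes_per_param: float,
                    bandwidth_tbps: float = H200_BANDWIDTH_TBPS) -> float:
    """Minimum milliseconds per generated token if purely bandwidth-bound."""
    weight_tb = params_billions * 1e9 * bytes_per_param / 1e12
    return weight_tb / bandwidth_tbps * 1e3

# Llama2 70B with FP16 weights (2 bytes/param) vs. FP8 (1 byte/param):
fp16_ms = decode_floor_ms(70, 2.0)  # ~29 ms/token floor
fp8_ms = decode_floor_ms(70, 1.0)   # ~15 ms/token floor
print(f"FP16 floor: {fp16_ms:.1f} ms/token, FP8 floor: {fp8_ms:.1f} ms/token")
```

The same arithmetic explains why a bandwidth increase translates so directly into higher decode throughput, and why serving many users concurrently amortizes that weight traffic across the batch.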

Supercharge High-Performance Computing

Memory bandwidth is crucial for HPC applications as it enables faster data transfer, reducing complex processing bottlenecks. For memory-intensive HPC applications like simulations, scientific research, and artificial intelligence, the H200's higher memory bandwidth ensures that data can be accessed and manipulated efficiently, leading to up to 110X faster time to results compared to CPUs.
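For a bandwidth-bound kernel (a STREAM-triad-style update, for instance), time to solution scales inversely with sustained memory bandwidth. A quick sketch of that bandwidth component alone, using the H200's 4.8TB/s against an assumed ~300GB/s for a modern dual-socket server CPU (an illustrative figure, not a measurement; the 110X headline reflects full application-level comparisons, not just bandwidth):

```python
# Time to stream a fixed working set for a bandwidth-bound kernel,
# e.g. a STREAM-triad-style update a[i] = b[i] + s * c[i].
# The CPU bandwidth is an assumed, illustrative figure; the H200
# figure is from the spec table.

def stream_time_s(bytes_moved: float, bandwidth_gbps: float) -> float:
    """Seconds to move `bytes_moved` at a sustained bandwidth in GB/s."""
    return bytes_moved / (bandwidth_gbps * 1e9)

working_set = 100e9  # 100 GB touched per solver iteration (example)
h200_s = stream_time_s(working_set, 4800)  # 4.8 TB/s
cpu_s = stream_time_s(working_set, 300)    # assumed server-CPU bandwidth
print(f"H200: {h200_s*1e3:.1f} ms, CPU: {cpu_s*1e3:.1f} ms, "
      f"ratio: {cpu_s/h200_s:.0f}x")
```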

Reduce Energy and TCO

With the introduction of the H200, energy efficiency and TCO reach new levels. This cutting-edge technology offers unparalleled performance within the same power profile as the H100. AI factories and supercomputing systems that are not only faster but also more energy-efficient deliver an economic edge that propels the AI and scientific communities forward.
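The energy claim follows directly from the arithmetic: energy per inference is power divided by throughput, so doubling throughput within the same power envelope halves the energy per token. A sketch (the 700W TDP comes from the spec table; the throughput figures are hypothetical, purely to show the scaling):

```python
# Energy per inference at a fixed power envelope:
#   energy_per_token (J) = power (W) / throughput (tokens/s).
# Doubling throughput at the same power halves energy per token.
# Throughput figures are hypothetical, for illustration only.

def joules_per_token(power_w: float, tokens_per_s: float) -> float:
    return power_w / tokens_per_s

POWER_W = 700.0  # same TDP envelope for H100 and H200 SXM
h100_j = joules_per_token(POWER_W, 1000.0)  # assumed baseline throughput
h200_j = joules_per_token(POWER_W, 2000.0)  # 2X throughput, same power
print(f"H100: {h100_j:.2f} J/token, H200: {h200_j:.2f} J/token")
```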

HBM3e Memory Technology

As the first GPU with HBM3e, the H200's larger and faster memory fuels the acceleration of generative AI and large language models (LLMs) while advancing scientific computing for HPC workloads.
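One way to make the 141GB figure concrete is a back-of-envelope capacity check: do a model's weights plus its KV cache fit on a single GPU? The sketch below uses the published Llama 2 70B architecture (80 layers, grouped-query attention with 8 KV heads, head dimension 128) as its assumptions; real deployments also need headroom for activations and framework overhead:

```python
# Does a model plus its KV cache fit in 141 GB? A back-of-envelope check.
# Model configuration is the published Llama 2 70B architecture
# (80 layers, GQA with 8 KV heads, head dim 128); treat it as an
# assumption for this sketch.

def kv_cache_gb(layers, kv_heads, head_dim, seq_len, batch,
                bytes_per_val=2):
    # Factor of 2 for keys and values; FP16 cache by default.
    return (2 * layers * kv_heads * head_dim * seq_len * batch
            * bytes_per_val / 1e9)

weights_fp16_gb = 70e9 * 2 / 1e9  # ~140 GB: barely fits, no cache headroom
weights_fp8_gb = 70e9 * 1 / 1e9   # ~70 GB
cache_gb = kv_cache_gb(layers=80, kv_heads=8, head_dim=128,
                       seq_len=4096, batch=16)
fits = weights_fp8_gb + cache_gb <= 141
print(f"FP8 weights {weights_fp8_gb:.0f} GB + KV cache {cache_gb:.1f} GB, "
      f"fits in 141 GB: {fits}")
```

The same check on an 80GB-class GPU fails for this batch and sequence length, which is the practical payoff of the larger HBM3e capacity.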

Technical Specifications

Comprehensive technical specifications for the NVIDIA H200 Tensor Core GPU.

Technical Detail | H200 SXM¹ | H200 NVL¹
FP64 | 34 TFLOPS | 30 TFLOPS
FP64 Tensor Core | 67 TFLOPS | 60 TFLOPS
FP32 | 67 TFLOPS | 60 TFLOPS
TF32 Tensor Core² | 989 TFLOPS | 835 TFLOPS
BFLOAT16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS
FP16 Tensor Core² | 1,979 TFLOPS | 1,671 TFLOPS
FP8 Tensor Core² | 3,958 TFLOPS | 3,341 TFLOPS
INT8 Tensor Core² | 3,958 TOPS | 3,341 TOPS
GPU Memory | 141GB | 141GB
GPU Memory Bandwidth | 4.8TB/s | 4.8TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Confidential Computing | Supported | Supported
Max Thermal Design Power (TDP) | Up to 700W (configurable) | Up to 600W (configurable)
Multi-Instance GPUs | Up to 7 MIGs @ 18GB each | Up to 7 MIGs @ 16.5GB each
Form Factor | SXM | PCIe, dual-slot, air-cooled
Interconnect | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | 2- or 4-way NVIDIA NVLink bridge: 900GB/s per GPU; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX™ H200 partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs | NVIDIA MGX™ H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs
NVIDIA AI Enterprise | Add-on | Included

¹ Preliminary specifications; may be subject to change. ² With sparsity.

Experience Next-Level Performance

Experience the power of next-generation AI computing with NVIDIA H200. Contact our experts to learn how this revolutionary GPU can accelerate your AI and HPC initiatives.