Extraordinary Performance, Scalability, and Security for Every Data Center
The NVIDIA H100 Tensor Core GPU delivers exceptional performance, scalability, and security for every workload. H100 uses breakthrough innovations based on the NVIDIA Hopper™ architecture to deliver industry-leading conversational AI, speeding up large language models (LLMs) by 30X.
Key Highlights
Discover the revolutionary capabilities that make H100 the ultimate choice for AI training, inference, and HPC workloads.
Transformational AI Training
H100 features fourth-generation Tensor Cores and a Transformer Engine with FP8 precision that provides up to 4X faster training over the prior generation for GPT-3 (175B) models. The combination of fourth-generation NVLink, NDR Quantum-2 InfiniBand networking, PCIe Gen5, and NVIDIA Magnum IO™ software delivers efficient scalability from small enterprise systems to massive, unified GPU clusters.
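FP8 boosts throughput by halving the bits per value relative to FP16, at the cost of precision. As a purely illustrative sketch of what rounding to the FP8 E4M3 format does to a number (this is hypothetical helper code, not NVIDIA's Transformer Engine API):

```python
import math

E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def quantize_e4m3(x: float) -> float:
    """Round a float to the nearest FP8 E4M3 value (illustrative sketch)."""
    if x == 0.0:
        return 0.0
    x = max(-E4M3_MAX, min(E4M3_MAX, x))   # saturate to the E4M3 range
    exp = math.floor(math.log2(abs(x)))
    exp = max(-6, min(8, exp))             # E4M3 normal exponent range
    step = 2.0 ** (exp - 3)                # 3 mantissa bits -> 8 steps per binade
    return round(x / step) * step

# e.g. 300.0 lands on 288.0, the nearest of the 32-wide steps in [256, 512)
```

In practice the Transformer Engine chooses per-tensor scaling factors so that values stay inside this narrow range during training.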
Real-Time Deep Learning Inference
AI solves a wide array of business challenges, using an equally wide array of neural networks. A great AI inference accelerator has to deliver not only the highest performance but also the versatility to accelerate these networks. H100 extends NVIDIA's market leadership in inference with several advancements that accelerate inference by up to 30X and deliver the lowest latency.
Exascale High-Performance Computing
The NVIDIA data center platform consistently delivers performance gains beyond Moore's law. And H100's new breakthrough AI capabilities further amplify the power of HPC+AI to accelerate time to discovery for scientists and researchers working on solving the world's most important challenges. H100 triples the floating-point operations per second (FLOPS) of double-precision Tensor Cores, delivering 60 teraflops of FP64 computing for HPC.
Accelerated Data Analytics
Data analytics often consumes the majority of time in AI application development. Since large datasets are scattered across multiple servers, scale-out solutions with commodity CPU-only servers get bogged down by a lack of scalable computing performance. Accelerated servers with H100 deliver the compute power—along with 3 terabytes per second (TB/s) of memory bandwidth per GPU and scalability with NVLink and NVSwitch™—to tackle data analytics with high performance and scale to support massive datasets.
Enterprise-Ready Utilization
IT managers seek to maximize utilization (both peak and average) of compute resources in the data center. They often employ dynamic reconfiguration of compute to right-size resources for the workloads in use. H100 with MIG lets infrastructure managers standardize their GPU-accelerated infrastructure while having the flexibility to provision GPU resources with greater granularity.
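As an illustration of that granularity, MIG partitions are typically administered with `nvidia-smi`. A sketch of the workflow, assuming GPU index 0 (available profile names vary by GPU model and driver version):

```shell
# Enable MIG mode on GPU 0 (requires a GPU reset)
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this GPU supports
nvidia-smi mig -lgip

# Create two GPU instances (with default compute instances) from a profile
sudo nvidia-smi mig -cgi 1g.10gb,1g.10gb -C

# Verify the instances that now exist
nvidia-smi mig -lgi
```

Each instance then appears as an isolated GPU with its own memory and compute slice.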
Built-In Confidential Computing
Traditional Confidential Computing solutions are CPU-based, which is too limited for compute-intensive workloads such as AI at scale. NVIDIA Confidential Computing is a built-in security feature of the NVIDIA Hopper architecture that makes H100 the world's first accelerator with these capabilities.
Exceptional Performance for Large-Scale AI and HPC
The Hopper Tensor Core GPU will power the NVIDIA Grace Hopper CPU+GPU architecture, purpose-built for terabyte-scale accelerated computing and providing 10X higher performance on large-model AI and HPC. The NVIDIA Grace CPU leverages the flexibility of the Arm® architecture to create a CPU and server architecture designed from the ground up for accelerated computing.
Technical Specifications
Comprehensive technical specifications for the NVIDIA H100 Tensor Core GPU.
| Technical Detail | H100 SXM | H100 NVL |
|---|---|---|
| FP64 | 34 teraFLOPS | 30 teraFLOPS |
| FP64 Tensor Core | 67 teraFLOPS | 60 teraFLOPS |
| FP32 | 67 teraFLOPS | 60 teraFLOPS |
| TF32 Tensor Core* | 989 teraFLOPS | 835 teraFLOPS |
| BFLOAT16 Tensor Core* | 1,979 teraFLOPS | 1,671 teraFLOPS |
| FP16 Tensor Core* | 1,979 teraFLOPS | 1,671 teraFLOPS |
| FP8 Tensor Core* | 3,958 teraFLOPS | 3,341 teraFLOPS |
| INT8 Tensor Core* | 3,958 TOPS | 3,341 TOPS |
| GPU Memory | 80GB | 94GB |
| GPU Memory Bandwidth | 3.35TB/s | 3.9TB/s |
| Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG |
| Max Thermal Design Power (TDP) | Up to 700W (configurable) | 350-400W (configurable) |
| Multi-Instance GPUs | Up to 7 MIGs @ 10GB each | Up to 7 MIGs @ 12GB each |
| Form Factor | SXM | PCIe dual-slot air-cooled |
| Interconnect | NVIDIA NVLink™: 900GB/s; PCIe Gen5: 128GB/s | NVIDIA NVLink: 600GB/s; PCIe Gen5: 128GB/s |
| Server Options | NVIDIA HGX H100 Partner and NVIDIA-Certified Systems™ with 4 or 8 GPUs NVIDIA DGX H100 with 8 GPUs | Partner and NVIDIA-Certified Systems with 1–8 GPUs |
| NVIDIA AI Enterprise | Add-on | Included |
* With sparsity
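The asterisked figures assume 2:4 structured sparsity, in which two of every four consecutive weights are zero, letting Tensor Cores skip the zeros and double effective throughput. A minimal illustrative sketch of the pruning pattern (this is hypothetical example code, not NVIDIA's implementation):

```python
def prune_2_to_4(weights):
    """Zero the 2 smallest-magnitude values in each group of 4 (2:4 sparsity)."""
    out = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # indices of the two largest-magnitude values in this group
        keep = sorted(range(len(group)), key=lambda j: abs(group[j]),
                      reverse=True)[:2]
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

# e.g. [1.0, -3.0, 0.5, 2.0] -> [0.0, -3.0, 0.0, 2.0]
```

Networks are typically fine-tuned after pruning so that accuracy is recovered while the 2:4 pattern is preserved.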
Extraordinary Performance for Every Data Center
Experience the power of industry-leading AI computing with NVIDIA H100. Contact our experts to learn how this revolutionary GPU can accelerate your AI and HPC initiatives.