NVIDIA H100 NVL
Unprecedented Performance, Scalability, and Security for Every Data Center
H100 NVL is designed to scale Large Language Model support in mainstream PCIe-based server systems. With increased raw performance, larger and faster HBM3 memory, and NVLink connectivity via bridges, mainstream systems configured with eight H100 NVL GPUs outperform HGX A100 systems by up to 12x on GPT3-175B LLM inference throughput.
H100 NVL enables standard mainstream servers to deliver high-performance generative AI inference on large language models, while giving partners and solution providers the fastest time to market and easy scale-out.
Performance Highlights
FP64                      | 68 TFLOPS
FP64 Tensor Core          | 134 TFLOPS
FP32                      | 134 TFLOPS
TF32 Tensor Core          | 1,979 TFLOPS (with sparsity)
BFLOAT16 Tensor Core      | 3,958 TFLOPS (with sparsity)
FP16 Tensor Core          | 3,958 TFLOPS (with sparsity)
FP8 Tensor Core           | 7,916 TFLOPS (with sparsity)
INT8 Tensor Core          | 7,916 TOPS (with sparsity)
GPU Memory                | 188 GB HBM3
GPU Memory Bandwidth      | 3,938 GB/sec
Maximum Power Consumption | 2x 350-400 W (Configurable)
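The Tensor Core figures above are quoted with 2:4 structured sparsity, which doubles effective throughput; the corresponding dense rates are therefore half the listed values. A minimal sketch of that relationship (the dictionary and variable names here are illustrative, not from the datasheet):

```python
# Sparse Tensor Core rates from the table above (TFLOPS, TOPS for INT8).
sparse_rates = {"TF32": 1979, "BFLOAT16": 3958, "FP16": 3958, "FP8": 7916, "INT8": 7916}

# 2:4 structured sparsity doubles effective throughput,
# so the dense rate for each format is half the quoted sparse rate.
dense_rates = {fmt: rate / 2 for fmt, rate in sparse_rates.items()}

for fmt, rate in dense_rates.items():
    print(f"{fmt}: {rate:g} dense")
```

For example, the dense FP8 rate works out to 3,958 TFLOPS, matching the sparse FP16/BFLOAT16 figure one precision level up.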
Warranty
Free dedicated phone and email technical support (1-800-230-0130)
Dedicated Field Application Engineers for NVIDIA professional products
Resources
Contact gopny@pny.com for additional information.