AI Inference and Mainstream Compute for Every Enterprise
NVIDIA A30 Tensor Core GPU is the most versatile mainstream compute GPU for AI inference and mainstream enterprise workloads. Powered by NVIDIA Ampere architecture Tensor Core technology, it supports a broad range of math precisions, providing a single accelerator to speed up every workload.
Built for AI Inference at Scale
The same compute resource can rapidly retrain AI models with TF32, as well as accelerate high-performance computing (HPC) applications using FP64 Tensor Cores. Multi-Instance GPU (MIG) and FP64 Tensor Cores combine with a fast 933 gigabytes per second (GB/s) of memory bandwidth in a low 165 W power envelope, all on a PCIe card well suited to mainstream servers.
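As a sketch of what MIG partitioning looks like in practice (the device index and the `1g.6gb` profile name are assumptions for illustration; run `nvidia-smi mig -lgip` to see the profiles your driver actually reports), an A30 can be split into up to four isolated instances:

```shell
# Enable MIG mode on GPU 0 (requires a GPU reset; commands assume root).
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this driver/GPU combination supports.
nvidia-smi mig -lgip

# Create four GPU instances with their default compute instances.
# Profile name assumed for illustration; A30 exposes up to four MIG slices.
sudo nvidia-smi mig -cgi 1g.6gb,1g.6gb,1g.6gb,1g.6gb -C
```

Each resulting instance appears as a separate device with its own memory and compute resources, which is what provides the quality-of-service isolation described below.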
Quality of Service Across Diverse Workloads
The combination of third-generation Tensor Cores and MIG delivers secure quality of service across diverse workloads, all powered by a versatile GPU enabling an elastic data center. A30’s versatile compute capabilities across big and small workloads deliver maximum value for mainstream enterprises.
Part of NVIDIA’s Data Center Solution
A30 is part of the complete NVIDIA data center solution that incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from NGC (NVIDIA GPU Cloud). Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to deliver real-world results and deploy solutions into production at scale.
PCIe Gen 4
The NVIDIA A30 supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
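To make “double the bandwidth” concrete, here is a quick back-of-the-envelope calculation of theoretical per-direction bandwidth for an x16 link. Both Gen 3 and Gen 4 use 128b/130b encoding; Gen 4 doubles the per-lane signaling rate from 8 GT/s to 16 GT/s.

```python
# Theoretical per-direction bandwidth of a PCIe x16 link.
# Gen 3 and Gen 4 both use 128b/130b encoding; Gen 4 doubles the signaling rate.

def pcie_x16_bandwidth_gb_s(transfer_rate_gt_s: float) -> float:
    """Usable bandwidth in GB/s for 16 lanes after 128b/130b encoding overhead."""
    lanes = 16
    encoding_efficiency = 128 / 130
    bits_per_byte = 8
    return transfer_rate_gt_s * encoding_efficiency * lanes / bits_per_byte

gen3 = pcie_x16_bandwidth_gb_s(8.0)   # Gen 3: 8 GT/s per lane
gen4 = pcie_x16_bandwidth_gb_s(16.0)  # Gen 4: 16 GT/s per lane
print(f"Gen 3 x16: {gen3:.2f} GB/s, Gen 4 x16: {gen4:.2f} GB/s")  # ~15.75 vs ~31.51
```

The factor-of-two improvement follows directly from the doubled transfer rate, since the encoding overhead is unchanged between the two generations.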
High-Speed HBM2 Memory
With 24 gigabytes (GB) of high-bandwidth memory (HBM2), the NVIDIA A30 PCIe delivers 933 GB/s of raw memory bandwidth, along with high dynamic random-access memory (DRAM) utilization efficiency of 95 percent.
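For readers who want to see where the 933 GB/s figure comes from, the arithmetic below assumes the publicly listed A30 memory configuration: a 3072-bit HBM2 interface clocked at 1215 MHz, double data rate, so 2430 MT/s effective per pin (both numbers are assumptions drawn from published specs, not stated in this document).

```python
# Peak HBM2 bandwidth = (bus width in bytes) x (effective data rate per pin).
# Assumed A30 configuration: 3072-bit interface, 2430 MT/s effective.

def hbm_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB/s = 1e9 bytes/s)."""
    return bus_width_bits / 8 * data_rate_mt_s * 1e6 / 1e9

print(f"{hbm_bandwidth_gb_s(3072, 2430):.0f} GB/s")  # ~933 GB/s
```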
Error Correction Without a Performance or Capacity Hit
HBM2 memory implements error correction without any performance (bandwidth) or capacity hit, unlike competing technologies like GDDR6 or GDDR6X.
Compute Preemption
Preemption at the instruction level provides finer-grained control over compute tasks, preventing longer-running applications from monopolizing system resources or timing out.
Virtual GPU Software for Virtualization
NVIDIA AI Enterprise for VMware and support for NVIDIA Virtual Compute Server (vCS) accelerate virtualized compute workloads such as high-performance computing (HPC), AI, data science, and big-data analytics.
Software Optimized for AI
Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times and higher multi-node training performance. GPU accelerated libraries such as cuDNN, cuBLAS, and TensorRT deliver higher performance for both deep learning inference and High-Performance Computing (HPC) applications.
NVIDIA CUDA Parallel Computing Platform
Natively execute standard programming languages like C/C++ and Fortran, and APIs such as OpenCL, OpenACC, and DirectCompute, to accelerate techniques such as ray tracing, video and image processing, and computational fluid dynamics.
Unified Memory
A single, seamless 49-bit virtual address space allows for the transparent migration of data between the full allocation of CPU and GPU memory.
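As a quick illustration of how large a 49-bit virtual address space actually is:

```python
# A 49-bit virtual address space covers 2**49 bytes,
# shared seamlessly across CPU and GPU allocations.
address_bits = 49
space_bytes = 2 ** address_bits
space_tib = space_bytes // 2 ** 40  # convert bytes to tebibytes
print(f"{space_tib} TiB")  # 512 TiB
```

That 512 TiB of addressable space is far larger than the combined physical memory of any mainstream server plus the GPU’s 24 GB, which is what allows data to migrate transparently between the two without the programmer managing separate address ranges.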