NVIDIA® A2

SKU: TCSA2M-PB

Description

Unprecedented Acceleration for the World’s Highest-Performing Elastic Data Centers

The NVIDIA A2 Tensor Core GPU provides entry-level inference with low power, a small footprint, and high performance for intelligent video analytics (IVA) or NVIDIA AI at the edge. Featuring a low-profile PCIe Gen4 card and a low 40–60 watt (W) configurable thermal design power (TDP) capability, the A2 brings versatile inference acceleration to any server.

A2’s versatility, compact size, and low power exceed the demands for edge deployments at scale, instantly upgrading existing entry-level CPU servers to handle inference. Servers accelerated with A2 GPUs deliver up to 20x higher inference performance than CPUs and 1.3x more efficient IVA deployments than previous GPU generations, all at an entry-level price point.

NVIDIA-Certified systems with the NVIDIA A2, A30, and A100 Tensor Core GPUs and NVIDIA AI—including the NVIDIA Triton Inference Server, open source inference service software—deliver breakthrough inference performance across edge, data center, and cloud. They ensure that AI-enabled applications deploy with fewer servers and less power, resulting in easier deployments and faster insights with dramatically lower costs.

Highlights

GPU Architecture	NVIDIA Ampere
CUDA Cores	1280
Tensor Cores	40 | Gen 3
RT Cores	10 | Gen 2
Peak FP32	4.5 TFLOPS
Peak TF32 Tensor Core	9 TFLOPS | 18 TFLOPS with Sparsity
Peak FP16 Tensor Core	18 TFLOPS | 36 TFLOPS with Sparsity
Peak INT8 Tensor Core	36 TOPS | 72 TOPS with Sparsity
Peak INT4 Tensor Core	72 TOPS | 144 TOPS with Sparsity
GPU Memory	16 GB GDDR6 ECC
Memory Bandwidth	200 GB/s
Thermal Solution	Passive
Maximum Power Consumption	40–60 W | Configurable
System Interface	PCIe Gen 4.0 x8
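
The Tensor Core figures in the table follow a simple pattern: each step down in precision doubles peak throughput, and sparsity doubles it again. The short Python sketch below checks that pattern and derives a rough throughput-per-watt number. The values are copied from the table; the helper names are hypothetical, and dividing by the 60 W upper limit is an illustrative assumption, since the table does not state the power setting at which the peak figures apply.

```python
# Peak Tensor Core figures copied from the Highlights table: (dense, with sparsity).
# TF32 and FP16 are in TFLOPS; INT8 and INT4 are in TOPS.
PEAK = {
    "TF32": (9, 18),
    "FP16": (18, 36),
    "INT8": (36, 72),
    "INT4": (72, 144),
}


def sparsity_speedup(precision):
    """Ratio of the sparse peak to the dense peak for one precision."""
    dense, sparse = PEAK[precision]
    return sparse / dense


def peak_per_watt(precision, watts, sparse=False):
    """Peak throughput divided by a chosen power limit (illustrative only)."""
    dense, with_sparsity = PEAK[precision]
    return (with_sparsity if sparse else dense) / watts


if __name__ == "__main__":
    for name in PEAK:
        print(name, "sparsity speedup:", sparsity_speedup(name))
    # Assumption: the peak figure is evaluated against the 60 W upper limit.
    print("INT8 dense TOPS per watt at 60 W:", peak_per_watt("INT8", 60))
```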

Third-Generation NVIDIA Tensor Cores

The third-generation Tensor Cores in NVIDIA A2 support integer math down to INT4 and floating-point math up to FP32 to deliver high AI training and inference performance. A2’s NVIDIA Ampere architecture also supports TF32 and NVIDIA’s automatic mixed precision (AMP) capabilities.
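
To give a feel for what TF32 trades away, the sketch below emulates TF32 precision in plain Python. It relies on the format as described in NVIDIA's Ampere architecture documentation (the same 8-bit exponent as FP32, with a 10-bit mantissa instead of 23 bits), which this page does not spell out. The function name `to_tf32` is hypothetical, and simple truncation is used for clarity; it is an assumption and not a statement about how the hardware rounds.

```python
import struct


def to_tf32(x: float) -> float:
    """Truncate a value to TF32-like precision (8-bit exponent, 10-bit mantissa)."""
    # Reinterpret the value as the 32 bits of an IEEE 754 single.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # FP32 has 23 mantissa bits; clearing the low 13 leaves the top 10.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-11, 3.14159265):
        print(value, "->", to_tf32(value))
```

Increments smaller than 2^-10 relative to the leading bit are dropped, which is why TF32 is usually paired with FP32 accumulation.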

Second-Generation RT Cores

The NVIDIA A2 GPU includes dedicated RT Cores for ray tracing and Tensor Cores for AI to power groundbreaking results at breakthrough speed. It delivers up to 2x the throughput over the previous generation and the ability to concurrently run ray tracing with either shading or denoising capabilities.

Structural Sparsity

Modern AI networks are big and getting bigger, with millions to billions of parameters. Not all of these parameters are needed for accurate predictions and inference. A2 provides up to 2x higher compute performance for sparse models compared to previous-generation GPUs. This feature readily benefits AI inference and can be used to improve the performance of model training.
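
As a rough illustration of the idea, the sketch below prunes a row of weights to the 2:4 pattern that NVIDIA documents for Ampere-generation sparse Tensor Cores: in every group of four consecutive values, two are kept and two are set to zero. The 2:4 detail comes from NVIDIA's Ampere documentation and not from this page, the function name `prune_2_of_4` is hypothetical, and keeping the two largest magnitudes is one common pruning heuristic, not necessarily the one NVIDIA's tools apply.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes within this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


if __name__ == "__main__":
    row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
    print(prune_2_of_4(row))
    # prints [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Because exactly half of each group is zero, the hardware can skip those multiplications, which is where the doubled "with Sparsity" peak figures come from.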

Hardened Root of Trust for Secure Deployments

Providing security in edge deployments and endpoints is critical for enterprise business operations. The NVIDIA A2 GPU delivers secure boot through trusted code authentication and hardened rollback protections against malware attacks, preventing operational losses and ensuring workload acceleration.

Superior Hardware Transcoding Performance

Real-time performance is critical in intelligent video analytics (IVA) at the edge, requiring the latest in hardware encode and decode capabilities. NVIDIA A2 GPUs use dedicated hardware to fully accelerate video decoding and encoding for the most popular codecs, including H.265, H.264, and VP9, as well as AV1 decode for real-time video processing.
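
One common way to reach the GPU's video hardware is through FFmpeg. The sketch below only assembles a command line and does not run it. It assumes an FFmpeg build with NVIDIA hardware acceleration enabled, which this page does not cover; the helper name `build_transcode_cmd` and the file names are hypothetical.

```python
def build_transcode_cmd(src, dst, codec="hevc"):
    """Return an FFmpeg argument list for GPU-accelerated decode and encode."""
    # FFmpeg's NVENC encoder names for the two codecs handled here.
    encoders = {"h264": "h264_nvenc", "hevc": "hevc_nvenc"}
    if codec not in encoders:
        raise ValueError(f"unsupported codec: {codec}")
    return [
        "ffmpeg",
        "-hwaccel", "cuda",                # decode on the GPU
        "-hwaccel_output_format", "cuda",  # keep decoded frames in GPU memory
        "-i", src,
        "-c:v", encoders[codec],           # encode on the GPU
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_transcode_cmd("camera_feed.mp4", "camera_feed_hevc.mp4")))
```

The resulting list can be passed to `subprocess.run` on a system where such an FFmpeg build and an NVIDIA driver are installed.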

Warranty

3-Year Limited Warranty

Resources

Product Brochure
Product Brief

Links

Resource Center
NVIDIA GPU Accelerated Applications Catalog

Contact pnypro@pny.eu for additional information.