




  • Description


    Unprecedented Acceleration for the World’s Highest-Performing Elastic Data Centers

    The NVIDIA A2 Tensor Core GPU provides entry-level inference with low power, a small footprint, and high performance for intelligent video analytics (IVA) or NVIDIA AI at the edge. Featuring a low-profile PCIe Gen4 card and a low 40–60 watt (W) configurable thermal design power (TDP) capability, the A2 brings versatile inference acceleration to any server.

    A2’s versatility, compact size, and low power exceed the demands of edge deployments at scale, instantly upgrading existing entry-level CPU servers to handle inference. Servers accelerated with A2 GPUs deliver up to 20x higher inference performance than CPUs and 1.3x more efficient IVA deployments than previous GPU generations, all at an entry-level price point.

    NVIDIA-Certified Systems with the NVIDIA A2, A30, and A100 Tensor Core GPUs and NVIDIA AI (including the NVIDIA Triton Inference Server, open-source inference serving software) deliver breakthrough inference performance across edge, data center, and cloud. They ensure that AI-enabled applications deploy with fewer servers and less power, resulting in easier deployments and faster insights at dramatically lower cost.



    GPU Architecture: NVIDIA Ampere
    CUDA Cores: 1280
    Tensor Cores: 40 | Gen 3
    RT Cores: 10 | Gen 2
    Peak FP32: 4.5 TFLOPS
    Peak TF32 Tensor Core: 9 TFLOPS | 18 TFLOPS with sparsity
    Peak FP16 Tensor Core: 18 TFLOPS | 36 TFLOPS with sparsity
    Peak INT8 Tensor Core: 36 TOPS | 72 TOPS with sparsity
    Peak INT4 Tensor Core: 72 TOPS | 144 TOPS with sparsity
    GPU Memory: 16 GB GDDR6 with ECC
    Memory Bandwidth: 200 GB/s
    Thermal Solution: Passive
    Maximum Power Consumption: 40–60 W | Configurable
    System Interface: PCIe Gen 4.0 x8
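
The figures in the specification list are related by simple arithmetic, and a short Python sketch can make those relationships explicit. It assumes one fused multiply-add (2 FLOPs) per CUDA core per clock, and PCIe Gen4 signalling of 16 GT/s per lane with 128b/130b encoding. Neither assumption is stated on this page, and the resulting clock is an implied value rather than a quoted specification. The function names are illustrative only.

```python
# Figures copied from the specification list above (dense, then with sparsity).
CUDA_CORES = 1280
DENSE = {"FP32": 4.5, "TF32": 9.0, "FP16": 18.0, "INT8": 36.0, "INT4": 72.0}
SPARSE = {"TF32": 18.0, "FP16": 36.0, "INT8": 72.0, "INT4": 144.0}
MEMORY_BANDWIDTH_GBS = 200.0

# Assumption (not from this page): each CUDA core retires one FMA = 2 FLOPs per clock.
FLOPS_PER_CORE_PER_CLOCK = 2


def implied_boost_clock_ghz(cores: int, peak_fp32_tflops: float) -> float:
    """Clock rate (GHz) that would yield the quoted peak FP32 figure."""
    return peak_fp32_tflops * 1e12 / (cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9


def pcie_gb_per_s(gt_per_s: float = 16.0, lanes: int = 8) -> float:
    """Approximate one-direction PCIe bandwidth in GB/s (128b/130b encoding assumed)."""
    return gt_per_s * lanes * (128 / 130) / 8


# Ratio of sparse to dense throughput for every precision that lists both.
sparsity_ratio = {name: SPARSE[name] / DENSE[name] for name in SPARSE}
clock_ghz = implied_boost_clock_ghz(CUDA_CORES, DENSE["FP32"])
pcie = pcie_gb_per_s()

print(f"implied clock: {clock_ghz:.2f} GHz")
print(f"sparsity ratios: {sparsity_ratio}")
print(f"PCIe Gen4 x8: {pcie:.2f} GB/s vs. {MEMORY_BANDWIDTH_GBS:.0f} GB/s on-board memory")
```

Under these assumptions, every listed sparsity figure is exactly twice its dense counterpart, and the host link is more than an order of magnitude slower than on-board memory. This is one reason inference workloads generally benefit from keeping model weights resident on the card.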

    Third-Generation NVIDIA Tensor Cores

    • The third-generation Tensor Cores in NVIDIA A2 support integer math down to INT4 and floating-point math up to FP32 to deliver high AI training and inference performance. A2’s NVIDIA Ampere architecture also supports TF32 and NVIDIA’s automatic mixed precision (AMP) capabilities.
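
To give a feel for what TF32 trades away, the following minimal sketch emulates TF32 precision in plain Python. It relies on the published TF32 layout (1 sign bit, 8 exponent bits, 10 mantissa bits), which is not described on this page. It also truncates rather than rounds, a simplification that may differ from what the hardware does. The helper name `to_tf32` is hypothetical.

```python
import struct


def to_tf32(x: float) -> float:
    """Truncate a value to TF32 precision: FP32's exponent range, 10 mantissa bits.

    Simplification: low mantissa bits are dropped (round toward zero);
    real hardware may round differently.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of FP32's 23 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 1 + 2**-10 is representable with 10 mantissa bits; 1 + 2**-11 is not.
print(to_tf32(1.0 + 2**-10) == 1.0 + 2**-10)
print(to_tf32(1.0 + 2**-11) == 1.0)

# The relative error of truncation stays below 2**-10 for normal values.
value = 3.14159
print(abs(to_tf32(value) - value) / value < 2**-10)
```

The exponent field is untouched, so TF32 keeps FP32's dynamic range while carrying roughly three decimal digits of precision. Automatic mixed precision relies on a similar trade: reduced-precision math where it is tolerable, FP32 where it is not.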

    Second-Generation RT Cores

    • The NVIDIA A2 GPU includes dedicated RT Cores for ray tracing and Tensor Cores for AI. The RT Cores deliver up to 2x the throughput of the previous generation and can run ray tracing concurrently with either shading or denoising.

    Structural Sparsity

    • Modern AI networks are big and getting bigger, with millions to billions of parameters. Not all of these parameters are needed for accurate predictions and inference. A2 provides up to 2x higher compute performance for sparse models compared to previous-generation GPUs. This feature readily benefits AI inference and can be used to improve the performance of model training.
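
As an illustration of what "structural" sparsity means, the sketch below applies the 2:4 pattern documented for the NVIDIA Ampere architecture, in which at most two of every four consecutive weights are nonzero. The pattern itself is not spelled out on this page, and the magnitude-based pruning rule and the function name `prune_2_of_4` are assumptions made for the example, not NVIDIA's tooling.

```python
def prune_2_of_4(weights: list[float]) -> list[float]:
    """Zero the two smallest-magnitude values in every group of four weights."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned: list[float] = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes; the stable sort keeps earlier ties.
        keep = sorted(range(4), key=lambda j: -abs(group[j]))[:2]
        pruned.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.1]
print(prune_2_of_4(row))  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because exactly half of each group is zero in a fixed pattern, hardware can skip those multiplications, which is consistent with the "up to 2x" figure quoted above. In practice a pruned model is usually fine-tuned afterwards to recover accuracy.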

    Hardened Root of Trust for Secure Deployments

    • Providing security in edge deployments and endpoints is critical for enterprise business operations. The NVIDIA A2 GPU delivers secure boot through trusted code authentication and hardened rollback protections against malware attacks, preventing operational losses and ensuring workload acceleration.

    Superior Hardware Transcoding Performance

    • Real-time performance is critical in intelligent video analytics (IVA) at the edge, requiring the latest in hardware encode and decode capabilities. NVIDIA A2 GPUs use dedicated hardware to fully accelerate video decoding and encoding for the most popular codecs, including H.265, H.264, and VP9, as well as AV1 decode for real-time video processing.


    3-Year Limited Warranty

    Dedicated NVIDIA professional products Field Application Engineers

    Contact for additional information.
