PNY Technologies Inc.



  • Description

    NVIDIA A30

    Versatile Compute Acceleration for Mainstream Enterprise Servers

Bring accelerated performance to every enterprise workload with NVIDIA A30 Tensor Core GPUs. With NVIDIA Ampere architecture Tensor Cores and Multi-Instance GPU (MIG), it delivers speedups securely across diverse workloads, including AI inference at scale and high-performance computing (HPC) applications. By combining fast memory bandwidth and low power consumption in a PCIe form factor optimized for mainstream servers, A30 enables an elastic data center and delivers maximum value for enterprises.

    The NVIDIA A30 Tensor Core GPU delivers a versatile platform for mainstream enterprise workloads, like AI inference, training, and HPC. With TF32 and FP64 Tensor Core support, as well as an end-to-end software and hardware solution stack, A30 ensures that mainstream AI training and HPC applications can be rapidly addressed. Multi-instance GPU (MIG) ensures quality of service (QoS) with secure, hardware-partitioned, right-sized GPUs across all of these workloads for diverse users, optimally utilizing GPU compute resources.

    NVIDIA A30


    CUDA Cores 3584
    Tensor Cores 224
    Peak FP64 5.2 TFLOPS
    Peak FP64 Tensor Core 10.3 TFLOPS
    Peak FP32 10.3 TFLOPS
    TF32 Tensor Core 82 TFLOPS | 165 TFLOPS*
    BFLOAT16 Tensor Core 165 TFLOPS | 330 TFLOPS*
    Peak FP16 Tensor Core 165 TFLOPS | 330 TFLOPS*
    Peak INT8 Tensor Core 330 TOPS | 661 TOPS*
    GPU Memory 24 GB HBM2
    Memory Bandwidth 933 GB/s
    Thermal Solutions Passive
    Maximum Power Consumption 165 W
    System Interface PCIe Gen 4.0 | 64 GB/s
    Multi-Instance GPU Support Yes
    vGPU Support Yes

    *With sparsity

    Deep Learning Inference

    • A30 leverages groundbreaking features to optimize inference workloads. It accelerates a full range of precisions, from FP64 to TF32 and INT4. Supporting up to four MIGs per GPU, A30 lets multiple networks operate simultaneously in secure hardware partitions with guaranteed quality of service (QoS). Structural sparsity support delivers up to 2x more performance on top of A30’s other inference performance gains. NVIDIA’s market-leading AI performance was demonstrated in MLPerf Inference. Combined with the NVIDIA Triton Inference Server, which easily deploys AI at scale, A30 brings this groundbreaking performance to every enterprise.
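
    The 2:4 structured-sparsity pattern behind that up-to-2x claim can be sketched in plain Python: in every aligned group of four weights, the two smallest-magnitude entries are zeroed, so sparse Tensor Cores can skip half the multiplications. This is only an illustration of the pattern, not NVIDIA's pruning tooling (which lives in libraries such as cuSPARSELt and the framework integrations).

    ```python
    def prune_2_4(weights):
        """Zero the two smallest-magnitude values in each aligned group of
        four, yielding the 2:4 pattern that Ampere sparse Tensor Cores use."""
        pruned = list(weights)
        for i in range(0, len(pruned) - len(pruned) % 4, 4):
            group = pruned[i:i + 4]
            # Indices of the two smallest-magnitude entries in this group.
            drop = sorted(range(4), key=lambda j: abs(group[j]))[:2]
            for j in drop:
                pruned[i + j] = 0.0
        return pruned

    print(prune_2_4([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.0, 0.8]))
    # -> [0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, 0.8]
    ```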

    Enterprise-Ready Utilization

    • A30 with MIG maximizes the utilization of GPU accelerated infrastructure. With MIG, an A30 GPU can be partitioned into as many as four independent instances, giving multiple users access to GPU acceleration. MIG works with Kubernetes, containers, and hypervisor-based server virtualization. MIG lets infrastructure managers offer a right-sized GPU with guaranteed QoS for every job, extending the reach of accelerated computing resources to every user.
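
    As simple arithmetic (not an API call), the three partition layouts in the specification table all tile the card's 24 GB exactly:

    ```python
    # Illustrative arithmetic only: supported MIG memory splits on a 24 GB A30,
    # matching the partition sizes listed under Specifications.
    TOTAL_MEMORY_GB = 24
    mig_profiles = {4: 6, 2: 12, 1: 24}  # instance count -> GB per instance

    for instances, gb_each in mig_profiles.items():
        assert instances * gb_each == TOTAL_MEMORY_GB
        print(f"{instances} x {gb_each} GB = {TOTAL_MEMORY_GB} GB")
    ```

    In practice, partitioning is driven from `nvidia-smi mig` or the NVML API; the snippet only checks the sizing.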

    High-Performance Data Analytics

    • Data scientists need to be able to analyze, visualize, and turn massive datasets into insights. But scale-out solutions are often bogged down by datasets scattered across multiple servers. Accelerated servers with A30 provide the needed compute power – along with large HBM2 memory, 933 GB/s of memory bandwidth, and scalability with NVLink – to tackle these workloads. Combined with NVIDIA InfiniBand, NVIDIA Magnum IO, and the RAPIDS suite of open-source libraries, including the RAPIDS Accelerator for Apache Spark, the NVIDIA data center platform accelerates these huge workloads at unprecedented levels of performance and efficiency.

    vGPU Software Support

    • NVIDIA Virtual PC (vPC)
    • NVIDIA Virtual Applications (vApps)
    • NVIDIA RTX Virtual Workstation (vWS)
    • NVIDIA Virtual Compute Server (vCS)
    • vGPU Profiles from 1 GB to 24 GB


    3-Year Limited Warranty

    Free dedicated phone and email technical support

    Dedicated NVIDIA professional products Field Application Engineers

    Contact for additional information.

  • Features

    NVIDIA A30


    AI Inference and Mainstream Compute for Every Enterprise

    NVIDIA A30 Tensor Core GPU is the most versatile mainstream compute GPU for AI inference and mainstream enterprise workloads. Powered by NVIDIA Ampere architecture Tensor Core technology, it supports a broad range of math precisions, providing a single accelerator to speed up every workload.

    Built for AI Inference at Scale

    The same compute resource can rapidly re-train AI models with TF32, as well as accelerate high-performance computing (HPC) applications using FP64 Tensor Cores. Multi-Instance GPU (MIG) and FP64 Tensor Cores combine with fast memory bandwidth of 933 gigabytes per second (GB/s) in a low 165 W power envelope, all running on a PCIe card optimal for mainstream servers.
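
    TF32 keeps FP32's 8-bit exponent (so the numeric range is unchanged) but only 10 explicit mantissa bits. A minimal sketch of that precision trade-off, using simple truncation of the IEEE-754 binary32 encoding (the hardware rounds rather than truncates; this is for illustration only):

    ```python
    import struct

    def tf32_truncate(x: float) -> float:
        """Clear the low 13 of FP32's 23 mantissa bits, leaving the 10
        mantissa bits that TF32 retains; the exponent is untouched."""
        bits = struct.unpack("<I", struct.pack("<f", x))[0]
        bits &= ~((1 << 13) - 1)
        return struct.unpack("<f", struct.pack("<I", bits))[0]

    print(tf32_truncate(1.0))  # exactly representable: 1.0
    print(abs(tf32_truncate(1 / 3) - 1 / 3) < 2 ** -10)  # True: error stays tiny
    ```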

    Quality of Service Across Diverse Workloads

    The combination of third-generation Tensor Cores and MIG delivers secure quality of service across diverse workloads, all powered by a versatile GPU enabling an elastic data center. A30’s versatile compute capabilities across big and small workloads deliver maximum value for mainstream enterprises.

    Part of NVIDIA’s Data Center Solution

    A30 is part of the complete NVIDIA data center solution that incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from NGC (NVIDIA GPU Cloud). Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to deliver real-world results and deploy solutions into production at scale.

    PCIe Gen 4

    The NVIDIA A30 supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
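
    The 64 GB/s in the specification table is the bidirectional total for a x16 Gen 4 link. A rough derivation from the per-lane signaling rate and 128b/130b line coding:

    ```python
    LANES = 16
    SIGNALING_GT_S = 16   # PCIe Gen 4: 16 GT/s per lane (Gen 3 is 8 GT/s)
    ENCODING = 128 / 130  # 128b/130b line-code efficiency

    per_direction = LANES * SIGNALING_GT_S * ENCODING / 8  # GB/s, one direction
    print(round(per_direction, 1))      # ~31.5 GB/s each way
    print(round(2 * per_direction, 1))  # ~63.0 GB/s, marketed as 64 GB/s
    ```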

    High Speed HBM2 Memory

    With 24 gigabytes (GB) of high-bandwidth memory (HBM2), the NVIDIA A30 PCIe delivers improved raw bandwidth of 933 GB/s, as well as higher dynamic random access memory (DRAM) utilization efficiency at 95 percent.
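
    The 933 GB/s figure follows from interface width times data rate. The bus width and per-pin transfer rate below are commonly published for A30 but are not stated in this sheet, so treat them as assumptions:

    ```python
    BUS_WIDTH_BITS = 3072  # assumed: three 1024-bit HBM2 stacks
    DATA_RATE_GT_S = 2.43  # assumed effective per-pin transfer rate

    bandwidth_gb_s = BUS_WIDTH_BITS / 8 * DATA_RATE_GT_S
    print(round(bandwidth_gb_s))  # ~933 GB/s
    ```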

    Error Correction Without a Performance or Capacity Hit

    HBM2 memory implements error correction without any performance (bandwidth) or capacity hit, unlike competing technologies like GDDR6 or GDDR6X.

    Compute Preemption

    Preemption at the instruction level provides finer-grained control over compute tasks, preventing longer-running applications from either monopolizing system resources or timing out.


    Third Generation NVLink

    Connect two NVIDIA A30 PCIe boards with NVLink to double the effective memory footprint and scale application performance by enabling GPU-to-GPU data transfers at rates up to 200 GB/s of bidirectional bandwidth. NVLink bridges are available for motherboards with standard or wide slot spacing.


    Virtual GPU Software for Virtualization

    NVIDIA AI Enterprise for VMware and support for NVIDIA Virtual Compute Server (vCS) accelerate virtualized compute workloads such as AI, data science, big-data analytics, and high-performance computing (HPC) applications.

    Software Optimized for AI

    Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times and higher multi-node training performance. GPU accelerated libraries such as cuDNN, cuBLAS, and TensorRT deliver higher performance for both deep learning inference and High-Performance Computing (HPC) applications.

    NVIDIA CUDA Parallel Computing Platform

    Natively execute standard programming languages like C/C++ and Fortran, and APIs such as OpenCL, OpenACC, and DirectCompute, to accelerate techniques such as ray tracing, video and image processing, and computational fluid dynamics.

    Unified Memory

    A single, seamless 49-bit virtual address space allows for the transparent migration of data between the full allocation of CPU and GPU memory.
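
    As a quick sanity check, a 49-bit virtual address space spans 2^49 bytes, i.e. 512 TiB; far more than the combined CPU and GPU memory of a mainstream server:

    ```python
    VA_BITS = 49
    addressable = 2 ** VA_BITS            # bytes
    print(addressable // 2 ** 40, "TiB")  # 512 TiB
    ```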

  • Specifications

    NVIDIA A30


    Compatible with all systems that accept an NVIDIA A30

    CUDA Cores 3584
    Tensor Cores 224
    Peak FP64 5.2 TFLOPS
    Peak FP64 Tensor Core 10.3 TFLOPS
    Peak FP32 10.3 TFLOPS
    TF32 Tensor Core 82 TFLOPS | 165 TFLOPS*
    BFLOAT16 Tensor Core 165 TFLOPS | 330 TFLOPS*
    Peak FP16 Tensor Core 165 TFLOPS | 330 TFLOPS*
    Peak INT8 Tensor Core 330 TOPS | 661 TOPS*
    GPU Memory 24 GB HBM2
    Memory Bandwidth 933 GB/s
    Media Engines 1 Optical Flow Accelerator (OFA)
    1 JPEG Decoder (NVJPEG)
    4 Video Decoders (NVDEC)
    Thermal Solutions Passive
    Maximum Power Consumption 165 W
    System Interface PCIe Gen 4.0 | 64 GB/s
    NVLink Third-Generation | 200 GB/s Bidirectional
    Form Factor 2-Slot, Full Height, Full Length (FHFL)
    Multi-Instance GPU Support 4 MIGs at 6 GB Each
    2 MIGs at 12 GB Each
    1 MIG at 24 GB
    vGPU Support NVIDIA AI Enterprise for VMWare
    NVIDIA Virtual Compute Server

    *With sparsity


    • RTXA6000NVLINK-KIT provides an NVLink connector for A30 suitable for standard PCIe slot spacing motherboards. Application support is required. All NVIDIA Ampere architecture-based PCIe boards (Data Center or Professional Graphics) utilize the same NVLink bridges.


    • Windows Server 2012 R2
    • Windows Server 2016 1607, 1709
    • Windows Server 2019
    • RedHat CoreOS 4.7
    • Red Hat Enterprise Linux 8.1-8.3
    • Red Hat Enterprise Linux 7.7-7.9
    • Red Hat Linux 6.6+
    • SUSE Linux Enterprise Server 15 SP2
    • SUSE Linux Enterprise Server 12 SP3+
    • Ubuntu 14.04 LTS / 16.04 LTS / 18.04 LTS / 20.04 LTS


    • NVIDIA A30 Data Center Tensor Core PCIe Board
    • Auxiliary power cable