NVIDIA A16
PERFORMANCE AND USEABILITY FEATURES
CUDA Cores
The NVIDIA® Ampere architecture's CUDA cores deliver up to 2.5x the single-precision floating-point (FP32) throughput of the previous generation, providing significant performance improvements for graphics workflows such as 3D model development and for compute workloads such as desktop simulation for computer-aided engineering (CAE).
2nd Generation RT Cores
Incorporating second generation ray tracing engines, the NVIDIA Ampere GPU architecture provides incredible ray traced rendering performance. NVIDIA A16 can render complex professional models with physically accurate shadows, reflections, and refractions to empower users with instant insight. Working in concert with applications leveraging APIs such as NVIDIA OptiX, Microsoft DXR and Vulkan ray tracing, servers based on NVIDIA A16 will power truly interactive design workflows to provide immediate feedback for unprecedented levels of productivity.
3rd Generation Tensor Cores
Purpose-built for the deep learning matrix arithmetic at the heart of neural network training and inferencing, the enhanced Tensor Cores in the NVIDIA A16 accelerate more data types (TF32 and BF16) and add a new Fine-Grained Structured Sparsity feature that delivers up to 2x throughput for tensor matrix operations compared to the previous generation.
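The sparsity feature is easier to picture with a small example. NVIDIA's public descriptions of the Ampere architecture specify a 2:4 pattern, in which at most two of every four consecutive weights are non-zero; this datasheet does not spell that out, so treat the pattern as an assumption here. The sketch below uses plain Python and a hypothetical helper named prune_2_of_4 to show magnitude-based pruning of one weight row. It illustrates the pattern only and is not how NVIDIA's tooling is implemented.

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in each group of four.

    Hypothetical helper for illustration; assumes the 2:4 pattern
    described publicly for Ampere structured sparsity.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Rank positions by descending magnitude; ties go to the earlier index.
        keep = sorted(range(4), key=lambda i: (-abs(group[i]), i))[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.0]
print(prune_2_of_4(weights))
# prints [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Because half of the values in every group are known to be zero, the hardware can skip them, which is where the "up to 2x" figure comes from. In practice a network is typically fine-tuned after pruning to recover accuracy.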
PCIe Gen 4
The NVIDIA A16 supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
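The "double the bandwidth" figure follows from the per-lane signaling rates. The short calculation below is a sketch, assuming a x16 link, the 8 GT/s (Gen 3) and 16 GT/s (Gen 4) transfer rates, and the 128b/130b line encoding used by both generations. The function name is hypothetical, and the result is a theoretical per-direction peak that ignores protocol overhead.

```python
def pcie_bandwidth_gb_per_s(transfer_rate_gt_per_s, lanes=16):
    """Theoretical per-direction bandwidth in GB/s (assumed x16 link)."""
    # 128b/130b encoding carries 128 payload bits in every 130 bits sent.
    encoding_efficiency = 128 / 130
    bits_per_s = transfer_rate_gt_per_s * 1e9 * encoding_efficiency * lanes
    return bits_per_s / 8 / 1e9


gen3 = pcie_bandwidth_gb_per_s(8.0)
gen4 = pcie_bandwidth_gb_per_s(16.0)
print(f"Gen 3 x16: {gen3:.2f} GB/s")   # prints Gen 3 x16: 15.75 GB/s
print(f"Gen 4 x16: {gen4:.2f} GB/s")   # prints Gen 4 x16: 31.51 GB/s
print(f"Ratio: {gen4 / gen3:.1f}x")    # prints Ratio: 2.0x
```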
Higher Speed GDDR6 Memory
Built with 64 GB of GDDR6 memory in total (4x 16 GB), the NVIDIA A16 delivers higher memory bandwidth for ray tracing, rendering, and AI workloads than the previous generation. It provides the graphics memory footprint to address the real-world datasets and models in latency-sensitive professional applications.
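The 4x 16 GB layout matters when the memory is divided among users. The sketch below estimates how many equal-sized frame buffers fit on one board, assuming each user receives a fixed slice that cannot span two of the four GPUs. The slice sizes are illustrative rather than a list of supported configurations, and users_per_board is a hypothetical name.

```python
GPUS_PER_BOARD = 4        # from the "4x 16 GB" layout stated above
MEMORY_PER_GPU_GB = 16


def users_per_board(frame_buffer_gb):
    """Count equal-sized frame buffers that fit on one board.

    Assumes a slice cannot span GPUs (illustrative assumption).
    """
    if not 0 < frame_buffer_gb <= MEMORY_PER_GPU_GB:
        raise ValueError("frame buffer must be between 1 and 16 GB")
    return GPUS_PER_BOARD * (MEMORY_PER_GPU_GB // frame_buffer_gb)


for size_gb in (1, 2, 4, 8, 16):
    print(f"{size_gb:>2} GB per user -> {users_per_board(size_gb)} users")
```

A slice size that does not divide 16 evenly leaves memory unused on each GPU; for example, 6 GB slices would fit only two per GPU under this assumption.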
5th Generation NVDEC Engine
NVDEC is well suited to real-time decoding in transcoding and video playback applications. The following video codecs are supported for hardware-accelerated decoding: MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1.
SOFTWARE SUPPORT
Virtual GPU Software for Virtualization
Support for NVIDIA virtual GPU (vGPU) software enables A16 to be virtualized to accelerate high-end design, AI, and compute workloads. The NVIDIA RTX Virtual Workstation (vWS) license provides access to the world's most powerful virtual workstations to enable flexible, work-from-anywhere solutions, while the NVIDIA Virtual Compute Server (vCS) license accelerates virtualized compute workloads such as high-performance computing, AI and data science.
Software Optimized for AI
Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times and higher multi-node training performance. GPU-accelerated libraries such as cuDNN, cuBLAS, and TensorRT deliver higher performance for both deep learning inference and high-performance computing (HPC) applications.
NVIDIA CUDA Parallel Computing Platform
Natively execute standard programming languages like C/C++ and Fortran, and APIs such as OpenCL, OpenACC, and DirectCompute, to accelerate techniques such as ray tracing, video and image processing, and computational fluid dynamics.