PNY EU


NVIDIA® L4

  • SKU: NVL4TCGPU-KIT


  Description

    NVIDIA L4

    Breakthrough Universal Accelerator for Efficient Video, AI, and Graphics

    With NVIDIA’s AI platform and full-stack approach, L4 is optimized for video and AI inference at scale across a broad range of applications, including recommendations, voice-based AI avatar assistants, generative AI, visual search, and contact center automation, to deliver the best personalized experiences. As the most efficient NVIDIA accelerator for mainstream servers, systems equipped with L4 deliver up to 120X higher AI video performance than CPU solutions, while providing 2.7X more generative AI performance and over 4X more graphics performance than the previous GPU generation. NVIDIA L4’s versatility and energy-efficient, single-slot, low-profile form factor make it ideal for global deployments, including edge locations.

    As AI and video become more pervasive, the demand for efficient, cost-effective computing is increasing more than ever. NVIDIA L4 GPUs deliver up to 99% better energy efficiency and lower total cost of ownership compared to traditional CPU-based infrastructure. This enables enterprises to reduce rack space and significantly lower their overall carbon footprint while making their data centers capable of scaling to many more users. The energy saved by switching from CPUs to NVIDIA L4 GPUs in a 2-megawatt (MW) data center can power over 2,000 homes for a year, or equals the carbon offset of 172,000 trees grown over 10 years*.


    Highlights

    FP32: 30.3 TFLOPS
    TF32 Tensor Core: 120 TFLOPS (with sparsity)
    FP16 Tensor Core: 242 TFLOPS (with sparsity)
    BFLOAT16 Tensor Core: 242 TFLOPS (with sparsity)
    FP8 Tensor Core: 485 TFLOPS (with sparsity)
    INT8 Tensor Core: 485 TOPS (with sparsity)
    GPU Memory: 24 GB
    GPU Memory Bandwidth: 300 GB/s
    Display Connectors: None (vGPU only)
    NVENC | NVDEC | JPEG Decoders: 2 | 4 | 4 (AV1 encode and decode)
    Form Factor: Single-slot, low-profile
    System Interconnect: PCIe Gen4 x16, 64 GB/s
    Thermal Solution: Passive
    Maximum Power Consumption: 72 W
    Server Options: Partner and NVIDIA-Certified Systems with 1-8 GPUs
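
    The figures above can be sanity-checked on a deployed system. The following minimal sketch (not part of the original datasheet) queries the board’s memory size, power limit, and PCIe link through nvidia-smi from Python; it assumes the NVIDIA driver and the nvidia-smi utility are installed.

    # Minimal sketch: confirm key L4 figures (memory, power limit, PCIe link)
    # on a running system. Assumes nvidia-smi is on the PATH.
    import subprocess

    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,memory.total,power.limit,pcie.link.gen.max,pcie.link.width.max",
         "--format=csv"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)  # expect roughly: NVIDIA L4, 23034 MiB, 72.00 W, 4, 16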

    NVIDIA L4 Ada Lovelace Architecture Features

    Fourth-Generation Tensor Cores

    • The new Ada Lovelace architecture Tensor Cores are designed to accelerate transformative AI technologies like intelligent chatbots, generative AI, natural language processing (NLP), computer vision, and NVIDIA Deep Learning Super Sampling 3.0 (DLSS 3). Ada Lovelace Tensor Cores unleash structured sparsity and 8-bit floating point (FP8) precision for up to 4X higher inference performance over the previous generation.¹ FP8 reduces memory pressure compared to larger precisions and dramatically accelerates AI throughput.
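
    As an illustration of the reduced-precision paths described above, here is a minimal sketch (not from the datasheet) that enables TF32 and runs FP16 autocast inference in PyTorch on an L4. The ResNet-50 model and input shape are placeholders; production FP8 inference would typically go through TensorRT or NVIDIA Transformer Engine rather than plain PyTorch.

    # Minimal sketch: route FP32 matmuls to TF32 Tensor Cores and run
    # FP16 autocast inference. Model and batch size are placeholders.
    import torch
    import torchvision.models as models

    torch.backends.cuda.matmul.allow_tf32 = True  # use TF32 Tensor Cores for FP32 matmuls
    torch.backends.cudnn.allow_tf32 = True

    model = models.resnet50(weights=None).eval().cuda()
    x = torch.randn(8, 3, 224, 224, device="cuda")

    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        out = model(x)  # FP16 Tensor Core inference
    print(out.shape)    # torch.Size([8, 1000])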

    Third-Generation RT Cores

    • NVIDIA made real-time ray tracing a reality with the invention of RT Cores, processing cores on the GPU specifically designed to tackle performance-intensive ray-tracing workloads. Ada Lovelace’s third-generation RT Cores deliver twice the ray-triangle intersection throughput, increasing RT-TFLOP performance by over 2X. NVIDIA Shader Execution Reordering (SER) improves ray-tracing performance by over 3X, enabling deeply immersive experiences for virtual worlds and unprecedented productivity for AI-based neural graphics and cloud gaming.

    Advanced Video and Vision AI Acceleration

    • With an optimized AV1 stack, NVIDIA L4 takes video and vision AI acceleration to the next level, opening up a broad array of new possibilities for use cases like real-time video transcoding, streaming, video conferencing, augmented reality (AR), virtual reality (VR), and vision AI. With four video decoders and two video encoders, combined with the AV1 video format, L4 servers can host over 1,000² concurrent video streams and deliver over 120X higher end-to-end AI video pipeline performance than CPU solutions.³ On top of this, four JPEG decoders further speed up applications that need computer-vision horsepower. A transcoding sketch follows the footnotes below.

    1. L4’s FP8 compared to T4’s FP16.
    2. 8x L4 AV1 low-latency P1 preset encode at 720p30.
    3. 8x L4 vs. 2S Intel 8362 CPU server performance comparison: end-to-end video pipeline with CV-CUDA pre- and post-processing, decode, inference (SegFormer), encode, TensorRT 8.6, vs. a CPU-only pipeline using OpenCV.
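
    To give a concrete feel for the NVENC/NVDEC path referenced in footnote 2, here is a minimal sketch (not from the datasheet) that drives an AV1 hardware transcode from Python via FFmpeg. It assumes an FFmpeg build with CUDA/NVDEC/NVENC support; the file names and bitrate are placeholders.

    # Minimal sketch: decode on NVDEC, keep frames on the GPU, and encode
    # with the AV1 NVENC engine using FFmpeg's low-latency P1 preset.
    import subprocess

    subprocess.run([
        "ffmpeg",
        "-hwaccel", "cuda",                # decode on NVDEC
        "-hwaccel_output_format", "cuda",  # keep decoded frames in GPU memory
        "-i", "input.mp4",                 # placeholder input file
        "-c:v", "av1_nvenc",               # AV1 hardware encoder
        "-preset", "p1",                   # low-latency P1 preset (as in footnote 2)
        "-b:v", "2M",                      # placeholder target bitrate
        "-c:a", "copy",
        "output.mp4",                      # placeholder output file
    ], check=True)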

    Warranty

    Free dedicated phone and email technical support
    (1-800-230-0130)

    Dedicated Field Application Engineers for NVIDIA professional products

    Contact pnypro@pny.eu for additional information.
