
    Description


    Next-Level Acceleration Has Arrived

    The artificial intelligence revolution surges forward, igniting opportunities for businesses to reimagine how they solve their customers’ challenges. We’re racing toward a future where every customer interaction, every product, every service offering will be touched and improved by AI. And making that future a reality requires a computing platform that can accelerate the full diversity of modern AI, enabling businesses to re-envision how they meet—and exceed—customer demands and cost-effectively scale their AI-based products and services.

    The NVIDIA T4 GPU is among the world’s most powerful universal inference accelerators. Powered by NVIDIA Turing Tensor Cores, T4 provides revolutionary multi-precision inference performance to accelerate the diverse applications of modern AI. T4 is a part of the NVIDIA AI inference platform that supports all AI frameworks and provides comprehensive tooling and integrations to drastically simplify the development and deployment of advanced AI.


    GPU Architecture: NVIDIA Turing
    Turing Tensor Cores: 320
    NVIDIA CUDA Cores: 2,560
    Peak FP32: 8.1 TFLOPS
    Mixed Precision (FP16/FP32): 65 TFLOPS
    INT8: 130 TOPS
    INT4: 260 TOPS
    GPU Memory: 16 GB GDDR6
    Memory Bandwidth: 300 GB/s
    Thermal Solution: Passive
    Maximum Power Consumption: 70 W
    System Interface: PCIe Gen 3.0 x16
    Compute APIs: CUDA, NVIDIA TensorRT, ONNX
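
    The peak-throughput figures in the table above imply the theoretical speedups of reduced precision over FP32. The following back-of-the-envelope calculation uses only the numbers listed; real speedups depend on the model, batch size, and memory bandwidth:

```python
# Peak arithmetic throughput of the NVIDIA T4 by precision, taken
# directly from the specification table above. TFLOPS and TOPS are
# both "tera-operations per second" for the purpose of this ratio.
PEAK_OPS = {
    "FP32": 8.1,    # TFLOPS
    "FP16": 65.0,   # TFLOPS (Tensor Core mixed precision)
    "INT8": 130.0,  # TOPS
    "INT4": 260.0,  # TOPS
}

def speedup_vs_fp32(precision: str) -> float:
    """Theoretical peak speedup of a given precision relative to FP32."""
    return PEAK_OPS[precision] / PEAK_OPS["FP32"]

for p in ("FP16", "INT8", "INT4"):
    print(f"{p}: {speedup_vs_fp32(p):.1f}x peak vs FP32")
```

    Running this prints roughly 8x for FP16, 16x for INT8, and 32x for INT4, which is why precision selection (covered below) matters so much for inference throughput.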

    Turing Tensor Cores: The Heart of Universal Inference Acceleration

    AI is evolving rapidly. In the past few years alone, a Cambrian explosion of neural network types has seen the emergence of convolutional neural networks (CNNs), recurrent neural networks (RNNs), generative adversarial networks (GANs), reinforcement learning (RL), and hybrid network architectures. Accelerating these diverse models requires both high performance and programmability.

    NVIDIA T4 introduces the revolutionary Turing Tensor Core technology with multi-precision computing for AI inference. Powering breakthrough performance from FP32 to FP16 to INT8, as well as INT4 and binary precisions, T4 delivers dramatically higher performance than CPUs.

    Developers can unleash the power of Turing Tensor Cores directly through NVIDIA TensorRT, software libraries, and integrations with all AI frameworks. These tools let developers target the optimal precision for different AI applications, achieving dramatic performance gains without compromising accuracy.

    State-of-the-art Inference in Real-Time

    Responsiveness is key to user engagement for services such as conversational AI, recommender systems, and visual search. As models increase in accuracy and complexity, delivering the right answer right now requires exponentially larger compute capability.

    NVIDIA T4 features Multi-Process Service (MPS) with hardware-accelerated work distribution. MPS reduces request-processing latency and lets multiple independent requests be processed simultaneously, resulting in higher throughput and more efficient GPU utilization.
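
    MPS ships with the CUDA toolkit and is enabled on the host rather than in application code. A minimal sketch of turning it on for a single shared GPU on Linux follows; the directory paths are illustrative assumptions, not required locations:

```shell
# Enable CUDA Multi-Process Service (MPS) on a Linux host.
export CUDA_VISIBLE_DEVICES=0                      # GPU to share
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps     # control pipes (any writable dir)
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log  # daemon logs (any writable dir)
nvidia-cuda-mps-control -d                         # start the MPS control daemon

# Any CUDA processes launched now share the GPU through a single
# scheduling context, so independent inference requests overlap on
# the hardware instead of serializing.

# Shut the daemon down when finished:
echo quit | nvidia-cuda-mps-control
```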

    Twice the Video Decode Performance

    Video continues on its explosive growth trajectory, comprising over two-thirds of all Internet traffic. Accurate video interpretation through AI is driving the most relevant content recommendations, finding the impact of brand placements in sports events, and delivering perception capabilities to autonomous vehicles, among other usages.

    NVIDIA T4 delivers breakthrough performance for AI video applications, with dedicated hardware transcoding engines that bring twice the decoding performance of prior-generation GPUs. T4 can decode up to 38 full-HD video streams, making it easy to integrate scalable deep learning into the video pipeline to deliver innovative, smart video services. It features performance and efficiency modes to enable either fast encoding or the lowest bit-rate encoding without losing video quality.
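
    The dedicated decode engines are commonly driven through FFmpeg's hardware-acceleration path. A hedged example, assuming an FFmpeg build with CUDA/NVDEC support and a placeholder input file:

```shell
# Decode H.264 on the T4's dedicated decode hardware (NVDEC),
# leaving the CUDA cores and Tensor Cores free for inference work.
# Requires an FFmpeg build with NVDEC support; input.mp4 is a
# placeholder filename.
ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
       -i input.mp4 \
       -f null -    # discard output: exercises pure decode throughput
```

    Decoded frames stay in GPU memory (`-hwaccel_output_format cuda`), so an inference pipeline can consume them without a round trip through host memory.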

    Industry’s Most Comprehensive AI Inference Platform

    AI has crossed the chasm and is rapidly moving from early adoption by pioneers to broader use across industries and large-scale production deployments. Powered by the flexible NVIDIA CUDA development environment and a mature ecosystem with over 1M developers, NVIDIA AI Platform has been evolving for over a decade to offer comprehensive tooling and integrations to simplify the development and deployment of AI.

    NVIDIA TensorRT enables optimization of trained models to efficiently run inference on GPUs. NVIDIA Triton Inference Server and Kubernetes on NVIDIA GPUs streamline the deployment and scaling of AI-powered applications on GPU-accelerated infrastructure for inference. Libraries like cuDNN, cuSPARSE, CUTLASS, and DeepStream accelerate key neural network functions and use cases, like video transcoding. And workflow integrations with all AI frameworks, freely available as NVIDIA GPU Cloud containers, enable developers to transparently harness the innovations in GPU computing for end-to-end AI workflows, from training neural networks to running inference in production applications.


    • 3-Year Limited Warranty

    • Dedicated Field Application Engineers for NVIDIA professional products