NVIDIA A40 Data Center GPU 48GB
The NVIDIA A40 GPU is an evolutionary leap in performance and multi-workload capabilities from the data center, combining best-in-class professional graphics with powerful compute and AI acceleration to meet today’s design, creative, and scientific challenges. Driving the next generation of virtual workstations and server-based workloads, NVIDIA A40 brings state-of-the-art features for ray-traced rendering, simulation, virtual production, and more to professionals anytime, anywhere.
Specifications
GPU Features | NVIDIA A40 |
GPU Memory | 48GB GDDR6 with ECC |
Memory bandwidth | 696 GB/s |
CUDA Cores | 10,752 |
Tensor Cores | 336 |
RT Cores | 84 |
Display Ports | 3x DisplayPort 1.4; supports NVIDIA Mosaic and Quadro® Sync |
Max Power Consumption | 300 W |
Graphics Bus | PCI Express Gen 4 x16 |
Form Factor | 4.4” (H) x 10.5” (L) dual slot |
Thermal | Passive |
NVLink | 2-way low profile (2-slot) |
vGPU Software Support | NVIDIA Virtual PC, NVIDIA Virtual Applications, NVIDIA RTX Virtual Workstation, NVIDIA Virtual Compute Server, NVIDIA AI Enterprise |
vGPU Profiles Supported | See the Virtual GPU Licensing Guide |
Compute APIs | CUDA, DirectCompute, OpenCL™, OpenACC® |
Graphics APIs | DirectX 12.0, Shader Model 5.1, OpenGL 4.6, Vulkan 1.1 |
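As a rough sanity check on the table, the sketch below derives two quantities from the listed figures alone: the minimum time for one full pass over the 48 GB of memory at peak bandwidth, and the ratio of CUDA Cores and Tensor Cores to RT Cores. The dictionary keys and the helper name `min_sweep_seconds` are illustrative and not part of any NVIDIA tool. The calculation assumes that peak bandwidth is sustained for the whole pass, which real workloads rarely achieve, so the result is a lower bound only.

```python
# Figures copied from the specification table above.
A40 = {
    "memory_gb": 48,
    "bandwidth_gb_per_s": 696,
    "cuda_cores": 10_752,
    "tensor_cores": 336,
    "rt_cores": 84,
}


def min_sweep_seconds(data_gb, bandwidth_gb_per_s):
    """Lower bound on the time to read data_gb once at peak bandwidth."""
    return data_gb / bandwidth_gb_per_s


# One full pass over GPU memory, in milliseconds (idealized).
full_sweep_ms = 1000 * min_sweep_seconds(A40["memory_gb"], A40["bandwidth_gb_per_s"])

# Core counts expressed per RT Core; both divisions are exact for these figures.
cuda_per_rt = A40["cuda_cores"] // A40["rt_cores"]
tensor_per_rt = A40["tensor_cores"] // A40["rt_cores"]

print(f"Full memory sweep: at least {full_sweep_ms:.1f} ms")
print(f"CUDA Cores per RT Core: {cuda_per_rt}")
print(f"Tensor Cores per RT Core: {tensor_per_rt}")
```

Any algorithm that must touch all 48 GB on every step cannot run faster than this bound, which makes the figure a useful first estimate for memory-bound workloads.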
Powered by the NVIDIA Ampere Architecture
NVIDIA Ampere Architecture CUDA® Cores
Double-speed processing for single-precision floating point (FP32) operations and improved power efficiency provide significant performance improvements for graphics and simulation workflows, such as complex 3D computer-aided design (CAD) and computer-aided engineering (CAE).
Second-Generation RT Cores
With up to 2X the throughput over the previous generation and the ability to concurrently run ray tracing with either shading or denoising capabilities, second-generation RT Cores deliver massive speedups for workloads like photorealistic rendering of movie content, architectural design evaluations, and virtual prototyping of product designs. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.
Third-Generation Tensor Cores
New Tensor Float 32 (TF32) precision provides up to 5X the training throughput over the previous generation to accelerate AI and data science model training without requiring any code changes. Hardware support for structural sparsity doubles the throughput for inferencing. Tensor Cores also bring AI to graphics with capabilities like DLSS, AI denoising, and enhanced editing for select applications.
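To give a feel for what TF32 precision means numerically, the sketch below emulates it on the CPU. TF32 keeps the 8-bit exponent of FP32 but only 10 mantissa bits, so the emulation clears the low 13 of the 23 FP32 mantissa bits. The function name `to_tf32` is hypothetical, and simple truncation is an assumption made for clarity; the rounding performed by the hardware may differ.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 precision by truncating an FP32 value to 10 mantissa bits.

    Assumption: plain truncation. Hardware rounding behavior may differ.
    """
    # Reinterpret the value as the 32-bit pattern of an IEEE 754 single.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit), exponent (8 bits), and the top 10 mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 2**-10 is the smallest step above 1.0 that survives; 2**-11 is dropped.
print(to_tf32(1.0 + 2**-10) == 1.0 + 2**-10)
print(to_tf32(1.0 + 2**-11) == 1.0)
print(to_tf32(0.1))
```

Because the exponent width is unchanged, the representable range matches FP32 and only the precision of each value is reduced, which is consistent with the claim that existing FP32 training code can benefit without modification.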
48GB of GPU Memory
Ultra-fast GDDR6 memory, scalable up to 96GB with NVLink, gives data scientists, engineers, and creative professionals the large memory necessary to work with massive datasets and workloads like data science and simulation.
Virtualization-Ready
Next-generation improvements with NVIDIA virtual GPU (vGPU) software allow for larger, more powerful virtual workstation instances for remote users, enabling high-end remote design, AI, and compute workloads.
Third-Generation NVIDIA NVLink
Connect two A40 GPUs with NVLink to scale from 48GB of GPU memory to 96GB. Increased GPU-to-GPU interconnect bandwidth lets the pair act as a single, larger memory pool to accelerate graphics and compute workloads and tackle larger datasets. A new, more compact NVLink connector enables functionality in a wider range of servers.
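For capacity planning, the 48GB and 96GB figures can be turned into a simple placement check. This is a minimal sketch under stated assumptions: the function name `placement` and its return strings are hypothetical, the check ignores driver and framework overhead, and it assumes the application can actually use memory on both GPUs of an NVLink pair.

```python
A40_MEMORY_GB = 48    # per-GPU memory, from the specification table
NVLINK_MAX_GPUS = 2   # the A40 supports 2-way NVLink


def placement(required_gb: float) -> str:
    """Classify a workload by the smallest A40 configuration that holds it."""
    if required_gb <= A40_MEMORY_GB:
        return "single A40"
    if required_gb <= A40_MEMORY_GB * NVLINK_MAX_GPUS:
        return "NVLink pair"
    return "exceeds NVLink pair"


for size_gb in (30, 48, 72, 96, 120):
    print(f"{size_gb} GB -> {placement(size_gb)}")
```

In practice some headroom should be left below each threshold, since the usable memory is lower than the nominal capacity.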