NVIDIA A40 vs NVIDIA Tesla V100 PCIe

Comparative analysis of the NVIDIA A40 and NVIDIA Tesla V100 PCIe video cards across all known characteristics in the following categories: Essentials, Technical info, Video outputs and ports, Compatibility, dimensions and requirements, API support, Memory, and Technologies. Benchmark performance analysis covers Geekbench - OpenCL, PassMark - G2D Mark, and PassMark - G3D Mark.

 

Differences

Reasons to consider the NVIDIA A40

  • Video card is newer: launched 3 years 3 months later
  • Around 5% higher core clock speed: 1305 MHz vs 1246 MHz
  • Around 26% higher boost clock speed: 1740 MHz vs 1380 MHz
  • Around 32% higher texture fill rate: 584.6 GTexel/s vs 441.6 GTexel/s
  • 2.1x more pipelines: 10752 vs 5120
  • Newer, smaller manufacturing process: 8 nm vs 12 nm
  • 3x more maximum memory size: 48 GB vs 16 GB
  • Around 3% higher memory clock speed: 1812 MHz (14.5 Gbps effective) vs 1758 MHz
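
The percentages and multipliers in the list above are plain ratios of the two cards' figures. The following is a minimal sketch of that arithmetic, using only the numbers quoted in this comparison; the function names `times_more` and `percent_higher` are hypothetical and chosen for illustration.

```python
def times_more(a: float, b: float) -> float:
    """How many times larger a is than b (for example 2.1 means 2.1x)."""
    return a / b


def percent_higher(a: float, b: float) -> float:
    """By what percentage a exceeds b."""
    return (a / b - 1.0) * 100.0


# Figures as listed for the NVIDIA A40 (first) and Tesla V100 PCIe (second).
core_clock_mhz = (1305, 1246)
boost_clock_mhz = (1740, 1380)
texture_fill_gtexel_s = (584.6, 441.6)
pipelines = (10752, 5120)
memory_gb = (48, 16)
memory_clock_mhz = (1812, 1758)

print(f"Core clock:        {percent_higher(*core_clock_mhz):.1f}% higher")
print(f"Boost clock:       {percent_higher(*boost_clock_mhz):.1f}% higher")
print(f"Texture fill rate: {percent_higher(*texture_fill_gtexel_s):.1f}% higher")
print(f"Pipelines:         {times_more(*pipelines):.1f}x")
print(f"Memory size:       {times_more(*memory_gb):.1f}x")
print(f"Memory clock:      {percent_higher(*memory_clock_mhz):.1f}% higher")
```

A ratio below 2 reads more naturally as a percentage, which is why the clock and fill-rate lines use `percent_higher` while pipelines and memory size use `times_more`.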

Reasons to consider the NVIDIA Tesla V100 PCIe

  • Around 17% lower typical power consumption: 250 Watt vs 300 Watt

Compare benchmarks

GPU 1: NVIDIA A40
GPU 2: NVIDIA Tesla V100 PCIe

Name NVIDIA A40 NVIDIA Tesla V100 PCIe
Geekbench - OpenCL 193429
PassMark - G2D Mark 627
PassMark - G3D Mark 14665

Compare specifications (specs)

NVIDIA A40 NVIDIA Tesla V100 PCIe

Essentials

Architecture Ampere Volta
Code name GA102 GV100
Launch date 5 Oct 2020 21 June 2017
Place in performance rating 55 not rated
Type Desktop

Technical info

Boost clock speed 1740 MHz 1380 MHz
Core clock speed 1305 MHz 1246 MHz
Manufacturing process technology 8 nm 12 nm
Peak Double Precision (FP64) Performance 1169 GFLOPS (1:32)
Peak Half Precision (FP16) Performance 37.42 TFLOPS (1:1)
Peak Single Precision (FP32) Performance 37.42 TFLOPS
Pipelines 10752 5120
Pixel fill rate 194.9 GPixel/s
Texture fill rate 584.6 GTexel/s 441.6 GTexel/s
Thermal Design Power (TDP) 300 Watt 250 Watt
Transistor count 28,300 million 21,100 million
Floating-point performance 14,131 GFLOPS
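
The peak single-precision figures in this section can be reproduced from the pipeline counts and boost clocks listed above. The sketch below assumes each pipeline completes one fused multiply-add (two floating-point operations) per clock, which is the usual convention for theoretical peak numbers and is not stated on this page; the function name `peak_fp32_gflops` is hypothetical.

```python
def peak_fp32_gflops(pipelines: int, boost_clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in GFLOPS.

    Assumption: one fused multiply-add per pipeline per clock,
    counted as 2 floating-point operations.
    """
    flops_per_clock = 2
    return pipelines * flops_per_clock * boost_clock_mhz / 1000.0


a40 = peak_fp32_gflops(10752, 1740)   # NVIDIA A40
v100 = peak_fp32_gflops(5120, 1380)   # NVIDIA Tesla V100 PCIe

print(f"A40:  {a40 / 1000:.2f} TFLOPS")
print(f"V100: {v100:,.0f} GFLOPS")
```

Under that assumption the results match the 37.42 TFLOPS and 14,131 GFLOPS entries, which suggests the former belongs to the A40 and the latter to the Tesla V100 PCIe. These are theoretical ceilings, not measured performance.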

Video outputs and ports

Display Connectors 3x DisplayPort No outputs

Compatibility, dimensions and requirements

Form factor Dual-slot
Interface PCIe 4.0 x16 PCIe 3.0 x16
Length 267 mm (10.5 inches)
Recommended system power (PSU) 700 Watt
Supplementary power connectors 8-pin EPS 2x 8-pin
Width 112 mm (4.4 inches)

API support

DirectX 12.2
OpenCL 3.0
OpenGL 4.6
Shader Model 6.6
Vulkan

Memory

Maximum RAM amount 48 GB 16 GB
Memory bandwidth 695.8 GB/s 900.1 GB/s
Memory bus width 384 bit 4096 bit
Memory clock speed 1812 MHz (14.5 Gbps effective) 1758 MHz
Memory type GDDR6 HBM2
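
The bandwidth figures follow from bus width and per-pin data rate, which explains how the Tesla V100 PCIe reaches higher bandwidth despite a lower listed memory clock. The sketch below rests on two assumptions inferred from the numbers on this page rather than stated by it: the A40's GDDR6 moves 8 bits per pin per listed clock (14.5 Gbps / 1812 MHz), and the V100's listed 1758 MHz is already the effective per-pin rate. The function name and the `transfers_per_clock` parameter are hypothetical.

```python
def memory_bandwidth_gb_s(bus_width_bits: int,
                          memory_clock_mhz: float,
                          transfers_per_clock: int) -> float:
    """Theoretical memory bandwidth in GB/s.

    bandwidth = bus width (bits) * per-pin data rate (Gbps) / 8 bits per byte
    """
    data_rate_gbps = memory_clock_mhz * transfers_per_clock / 1000.0
    return bus_width_bits * data_rate_gbps / 8.0


# Assumption: multiplier of 8 for the A40's GDDR6 (1812 MHz -> ~14.5 Gbps).
a40 = memory_bandwidth_gb_s(384, 1812, transfers_per_clock=8)
# Assumption: the V100's listed clock is treated as the effective rate.
v100 = memory_bandwidth_gb_s(4096, 1758, transfers_per_clock=1)

print(f"A40 (GDDR6, 384 bit):  {a40:.1f} GB/s")
print(f"V100 (HBM2, 4096 bit): {v100:.1f} GB/s")
```

With these inputs the sketch reproduces the listed 695.8 GB/s and 900.1 GB/s, showing that the much wider HBM2 bus outweighs the roughly 8x lower per-pin data rate.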

Technologies

CUDA