NVIDIA A40 vs NVIDIA Tesla V100 PCIe

Comparative analysis of the NVIDIA A40 and NVIDIA Tesla V100 PCIe video cards across all known characteristics in the following categories: Essentials, Technical info, Video outputs and ports, Compatibility, dimensions and requirements, API support, Memory, Technologies. Benchmark performance analysis covers Geekbench - OpenCL, PassMark - G2D Mark, and PassMark - G3D Mark.

Differences

Reasons to consider the NVIDIA A40

Newer video card: launched 3 years 3 months later
Around 5% higher core clock speed: 1305 MHz vs 1246 MHz
Around 26% higher boost clock speed: 1740 MHz vs 1380 MHz
Around 32% higher texture fill rate: 584.6 GTexel/s vs 441.6 GTexel/s
2.1x more pipelines: 10752 vs 5120
A newer manufacturing process, which generally allows higher transistor density and better efficiency: 8 nm vs 12 nm
3x more maximum memory size: 48 GB vs 16 GB
Around 3% higher memory clock speed: 1812 MHz (14.5 Gbps effective) vs 1758 MHz

Launch date	5 Oct 2020 vs 21 June 2017
Core clock speed	1305 MHz vs 1246 MHz
Boost clock speed	1740 MHz vs 1380 MHz
Texture fill rate	584.6 GTexel/s vs 441.6 GTexel/s
Pipelines	10752 vs 5120
Manufacturing process technology	8 nm vs 12 nm
Maximum memory size	48 GB vs 16 GB
Memory clock speed	1812 MHz (14.5 Gbps effective) vs 1758 MHz
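
The ratios in the list above can be recomputed from the raw figures. The following is a minimal sketch in Python, not part of the original comparison; the helper name `percent_higher` and the `specs` dictionary are hypothetical and only hold the values quoted above (A40 first, Tesla V100 PCIe second).

```python
def percent_higher(a: float, b: float) -> float:
    """Return how much higher a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


# Values copied from the comparison above: (NVIDIA A40, NVIDIA Tesla V100 PCIe).
specs = {
    "Core clock speed (MHz)": (1305, 1246),
    "Boost clock speed (MHz)": (1740, 1380),
    "Texture fill rate (GTexel/s)": (584.6, 441.6),
    "Pipelines": (10752, 5120),
    "Maximum memory size (GB)": (48, 16),
    "Memory clock speed (MHz)": (1812, 1758),
}

for name, (a40, v100) in specs.items():
    # Print both the plain ratio and the relative difference in percent.
    print(f"{name}: {a40 / v100:.2f}x, {percent_higher(a40, v100):+.1f}%")
```

A ratio close to 1 (for example the texture fill rate, about 1.32x) reads more naturally as a percentage, while larger ratios such as pipelines (2.1x) or memory size (3x) are clearer as multiples.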

Reasons to consider the NVIDIA Tesla V100 PCIe

Around 17% lower typical power consumption: 250 Watt vs 300 Watt

Thermal Design Power (TDP)	250 Watt vs 300 Watt

Compare benchmarks

GPU 1: NVIDIA A40
GPU 2: NVIDIA Tesla V100 PCIe

Name	NVIDIA A40	NVIDIA Tesla V100 PCIe
Geekbench - OpenCL	193656
PassMark - G2D Mark	627
PassMark - G3D Mark	14665

Compare specifications (specs)

	NVIDIA A40	NVIDIA Tesla V100 PCIe
Essentials
Architecture	Ampere	Volta
Code name	GA102	GV100
Launch date	5 Oct 2020	21 June 2017
Place in performance rating	58	not rated
Type		Desktop
Technical info
Boost clock speed	1740 MHz	1380 MHz
Core clock speed	1305 MHz	1246 MHz
Manufacturing process technology	8 nm	12 nm
Peak Double Precision (FP64) Performance	1169 GFLOPS (1:32)
Peak Half Precision (FP16) Performance	37.42 TFLOPS (1:1)
Peak Single Precision (FP32) Performance	37.42 TFLOPS
Pipelines	10752	5120
Pixel fill rate	194.9 GPixel/s
Texture fill rate	584.6 GTexel/s	441.6 GTexel/s
Thermal Design Power (TDP)	300 Watt	250 Watt
Transistor count	28,300 million	21,100 million
Floating-point performance		14,131 GFLOPS
Video outputs and ports
Display Connectors	3x DisplayPort	No outputs
Compatibility, dimensions and requirements
Form factor	Dual-slot
Interface	PCIe 4.0 x16	PCIe 3.0 x16
Length	267 mm (10.5 inches)
Recommended system power (PSU)	700 Watt
Supplementary power connectors	8-pin EPS	2x 8-pin
Width	112 mm (4.4 inches)
API support
DirectX	12.2
OpenCL	3.0
OpenGL	4.6
Shader Model	6.6
Vulkan
Memory
Maximum RAM amount	48 GB	16 GB
Memory bandwidth	695.8 GB/s	900.1 GB/s
Memory bus width	384 bit	4096 bit
Memory clock speed	1812 MHz (14.5 Gbps effective)	1758 MHz
Memory type	GDDR6	HBM2
Technologies
CUDA
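
Several figures in the specification table are derived from others, and a short Python sketch can show how they relate. This is an illustration under stated assumptions, not the method used by the original page: the function names are hypothetical, peak FP32 throughput is assumed to be one fused multiply-add (2 FLOPs) per pipeline per clock at boost speed, the A40's effective GDDR6 rate is taken as 8 times its 1812 MHz memory clock (about 14.5 Gbps), and the V100's listed 1758 MHz is assumed to already be the effective HBM2 rate per pin.

```python
def peak_fp32_gflops(pipelines: int, boost_mhz: float) -> float:
    """Peak FP32 throughput, assuming 2 FLOPs (one FMA) per pipeline per clock."""
    return 2 * pipelines * boost_mhz / 1000.0


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Memory bandwidth in GB/s from bus width and effective per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# NVIDIA A40: 10752 pipelines at 1740 MHz boost, 384-bit GDDR6.
a40_fp32 = peak_fp32_gflops(10752, 1740)
a40_bw = memory_bandwidth_gbs(384, 1812 * 8 / 1000)  # assumed 8x clock multiplier

# NVIDIA Tesla V100 PCIe: 5120 pipelines at 1380 MHz boost, 4096-bit HBM2.
v100_fp32 = peak_fp32_gflops(5120, 1380)
v100_bw = memory_bandwidth_gbs(4096, 1758 / 1000)  # assumed already effective rate

print(f"A40:  {a40_fp32 / 1000:.2f} TFLOPS FP32, {a40_bw:.1f} GB/s")
print(f"V100: {v100_fp32:.0f} GFLOPS FP32, {v100_bw:.1f} GB/s")
```

Under these assumptions the results line up with the table: roughly 37.42 TFLOPS and 695.8 GB/s for the A40, and roughly 14,131 GFLOPS and 900.1 GB/s for the Tesla V100 PCIe. The Tesla V100 PCIe's much wider memory bus explains why it has higher bandwidth despite a lower memory clock figure.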