Volta (microarchitecture)

Nvidia Volta
Release date: December 7, 2017
Codename: Volta
Fabrication process: TSMC 12 nm (FinFET)
Cards
Enthusiast
  • Tesla V100
  • Tesla V100S PCIe
  • Titan V
  • Titan V CEO Edition
  • Quadro GV100
History
Predecessor: Pascal
Variant: Turing (consumer, professional)
Successor: Ampere (consumer, professional)
Support status
Limited support until October 2025
Security updates until October 2028[1]

Volta is the codename, but not the trademark,[2] for a GPU microarchitecture developed by Nvidia as the successor to Pascal. It first appeared on a roadmap in March 2013,[3] although the first product was not announced until May 2017.[4] The architecture is named after the 18th–19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first architecture to feature Tensor Cores, specialized cores that deliver higher deep learning performance than regular CUDA cores.[5] The architecture is produced on TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.

The first product to use Volta was the datacenter Tesla V100, available for example as part of the Nvidia DGX-1 system.[4] The architecture has also been used in the Quadro GV100 and Titan V. No mainstream GeForce graphics cards were based on Volta.

After two USPTO proceedings,[6][7] Nvidia lost its Volta trademark application in the field of artificial intelligence on July 3, 2023. The owner of the Volta trademark[8] remains Volta Robots, a company specializing in AI and vision algorithms for robots and unmanned vehicles.

Details

Architectural improvements of the Volta architecture include the following:

  • CUDA Compute Capability 7.0
    • concurrent execution of integer and floating point operations
  • TSMC's 12 nm FinFET process,[9] allowing 21.1 billion transistors.[10]
  • High Bandwidth Memory 2 (HBM2)[9][11]
  • NVLink 2.0: a high-bandwidth bus between the CPU and GPU, and between multiple GPUs. It allows much higher transfer speeds than PCI Express and is estimated to provide 25 Gbit/s per lane.[12] (Disabled on the Titan V.)
  • Tensor cores: a tensor core multiplies two 4×4 FP16 matrices and adds a third FP16 or FP32 matrix to the product using fused multiply–add operations, producing an FP32 result that can optionally be demoted to FP16.[13] Tensor cores are intended to speed up the training of neural networks.[13] Volta's tensor cores are first-generation, while Ampere has third-generation tensor cores.[14][15]
  • PureVideo Feature Set I hardware video decoding
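
The tensor core operation described in the list above (D = A×B + C on 4×4 matrices) can be modeled numerically. The following is a minimal Python sketch, not Nvidia's implementation: it assumes round-to-nearest conversion of the inputs to FP16 and an FP32 rounding step after every accumulation, neither of which is specified by the cited sources. The helper names `to_fp16`, `to_fp32` and `tensor_core_mma` are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tensor_core_mma(a, b, c, demote_to_fp16=False):
    """Model D = A x B + C for 4x4 matrices given as lists of lists.

    A and B are rounded to FP16; C is the accumulator, kept in FP32 here.
    Assumption: the running sum is rounded to FP32 after each step.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # The product of two FP16 values is exact in double precision.
                acc = to_fp32(acc + to_fp16(a[i][k]) * to_fp16(b[k][j]))
            d[i][j] = to_fp16(acc) if demote_to_fp16 else acc
    return d


# Illustrative inputs: A is the identity, so D equals B + C element-wise.
A = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
B = [[float(4 * i + j) for j in range(4)] for i in range(4)]
C = [[0.5] * 4 for _ in range(4)]
D = tensor_core_mma(A, B, C)
D_half = tensor_core_mma(A, B, C, demote_to_fp16=True)
```

Each 4×4 operation performs 64 multiply–add steps (4 for each of the 16 output elements). Inputs outside the FP16 range raise `OverflowError` in this model, because `struct` refuses to pack them.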

Comparison of Compute Capability: GP100 vs GV100 vs GA100[16]

| GPU features | Nvidia Tesla P100 | Nvidia Tesla V100 | Nvidia A100 |
|---|---|---|---|
| GPU codename | GP100 | GV100 | GA100 |
| GPU architecture | Nvidia Pascal | Nvidia Volta | Nvidia Ampere |
| Compute capability | 6.0 | 7.0 | 8.0 |
| Threads / warp | 32 | 32 | 32 |
| Max warps / SM | 64 | 64 | 64 |
| Max threads / SM | 2048 | 2048 | 2048 |
| Max thread blocks / SM | 32 | 32 | 32 |
| Max 32-bit registers / SM | 65536 | 65536 | 65536 |
| Max registers / block | 65536 | 65536 | 65536 |
| Max registers / thread | 255 | 255 | 255 |
| Max thread block size | 1024 | 1024 | 1024 |
| FP32 cores / SM | 64 | 64 | 64 |
| Ratio of SM registers to FP32 cores | 1024 | 1024 | 1024 |
| Shared memory size / SM | 64 KB | Configurable up to 96 KB | Configurable up to 164 KB |
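
Several rows of the table above follow from the others by simple arithmetic. The Python sketch below re-derives them and adds a rough estimate of how register use per thread limits the number of resident threads on one SM. The function `max_resident_threads` is a hypothetical helper; it is a simplification that ignores register allocation granularity, shared memory use and the thread block limit.

```python
# Values taken from the table above (identical for GP100, GV100 and GA100).
THREADS_PER_WARP = 32
MAX_WARPS_PER_SM = 64
REGISTERS_PER_SM = 65536
FP32_CORES_PER_SM = 64
MAX_REGISTERS_PER_THREAD = 255

# Derived rows of the table.
max_threads_per_sm = THREADS_PER_WARP * MAX_WARPS_PER_SM        # 2048
registers_per_fp32_core = REGISTERS_PER_SM // FP32_CORES_PER_SM  # 1024


def max_resident_threads(registers_per_thread: int) -> int:
    """Estimate resident threads per SM when only registers are limiting.

    Simplified model: registers are handed out to whole warps, and the
    warp count is capped at MAX_WARPS_PER_SM.
    """
    if not 1 <= registers_per_thread <= MAX_REGISTERS_PER_THREAD:
        raise ValueError("registers_per_thread must be between 1 and 255")
    registers_per_warp = registers_per_thread * THREADS_PER_WARP
    warps = min(MAX_WARPS_PER_SM, REGISTERS_PER_SM // registers_per_warp)
    return warps * THREADS_PER_WARP
```

Under this model a kernel using 32 registers per thread can keep all 2048 threads resident, while one using the maximum of 255 registers per thread is limited to 8 warps (256 threads).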

Comparison of Precision Support Matrix[17][18]

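Supported CUDA core precisions

| GPU | FP16 | FP32 | FP64 | INT1 | INT4 | INT8 | TF32 | BF16 |
|---|---|---|---|---|---|---|---|---|
| Nvidia Tesla P4 | No | Yes | Yes | No | No | Yes | No | No |
| Nvidia P100 | Yes | Yes | Yes | No | No | No | No | No |
| Nvidia Volta | Yes | Yes | Yes | No | No | Yes | No | No |
| Nvidia Turing | Yes | Yes | Yes | No | No | No | No | No |
| Nvidia A100 | Yes | Yes | Yes | No | No | Yes | No | Yes |

Supported Tensor Core precisions

| GPU | FP16 | FP32 | FP64 | INT1 | INT4 | INT8 | TF32 | BF16 |
|---|---|---|---|---|---|---|---|---|
| Nvidia Tesla P4 | No | No | No | No | No | No | No | No |
| Nvidia P100 | No | No | No | No | No | No | No | No |
| Nvidia Volta | Yes | No | No | No | No | No | No | No |
| Nvidia Turing | Yes | No | No | Yes | Yes | Yes | No | No |
| Nvidia A100 | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes |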

Legend:

  • FPnn: floating point with nn bits
  • INTn: integer with n bits
  • INT1: binary
  • TF32: TensorFloat32
  • BF16: bfloat16

Comparison of Decode Performance

| Concurrent streams | H.264 decode (1080p30) | H.265 (HEVC) decode (1080p30) | VP9 decode (1080p30) |
|---|---|---|---|
| V100 | 16 | 22 | 22 |
| A100 | 75 | 157 | 108 |

Products

Volta was announced as the GPU microarchitecture of the Xavier generation of Tegra SoCs, which targets self-driving cars.[19][20]

At its annual GPU Technology Conference keynote on May 10, 2017, Nvidia officially announced the Volta microarchitecture along with the Tesla V100.[4] The Volta GV100 GPU is built on a 12 nm process and uses HBM2 memory with 900 GB/s of bandwidth.[21]

Nvidia officially announced the Nvidia TITAN V on December 7, 2017.[22][23]

Nvidia officially announced the Quadro GV100 on March 27, 2018.[24]

| Specification | Nvidia Titan V[25] | Nvidia Quadro GV100[26] | Nvidia Titan V CEO Edition[27][28] |
|---|---|---|---|
| Launch | December 7, 2017 | March 27, 2018 | June 21, 2018 |
| Code name(s) | GV100-400-A1 | GV100 | GV100 |
| Fab | TSMC 12 nm | TSMC 12 nm | TSMC 12 nm |
| Transistors (billion) | 21.1 | 21.1 | 21.1 |
| Die size (mm²) | 815 | 815 | 815 |
| Bus interface | PCIe 3.0 ×16 | PCIe 3.0 ×16 | PCIe 3.0 ×16 |
| Core config: CUDA cores[c] | 5120:320:96 | 5120:320:128 | 5120:320:128 |
| Core config: Tensor cores[d] | 640 | 640 | 640 |
| SM count[a] | 80 | 80 | 80 |
| Graphics Processing Clusters[b] | 6 | 6 | 6 |
| L2 cache size (MiB) | 4.5 | 6 | 6 |
| Base core clock (MHz) | 1200 | 1132 | 1200 |
| Boost clock (MHz) | 1455 | 1628 | 1455 |
| Memory clock (MT/s) | 1700 | 1696 | 1700 |
| Pixel fillrate (GP/s) | 139.7 | 208.4 | 186.2 |
| Texture fillrate (GT/s) | 465.6 | 521 | 465.6 |
| Memory size (GiB) | 12 | 32 | 32 |
| Memory bandwidth (GB/s) | 652.8 | 868.4 | 870.4 |
| Memory bus type | HBM2 | HBM2 | HBM2 |
| Memory bus width (bit) | 3072 | 4096 | 4096 |
| Single precision (GFLOPS), base (boost) | 12288 (14899) | 11592 (16671) | 12288 (14899) |
| Double precision (GFLOPS), base (boost) | 6144 (7450) | 5796 (8335) | 6144 (7450) |
| Half precision (GFLOPS), base (boost) | 24576 (29798) | 23183 (33341) | 24576 (29798) |
| TDP (W) | 250 | 250 | 250 |
| NVLink support | No | Yes | Yes |
| Launch price, MSRP (USD) | $2,999 | $8,999 | N/A |
  a. One Streaming Multiprocessor encompasses 64 CUDA cores and 4 TMUs.
  b. One Graphics Processing Cluster encompasses fourteen Streaming Multiprocessors.
  c. CUDA cores : Texture mapping units : Render output units
  d. A Tensor core is a mixed-precision FPU specifically designed for matrix arithmetic.
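
The fill rate, memory bandwidth and processing power figures in the product table are consistent with the usual theoretical-peak formulas. The Python sketch below reproduces them for the Titan V. It assumes one fused multiply–add (two floating-point operations) per CUDA core per clock, double precision at half and half precision at twice the single-precision rate, and fill rates computed at the boost clock. These assumptions match the tabulated numbers but are not stated in the cited sources, and the function names are hypothetical.

```python
def fp32_gflops(cuda_cores: int, clock_mhz: float) -> float:
    """Peak single-precision GFLOPS: 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_mhz / 1000


def memory_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s from transfer rate and bus width."""
    return transfer_rate_mts * bus_width_bits / 8 / 1000


def fillrate_g_per_s(units: int, clock_mhz: float) -> float:
    """Peak fill rate (GP/s for ROPs, GT/s for TMUs): one result per unit per clock."""
    return units * clock_mhz / 1000


# Titan V figures from the table: core config 5120:320:96,
# 1200 MHz base, 1455 MHz boost, 1700 MT/s on a 3072-bit bus.
titan_v = {
    "fp32_base": fp32_gflops(5120, 1200),
    "fp32_boost": fp32_gflops(5120, 1455),
    "fp64_boost": fp32_gflops(5120, 1455) / 2,
    "fp16_boost": fp32_gflops(5120, 1455) * 2,
    "bandwidth": memory_bandwidth_gbs(1700, 3072),
    "pixel": fillrate_g_per_s(96, 1455),
    "texture": fillrate_g_per_s(320, 1455),
}
```

Rounded, the results match the Titan V column: 12288 (14899) GFLOPS single precision, 652.8 GB/s, 139.7 GP/s and 465.6 GT/s. The same functions reproduce the Quadro GV100 figures from its 1132/1628 MHz clocks, 128 ROPs and 4096-bit bus.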

Application

Volta GPUs are also used for GPGPU compute in the Summit and Sierra supercomputers.[29][30] The Volta GPUs connect to the POWER9 CPUs via NVLink 2.0, which was expected to support cache coherency and thereby improve GPGPU performance.[31][12][32]

V100 accelerator and DGX V100

Comparison of accelerators used in DGX:[33][34][35]

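| Specification | P100 | V100 16GB | V100 32GB | A100 40GB | A100 80GB | H100 | H200 | B100 | B200 |
|---|---|---|---|---|---|---|---|---|---|
| Architecture | Pascal | Volta | Volta | Ampere | Ampere | Hopper | Hopper | Blackwell | Blackwell |
| Socket | SXM/SXM2 | SXM2 | SXM3 | SXM4 | SXM4 | SXM5 | SXM5 | SXM6 | SXM6 |
| Cores: FP32 CUDA | 3584 | 5120 | 5120 | 6912 | 6912 | 16896 | 16896 | N/A | N/A |
| Cores: FP64 (excl. tensor) | 1792 | 2560 | 2560 | 3456 | 3456 | 4608 | 4608 | N/A | N/A |
| Cores: mixed INT32/FP32 | N/A | N/A | N/A | 6912 | 6912 | 16896 | 16896 | N/A | N/A |
| Cores: INT32 | N/A | 5120 | 5120 | N/A | N/A | N/A | N/A | N/A | N/A |
| Boost clock (MHz) | 1480 | 1530 | 1530 | 1410 | 1410 | 1980 | 1980 | N/A | N/A |
| Memory: type (HBM) | HBM2 | HBM2 | HBM2 | HBM2 | HBM2e | HBM3 | HBM3e | HBM3e | HBM3e |
| Memory: speed (Gb/s) | 1.4 | 1.75 | 1.75 | 2.4 | 3.2 | 5.2 | 6.3 | 8 | 8 |
| Memory: bus width (bits) | 4096 | 4096 | 4096 | 5120 | 5120 | 5120 | 6144 | 8192 | 8192 |
| Memory: bandwidth (TB/s) | 0.72 | 0.9 | 0.9 | 1.52 | 1.52 | 3.35 | 4.8 | 8 | 8 |
| VRAM: type (HBM) | HBM2 | HBM2 | HBM2 | HBM2 | HBM2e | HBM3 | HBM3e | HBM3e | HBM3e |
| VRAM: size (GB) | 16 | 16 | 32 | 40 | 80 | 80 | 141 | 192 | 192 |
| Single precision (FP32; TFLOPS) | 10.6 | 15.7 | 15.7 | 19.5 | 19.5 | 67 | 67 | N/A | N/A |
| Double precision (FP64; TFLOPS) | 5.3 | 7.8 | 7.8 | 9.7 | 9.7 | 34 | 34 | N/A | N/A |
| INT8 (non-tensor) | N/A | 62 TOPS | 62 TOPS | N/A | N/A | N/A | N/A | N/A | N/A |
| INT8 dense tensor | N/A | N/A | N/A | 624 TOPS | 624 TOPS | 1.98 POPS | 1.98 POPS | 3.5 POPS | 4.5 POPS |
| INT32 | N/A | 15.7 TOPS | 15.7 TOPS | 19.5 TOPS | 19.5 TOPS | N/A | N/A | N/A | N/A |
| FP4 dense tensor | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 7 PFLOPS | 9 PFLOPS |
| FP16 (TFLOPS) | 21.2 | 31.4 | 31.4 | 78 | 78 | N/A | N/A | N/A | N/A |
| FP16 dense tensor | N/A | 125 TFLOPS | 125 TFLOPS | 312 TFLOPS | 312 TFLOPS | 990 TFLOPS | 990 TFLOPS | 1.98 PFLOPS | 2.25 PFLOPS |
| bfloat16 dense tensor | N/A | N/A | N/A | 312 TFLOPS | 312 TFLOPS | 990 TFLOPS | 990 TFLOPS | 1.98 PFLOPS | 2.25 PFLOPS |
| TensorFloat-32 (TF32) dense tensor | N/A | N/A | N/A | 156 TFLOPS | 156 TFLOPS | 495 TFLOPS | 495 TFLOPS | 989 TFLOPS | 1.2 PFLOPS |
| FP64 dense tensor | N/A | N/A | N/A | 19.5 TFLOPS | 19.5 TFLOPS | 67 TFLOPS | 67 TFLOPS | 30 TFLOPS | 40 TFLOPS |
| Interconnect (NVLink; TB/s) | 0.16 | 0.3 | 0.3 | 0.6 | 0.6 | 0.9 | 0.9 | 1.8 | 1.8 |
| GPU | GP100 | GV100 | GV100 | GA100 | GA100 | GH100 | GH100 | GB100 | GB100 |
| Number of SMs | 56 | 80 | 80 | 108 | 108 | 132 | 132 | N/A | N/A |
| L1 cache per SM (KB) | 24 | 128 | 128 | 192 | 192 | 192 | 192 | N/A | N/A |
| L1 cache total (KB) | 1344 | 10240 | 10240 | 20736 | 20736 | 25344 | 25344 | N/A | N/A |
| L2 cache (KB) | 4096 | 6144 | 6144 | 40960 | 40960 | 51200 | 51200 | N/A | N/A |
| TDP (W) | 300 | 300 | 350 | 400 | 400 | 700 | 1000 | 700 | 1000 |
| Die size (mm²) | 610 | 815 | 815 | 826 | 826 | 814 | 814 | N/A | N/A |
| Transistor count (billion) | 15.3 | 21.1 | 21.1 | 54.2 | 54.2 | 80 | 80 | 208 | 208 |
| Fabrication process | TSMC 16FF+ | TSMC 12FFN | TSMC 12FFN | TSMC N7 | TSMC N7 | TSMC 4N | TSMC 4N | TSMC 4NP | TSMC 4NP |
| Launched | Q2 2016 | Q3 2017 | Q3 2017 | Q1 2020 | Q1 2020 | Q3 2022 | Q3 2023 | Q4 2024 | Q4 2024 |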
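
The V100 tensor throughput, NVLink bandwidth and total L1 cache listed above can be approximated from figures given elsewhere in this article. The Python sketch below is a back-of-the-envelope check, not a vendor formula. It assumes each tensor core completes one 4×4×4 matrix multiply–accumulate (64 fused multiply–adds) per clock, and that the V100 has six NVLink 2.0 links with eight 25 Gbit/s lanes per direction each. The link and lane counts are assumptions not stated in the article, and the function names are hypothetical.

```python
def tensor_tflops(tensor_cores: int, clock_mhz: float,
                  fma_per_core_per_clock: int = 64) -> float:
    """Peak dense tensor TFLOPS: each FMA counts as two operations."""
    return tensor_cores * fma_per_core_per_clock * 2 * clock_mhz / 1e6


def nvlink_gbytes_per_s(links: int, lanes_per_direction: int,
                        gbit_per_lane: float) -> float:
    """Aggregate bidirectional NVLink bandwidth in GB/s."""
    directions = 2
    return links * directions * lanes_per_direction * gbit_per_lane / 8


# V100: 640 tensor cores at a 1530 MHz boost clock.
v100_tensor = tensor_tflops(640, 1530)

# Assumed topology: 6 links, 8 lanes per direction, 25 Gbit/s per lane.
v100_nvlink = nvlink_gbytes_per_s(6, 8, 25)

# Total L1 cache: 80 SMs with 128 KB each.
v100_l1_total_kb = 80 * 128
```

The results, about 125.3 TFLOPS, 300 GB/s and 10240 KB, agree with the 125 TFLOPS, 0.3 TB/s and 10240 KB listed for the V100.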

References

  1. Kampman, Jeffrey (2025-07-31). "Nvidia confirms end of Game Ready driver support for Maxwell and Pascal GPUs — affected products will get optimized drivers through October 2025". Tom's Hardware. Retrieved 2025-08-21.
  2. "Nvidia Volta Trademark Status". United States Patent and Trademark Office. 14 August 2023. Retrieved 14 August 2023.
  3. Gasior, Geoff (19 March 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on 1 May 2019. Retrieved 14 March 2017.
  4. Smith, Ryan (2017-05-10). "The NVIDIA GPU Tech Conference 2017 Keynote Live Blog". Archived from the original on May 10, 2017. Retrieved 2018-11-03.
  5. "NVIDIA Volta AI Architecture | NVIDIA". NVIDIA. Retrieved 2018-04-11.
  6. "Volta trademark Cancellation Proceeding". United States Patent and Trademark Office.
  7. "Volta trademark Exparte Appeal Proceeding". United States Patent and Trademark Office.
  8. "Volta Trademark status". United States Patent and Trademark Office.
  9. Killian, Zak (14 March 2017). "Report: TSMC set to fabricate Volta and Centriq on 12-nm process". The Tech Report. Archived from the original on 14 March 2017. Retrieved 14 March 2017.
  10. Durant, Luke; Giroux, Olivier; Harris, Mark; Stam, Nick (May 10, 2017). "Inside Volta: The World's Most Advanced Data Center GPU". Nvidia developer blog.
  11. Gasior, Geoff (March 19, 2013). "Nvidia's Volta GPU to feature on-chip DRAM". The Tech Report. Archived from the original on May 1, 2019. Retrieved March 14, 2017.
  12. Shah, Agam (22 August 2016). "Nvidia's NVLink 2.0 will first appear in Power9 servers next year". PC World. Retrieved 14 March 2017.
  13. Harris, Mark (May 11, 2017). "CUDA 9 Features Revealed: Volta, Cooperative Groups and More". Retrieved August 12, 2017.
  14. "NVIDIA Ampere Architecture In-Depth". 14 May 2020.
  15. "NVIDIA A100 Tensor Core GPU Architecture" (PDF). Retrieved 2023-12-15.
  16. "NVIDIA A100 Tensor Core GPU Architecture: Unprecedented Acceleration at Every Scale" (PDF). Nvidia. Retrieved September 18, 2020.
  17. "NVIDIA Tensor Cores: Versatility for HPC & AI". NVIDIA.
  18. "Abstract". docs.nvidia.com.
  19. Cutress, Ian; Tallis, Billy (4 January 2016). "CES 2017: Nvidia Keynote Liveblog". AnandTech. Archived from the original on January 5, 2017. Retrieved 9 January 2017.
  20. "NVIDIA DRIVE Xavier, World's Most Powerful SoC, Brings Dramatic New AI Capabilities | NVIDIA Blog". The Official NVIDIA Blog. 2018-01-07. Retrieved 2018-11-03.
  21. Smith, Ryan (10 May 2017). "Nvidia Volta Unveiled". AnandTech. Archived from the original on May 11, 2017. Retrieved 2 June 2017.
  22. "NVIDIA TITAN V Transforms the PC into AI Supercomputer".
  23. "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card".
  24. "NVIDIA Reinvents the Workstation with Real-Time Ray Tracing".
  25. "Introducing NVIDIA TITAN V: The World's Most Powerful PC Graphics Card". NVIDIA. Retrieved 2017-12-08.
  26. "NVIDIA Quadro GV100". Retrieved 2018-03-27.
  27. Smith, Ryan. "NVIDIA Unveils & Gives Away New Limited Edition 32GB Titan V "CEO Edition"". Archived from the original on June 21, 2018. Retrieved 2018-07-06.
  28. "NVIDIA TITAN V CEO Edition". TechPowerUp. Retrieved 2018-07-07.
  29. Shankland, Steven (14 September 2015). "IBM, Nvidia land $325M supercomputer deal". CNET. Retrieved 29 December 2015.
  30. Noyes, Katherine (16 March 2015). "IBM, Nvidia rev HPC engines in next-gen supercomputer push". PC World. Retrieved 29 December 2015.
  31. Smith, Ryan (17 November 2014). "Nvidia Volta, IBM Power9 Land Contracts for New US Government Supercomputers". AnandTech. Retrieved 14 March 2017.
  32. Lilly, Paul (January 25, 2017). "NVIDIA 12nm FinFET Volta GPU Architecture Reportedly Replacing Pascal In 2017". HotHardware.
  33. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech. Archived from the original on September 23, 2023.
  34. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech. Archived from the original on July 29, 2024.
  35. Garreffa, Anthony (September 17, 2017). "NVIDIA Tesla V100 Tested: Near Unbelievable GPU Power". TweakTown.com. Retrieved December 30, 2025.