Nvidia Begins Seeding Reviewers With the New TITAN-X with Pascal – Synthetic CuDNN Benchmarks Show Up To 200% Performance Increase

Usman Pirzada

Synthetic benchmarks for the Nvidia TITAN X have leaked on Chiphell (via Videocardz) and show a very significant performance increase over the older-generation GeForce GTX TITAN X. This also means that Nvidia has started sampling reviewers with units (or at least that the initial batch has begun shipping). The brand new TITAN X has 3584 CUDA cores clocked at 1531 MHz (Boost), a significant hardware upgrade over its older brother, which had just 3072 CUDA cores and a boost clock of 1075 MHz.

The new TITAN X starts shipping to select reviewers - offers anywhere from a 63% to a 200% speedup with CuDNN5

Since Nvidia has deliberately dropped the GeForce GTX branding from the new TITAN X, instead positioning it as a 'prosumer' card, synthetic benchmarks that focus on CuDNN are quite relevant for this product. The new TITAN X offers a speedup of up to 2x in some cases, which is pretty huge. Keep in mind, however, that the new TITAN X was tested with a newer version of CuDNN, so the speedup is not purely down to hardware gains. That said, gains from version differences are usually quite small, so the higher clock speed and the increased core count can safely be said to account for the vast majority of the speedup.


On paper the new Titan has roughly 17% more cores and a 42% higher clock rate. This means that right off the bat (and not accounting for any improved CuDNN library optimization and/or architectural gains) you are looking at a raw compute increase of about 66%, since the two factors multiply (1.17 × 1.42 ≈ 1.66). Whatever speedup remains beyond that figure reflects the true architectural gains (Maxwell to Pascal) and the gains from using CuDNN5 instead of CuDNN4. In some cases these remaining gains are significant; in others, the measured speedup is actually a bit under what we would expect the bare minimum to be.
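The arithmetic above can be reproduced with a short Python sketch. It uses only the core counts and boost clocks quoted in this article; the function names are illustrative, and treating "cores × clock" as a proxy for raw compute is a simplifying assumption that ignores memory bandwidth and architectural differences. The residual is computed as a ratio rather than a subtraction, since speedup factors compound multiplicatively.

```python
# Spec figures quoted in the article (boost clocks in MHz).
OLD_CORES, OLD_BOOST_MHZ = 3072, 1075   # GeForce GTX TITAN X (Maxwell)
NEW_CORES, NEW_BOOST_MHZ = 3584, 1531   # TITAN X (Pascal)


def raw_scaling(old_cores, old_clock, new_cores, new_clock):
    """Return the ratio of (cores x clock) between two cards."""
    return (new_cores * new_clock) / (old_cores * old_clock)


def residual_gain(measured_speedup_pct, raw_ratio):
    """Speedup factor left over after dividing out raw scaling.

    measured_speedup_pct is a percentage increase, e.g. 91 for +91%.
    A result below 1.0 means the card underperforms its raw specs.
    """
    return (1 + measured_speedup_pct / 100) / raw_ratio


raw = raw_scaling(OLD_CORES, OLD_BOOST_MHZ, NEW_CORES, NEW_BOOST_MHZ)
print(f"cores: +{(NEW_CORES / OLD_CORES - 1) * 100:.1f}%")
print(f"clock: +{(NEW_BOOST_MHZ / OLD_BOOST_MHZ - 1) * 100:.1f}%")
print(f"raw:   +{(raw - 1) * 100:.1f}%")

# Residual for the lowest and highest speedups mentioned in the text.
for pct in (63, 200):
    print(f"+{pct}% measured -> residual x{residual_gain(pct, raw):.2f}")
```

A residual below 1.0 (as with the 63% figure from the subheading) corresponds to the cases described as falling under the expected bare minimum.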

According to the benchmarks you are looking at a speedup of 74% to 91% on AlexNet, 76% to 200% on OverFeat, 74% to 84% on Inception and 91% to 98% on VGG. This means that in compute work that involves deep neural nets the new TITAN X will offer a significant speedup. It has better power efficiency as well. However, when it comes to cost, the MSRP is going to be $1,200, and considering TITANs rarely sell at MSRP (the old $1,000 MSRP Titan is retailing in the $1,700-1,900 range on Amazon and Newegg), the price might give a potential customer pause. Also, considering we don't know anything about double-precision (DP) performance just yet, it would make more sense for the prosumer market to pursue more value-oriented purchases such as, say, multiple GTX 1070s.
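To put the value argument in rough numbers, here is a minimal Python sketch comparing FP32 throughput per dollar. The TFLOPs and launch prices are the ones listed in this article's spec table; the helper name is hypothetical, street prices will differ from launch prices, and the multi-card total assumes ideal scaling, which real deep learning workloads rarely achieve.

```python
# (FP32 TFLOPs, launch price in USD) as listed in the article's spec table.
cards = {
    "GTX 1070": (6.5, 349),
    "Titan X": (11.0, 1200),
}


def gflops_per_dollar(tflops, price_usd):
    """Peak FP32 GFLOPs bought per dollar of launch price."""
    return tflops * 1000 / price_usd


for name, (tflops, price) in cards.items():
    print(f"{name}: {gflops_per_dollar(tflops, price):.1f} GFLOPs per dollar")

# How many GTX 1070s fit into one Titan X budget, and their combined
# peak throughput under the (optimistic) assumption of perfect scaling.
n_1070 = cards["Titan X"][1] // cards["GTX 1070"][1]
combined_tflops = n_1070 * cards["GTX 1070"][0]
print(f"{n_1070} x GTX 1070: {combined_tflops:.1f} TFLOPs peak (ideal scaling)")
```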

NVIDIA GeForce 10 Pascal Family

| Graphics Card Name | NVIDIA GeForce GTX 1050 2 GB | NVIDIA GeForce GTX 1050 3 GB | NVIDIA GeForce GTX 1050 Ti | NVIDIA GeForce GTX 1060 3 GB | NVIDIA GeForce GTX 1060 5 GB | NVIDIA GeForce GTX 1060 6 GB | NVIDIA GeForce GTX 1070 | NVIDIA GeForce GTX 1070 Ti | NVIDIA GeForce GTX 1080 | NVIDIA Titan X | NVIDIA GeForce GTX 1080 Ti | NVIDIA Titan Xp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Graphics Core | GP107 | GP107 | GP107 | GP106 / GP104 | GP106 | GP106 / GP104 | GP104 | GP104 | GP104 | GP102 | GP102 | GP102 |
| Process Node | 14nm FinFET | 14nm FinFET | 14nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET | 16nm FinFET |
| Die Size | 132 mm² | 132 mm² | 132 mm² | 200 mm² | 200 mm² | 200 mm² | 314 mm² | 314 mm² | 314 mm² | 471 mm² | 471 mm² | 471 mm² |
| Transistors | 3.3 Billion | 3.3 Billion | 3.3 Billion | 4.4 Billion | 4.4 Billion | 4.4 Billion | 7.2 Billion | 7.2 Billion | 7.2 Billion | 12 Billion | 12 Billion | 12 Billion |
| CUDA Cores | 640 | 768 | 768 | 1152 | 1280 | 1280 | 1920 | 2432 | 2560 | 3584 | 3584 | 3840 |
| Base Clock | 1354 MHz | 1392 MHz | 1290 MHz | 1506 MHz | 1506 MHz | 1506 MHz | 1506 MHz | 1607 MHz | 1607 MHz | 1417 MHz | 1480 MHz | 1480 MHz |
| Boost Clock | 1455 MHz | 1518 MHz | 1392 MHz | 1708 MHz | 1708 MHz | 1708 MHz | 1683 MHz | 1683 MHz | 1733 MHz | 1530 MHz | 1583 MHz | 1582 MHz |
| FP32 Compute | 1.8 TFLOPs | 2.3 TFLOPs | 2.1 TFLOPs | 4.0 TFLOPs | 4.4 TFLOPs | 4.4 TFLOPs | 6.5 TFLOPs | 8.1 TFLOPs | 9.0 TFLOPs | 11 TFLOPs | 11.5 TFLOPs | 12.5 TFLOPs |
| VRAM | 2 GB GDDR5 | 3 GB GDDR5 | 4 GB GDDR5 | 3 GB GDDR5 | 6 GB GDDR5 | 6 GB GDDR5/X | 8 GB GDDR5/X | 8 GB GDDR5 | 8 GB GDDR5X | 12 GB GDDR5X | 11 GB GDDR5X | 12 GB GDDR5X |
| Memory Speed | 7 Gbps | 7 Gbps | 7 Gbps | 8 Gbps | 8 Gbps | 9 Gbps / 10 Gbps | 8 Gbps | 8 Gbps | 11 Gbps | 10 Gbps | 11 Gbps | 11.4 Gbps |
| Memory Bandwidth | 112 GB/s | 84 GB/s | 112 GB/s | 192 GB/s | 160 GB/s | 224 GB/s / 240 GB/s | 256 GB/s | 256 GB/s | 352 GB/s | 480 GB/s | 484 GB/s | 547 GB/s |
| Bus Interface | 128-bit | 96-bit | 128-bit | 192-bit | 160-bit | 192-bit | 256-bit | 256-bit | 256-bit | 384-bit | 352-bit | 384-bit |
| Power Connector | None | None | None | Single 6-Pin | Single 6-Pin | Single 6-Pin | Single 8-Pin | Single 8-Pin | Single 8-Pin | 8+6 Pin | 8+6 Pin | 8+6 Pin |
| TDP | 75W | 75W | 75W | 120W | 120W | 120W | 150W | 180W | 180W | 250W | 250W | 250W |
| Display Outputs | 1x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 1x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 1x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b, 1x DVI | 3x DisplayPort 1.4, 1x HDMI 2.0b | 3x DisplayPort 1.4, 1x HDMI 2.0b |
| Launch Date | October 2016 | May 2018 | October 2016 | September 2016 | August 2018 | July 2016 | June 2016 | October 2017 | May 2016 | August 2016 | March 2017 | April 2017 |
| Launch Price | $109 US | $119 US-$129 US | $139 US | $199 US | TBD | $249 US | $349 US | $449 US | $499 US | $1200 US | $699 US | $1200 US |
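The FP32 Compute row can be approximated from the CUDA core count and boost clock. The sketch below assumes the commonly used rule of thumb of two floating-point operations (one fused multiply-add) per CUDA core per clock; the function name is illustrative, and the results are theoretical peaks that may differ slightly from the rounded figures in the table.

```python
def fp32_tflops(cuda_cores, boost_mhz):
    """Peak FP32 throughput in TFLOPs.

    Assumes 2 FLOPs (one fused multiply-add) per CUDA core per clock cycle.
    """
    return 2 * cuda_cores * boost_mhz * 1e6 / 1e12


# (CUDA cores, boost clock in MHz) taken from the table above.
cards = {
    "GTX 1070": (1920, 1683),
    "GTX 1080": (2560, 1733),
    "Titan X": (3584, 1530),
    "Titan Xp": (3840, 1582),
}

for name, (cores, boost) in cards.items():
    print(f"{name}: {fp32_tflops(cores, boost):.1f} TFLOPs")
```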

 
