
Graphics Pascal and Polaris Discussion Thread

Discussion in 'Hardware' started by Parge, 25 Aug 2015.

  1. Parge

    Parge the worst Super Moderator

    Joined:
    16 Jul 2010
    Posts:
    12,927
    Likes Received:
    562
    So I thought it would be good to kick things off nice and early with a discussion about Pascal. I suspect we'll hear quite a bit more about it as we close in on Q4.

    So, what do we know so far? People are expecting a lot, and the reasons they give seem to come down to three major changes from Maxwell.

    • A move to 16nm.
    • HBM2 stacked memory.
    • A new architecture

    Hopefully with HBM and possibly smaller die size we'll see the size of standard cards drop to less than 10".

    Deploy opinions now! :thumb:
     
    Last edited by a moderator: 20 Jun 2016
  2. Deders

    Deders Well-Known Member

    Joined:
    14 Nov 2010
    Posts:
    4,053
    Likes Received:
    106
    I really hope it's not too late in the design for it to be optimized for parallel processing, like GCN and unlike Maxwell, for DX12 performance.
     
  3. Guest-16

    Guest-16 Guest

    The GP100/104 will be on TSMC's 16nm FF+ process. NV did investigate using Samsung 14nm FF, but according to recent Korean business news Samsung's yields weren't good enough. Apparently NV stipulated yields of at least 85% and Samsung can't provide that, likely because they don't have experience with large dies or interposers. That doesn't mean NV won't use them for the following GP106/108, which won't use HBM or an interposer because of the cost and component availability advantages of older tech. Interposers are very expensive compared with the usual organic substrate.

    More likely HBM2 than HMC. Nvidia will have done as much R&D as AMD has done (expressed at HotChips) and both are using TSMC. Since AMD has proven the tech TSMC will resell it to NV.

    I would expect new GP104 cards in Q1 next year. That gives the very successful 980 over a year on sale. GP100 will follow AMD's high-end release as a counter to continue market domination. Given AMD's R&D budget and recent execution delays I suspect their 16nm parts will follow in Q3 2016. NV will be under no pressure and their market share will only increase, which is bad for everyone (except NV investors).

    The problems Intel is having at 14nm are not related to TSMC/Samsung, who are using a slightly different FF process.
     
    Last edited by a moderator: 26 Aug 2015
  4. Harlequin

    Harlequin Well-Known Member

    Joined:
    4 Jun 2004
    Posts:
    7,071
    Likes Received:
    179
    If it's Q1 then no HBM2 for Pascal, as Samsung haven't started HBM2 in volume yet. That leaves Hynix, who have a supply agreement with AMD.
     
  5. Parge

    Parge the worst Super Moderator

    Joined:
    16 Jul 2010
    Posts:
    12,927
    Likes Received:
    562
    As above, it's TSMC.
     
  6. Harlequin

    Harlequin Well-Known Member

    Joined:
    4 Jun 2004
    Posts:
    7,071
    Likes Received:
    179
    TSMC doesn't make RAM

    edit:

    TSMC doesn't make 14nm FF HBM - they are on 40nm DRAM for embedded
     
  7. edzieba

    edzieba Virtual Realist

    Joined:
    14 Jan 2009
    Posts:
    3,909
    Likes Received:
    590
    AMD has preference on SK Hynix' HBM production, but unless their demand outstrips production, there will be production surplus available to other vendors. We have seen HBM1 applied to GPUs, but HBM is not limited to this. A single HBM module paired with an SoC on an interposer makes for a very compact package with a reduction in the thermal issues that have been plaguing recent stacked packages (both from larger dissipation area and lower power memory).
     
  8. Guest-16

    Guest-16 Guest

    Samsung has committed to mass production of HBM starting in 2016. Their focus is HPC/GPU with 8Gbit modules, so HBM2.
    http://news.softpedia.com/news/sams...m-tech-starting-with-2016-489813.shtml#sgal_0

    They did not commit to a process node - I strongly doubt it will be 14FF, more likely 2xnm or high 1x for yield benefits.

    AMD does not have an exclusive agreement with Hynix.

    HBM is not applicable to all products though, for example Intel uses HMC on Xeon Phi.
     
    Last edited by a moderator: 26 Aug 2015
  9. Harlequin

    Harlequin Well-Known Member

    Joined:
    4 Jun 2004
    Posts:
    7,071
    Likes Received:
    179
    You can't start volume production of a new memory type and get it into retail products in the same quarter.

    edit:

    http://hexus.net/tech/news/graphics/84662-amd-said-secured-priority-access-sk-hynixs-hbm2-chips/

    as I said they have a priority supply agreement with AMD.

    Pascal will be later than Nv want

    edit 2

    http://wccftech.com/samsung-enters-hbm-market-1h-2016-hpc-gpu-ready-hbm-15-tbs-bandwidth-48-gb-vram/

    Hmmm, wccf are aiming for H1 2016 for Samsung. If the other reports of Q1 are correct, it could be the end of March for both to be accurate.
     
    Last edited: 26 Aug 2015
  10. Guest-16

    Guest-16 Guest

    Nvidia will also likely have priority from any vendor due to larger order size. Typically all GPU vendors launch new high-end in the 1000s of pieces/batch range, then increase it rapidly as they can get more parts made.

    Q1 is typically a better selling period than Q2, so everyone will be pushing for early as poss.
     
  11. Harlequin

    Harlequin Well-Known Member

    Joined:
    4 Jun 2004
    Posts:
    7,071
    Likes Received:
    179
    Being honest, I can't really see Pascal being in retail before this time next year. TSMC yields are the first concern, then HBM2 production. If the rumours of tape-out last month are true, IIRC it now takes between 9 months and a year from tape-out to retail, problems notwithstanding.

    Throw in if GP100 needs a respin as well....
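A rough sketch of that timeline arithmetic, assuming the rumoured tape-out was July 2015 (the rumour is unconfirmed, and the helper name `add_months` is made up for illustration):

```python
# Estimate the retail window from a rumoured tape-out date.
# ASSUMPTION: tape-out in July 2015, taken from the rumour in the post above.
# The 9 and 12 month lead times are the figures quoted in the post.

def add_months(year, month, months):
    """Return (year, month) after adding a number of months."""
    index = year * 12 + (month - 1) + months
    return index // 12, index % 12 + 1

TAPE_OUT = (2015, 7)  # assumed, not confirmed

earliest = add_months(*TAPE_OUT, 9)   # best case: 9 months to retail
latest = add_months(*TAPE_OUT, 12)    # a full year, before any respin

print("Earliest retail: %d-%02d" % earliest)
print("Latest retail (no respin): %d-%02d" % latest)
```

Under those assumptions the window runs from April to July 2016, and a respin would push it later still.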
     
  12. edzieba

    edzieba Virtual Realist

    Joined:
    14 Jan 2009
    Posts:
    3,909
    Likes Received:
    590
    With many of Pascal's features being pushed back to Volta, my guess is that Pascal will implement HBM2 in a similar way to what AMD have done with GCN: take the existing core architecture with minimal modification (in this case, Maxwell 2), and swap out the memory controller.
    The wrinkle is NVLink: it's not slated for any sort of consumer CPU-GPU linking (only for HPC with Power CPUs), but it is slated for GPU-GPU links on x86. Whether that will ever filter down to consumer cards is unknown, but it implies that Pascal will either have two different external interconnect controllers, or will be NVLink only and have an external translator chip. It could make for some interesting cards if the GPU is a module (or multiple modules) the size of the preview device shown, sitting on a PCIe translator card.
     
  13. Guest-16

    Guest-16 Guest

    This.

    NVLink won't feature on GP104s, which will launch first, so time to market is faster. GP100s will come later, however since HPC is Nvidia's big buck growth market they'll be pushing that as well.

    NVLink won't feature on consumer products. Intel dont give no shitz.
     
  14. Parge

    Parge the worst Super Moderator

    Joined:
    16 Jul 2010
    Posts:
    12,927
    Likes Received:
    562
    What will NVLink require from motherboard manufacturers? (if anything).

    Also, can anyone ELI5? I've done a bit of research but still can't quite nail what the selling point is?
     
  15. Corky42

    Corky42 Where's walle?

    Joined:
    30 Oct 2012
    Posts:
    9,648
    Likes Received:
    386
    So would that rule out (consumer) Pascal making the switch to handling workloads in parallel or just the interconnect?

    AFAIK it's just another type of interconnect like PCI-E if we are talking about motherboard manufacturers supporting it, although I'm unclear if the parallel nature of the way it works would carry over to PCI-E and consumer products.
     
    Last edited: 28 Aug 2015
  16. edzieba

    edzieba Virtual Realist

    Joined:
    14 Jan 2009
    Posts:
    3,909
    Likes Received:
    590
    There's nothing technically to stop Pascal from using NVLink for GPU-GPU communication (i.e. replacing the SLI bridge), with the GPU-CPU bus remaining PCIe.

    The problem is physical form factor, and GPU die area. If the GPU die has both PCIe and NVLink interconnects, then that's a whole bunch of die area you can't use for the actual number-crunching gubbins. If you have just NVLink on die, and sit an NVLink-PCIe translator chip on the board, then that's an extra fixed cost added to all consumer cards.

    x86 CPU-GPU NVLink is very, VERY unlikely to happen. It would require building an NVLink controller onto the CPU die, and Intel are unlikely to give up die area for a proprietary single-purpose link. The only possibility would be some bizarre multi-lane translator chip on-motherboard that takes 20x PCIe lanes (NVLink block 20GB/s, PCIe 3.0 lane 1GB/s) to interface to one NVLink block, and each GPU may support several NVLink blocks. Again, that's an extra fixed cost to slap onto the motherboard, assumes a whole bunch more PCIe lanes available on the CPU (think -EX and -EP CPUs only) and loses a bunch of PCIe slots from other use on the board.
    If NVLink were to suddenly become an open standard with FRAND or fee-free patent licensing, then maybe it might gain traction with Intel and AMD. But that doesn't seem likely.
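To put numbers on the translator-chip idea, here is a minimal sketch using only the two bandwidth figures quoted above (20GB/s per NVLink block, 1GB/s per PCIe 3.0 lane). The function names and the 40-lane CPU budget are illustrative assumptions, not figures from Nvidia or Intel:

```python
import math

# Figures quoted in the post above.
NVLINK_BLOCK_GBPS = 20.0  # GB/s per NVLink block
PCIE3_LANE_GBPS = 1.0     # GB/s per PCIe 3.0 lane


def pcie_lanes_needed(nvlink_blocks):
    """PCIe 3.0 lanes a translator chip would need to match the NVLink bandwidth."""
    return math.ceil(nvlink_blocks * NVLINK_BLOCK_GBPS / PCIE3_LANE_GBPS)


def lanes_left(cpu_lanes, nvlink_blocks):
    """Lanes remaining for other slots; negative means the CPU cannot feed the link."""
    return cpu_lanes - pcie_lanes_needed(nvlink_blocks)


# ASSUMPTION: a CPU exposing 40 PCIe lanes, chosen purely for illustration.
CPU_LANES = 40

for blocks in (1, 2, 4):
    print(blocks, "block(s):", pcie_lanes_needed(blocks), "lanes needed,",
          lanes_left(CPU_LANES, blocks), "left over")
```

On those assumptions one block already eats half the lane budget and two blocks eat all of it, which is the point about losing PCIe slots elsewhere on the board.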

    Then there's the physical interconnect issue. NVLink is slated to use big chunky mezzanine connectors, of the sort you generally only find in workstations to add in extra CPU riser boards. If NVLink doesn't also have a PHY card-edge interface, this would also make it difficult to integrate onto consumer motherboards without a radical form factor redesign. This would also complicate inter-card usage of NVLink.


    My bet is as Bindibadgi said: NVLink probably won't crop up on at least the first release of consumer cards. It might appear as a GPU-GPU interconnect for consumer cards at a later date. The chance of it being used as a CPU-GPU interconnect for consumer (x86, CPU from Intel or AMD) cards is so close to zero as to be effectively discounted.
     
  17. Corky42

    Corky42 Where's walle?

    Joined:
    30 Oct 2012
    Posts:
    9,648
    Likes Received:
    386
    So that rules out adding the interconnect to consumer-grade products. What about Pascal handling workloads from the CPU in parallel, in a similar fashion to GCN?
     
  18. edzieba

    edzieba Virtual Realist

    Joined:
    14 Jan 2009
    Posts:
    3,909
    Likes Received:
    590
    Nobody outside of Nvidia (or very close partners with hefty NDAs) knows the details of Pascal's architecture. Could be Maxwell 2 with some uncore stuff swapped out to support HBM2[1] and NVLink, could be a modification of Maxwell 2 for less locking on job completion[2], could be all new.

    [1] Fiji took GCN and swapped out the GDDR5 interface for the HBM1 interface, along with some process improvements to take advantage of a lower junction temperature (the purpose of the CLC for the Fury X).

    [2] Maxwell 2 does better than past architectures, but Nvidia has been optimising for serial job dispatch for some time. Mainly because there was no API doing parallel job dispatch. Conversely, GCN is optimised for parallel dispatch, but with the downside that anything available now (and most games available in the near future, as DX11.3 and on will continue to be an updated codepath) will suffer for it. For Nvidia, the bottleneck is in job handling on the GPU; serial dispatch means that the job handling hardware doesn't deal well with locking or stalling between jobs. Whether a new job handler can be glommed into Maxwell 2 or not is something only Nvidia's engineers could answer. They might even want to just beef up the 'legacy' serial job handler to just overpower locking in parallel dispatch simply to not lose performance for serial dispatch.
     
    Corky42 likes this.
  19. Corky42

    Corky42 Where's walle?

    Joined:
    30 Oct 2012
    Posts:
    9,648
    Likes Received:
    386
    Thanks for a great answer edzieba. :)

    I guess we're going to have to wait, although I dare say something like how Pascal handles job dispatch (parallel vs serial) is information that Nvidia may not be very forthcoming with.
     
  20. Guest-16

    Guest-16 Guest


    This^^

    NVLink outside of IBM boxes is a big unknown (I haven't read about it recently - any news??). It's like QPI or HTT - it adds die space and IO space and motherboard space. That's ALL COST. It could in theory replace the SLI bridge (I don't know the IO structure though), but the SLI bridge does so little work these days that I doubt it would really be that useful. It COULD feature in Quadro/Tesla cards in an NVLink (SLI-like) bridge as a card-to-card interlink, bypassing PCIe. Then it would be invisible to the motherboard and Intel's domain, but inter-node communication would still be as 'slow' as regular boxes. Mezzanine on professionally built systems is doable, consumer is not.

    It MAY be a feature of SLI certification if Nvidia feels it makes a significant enough performance advantage, but I suspect it will add so much to the cost of motherboards that manufacturers may balk at it for anything but the very high end (like when NF200 chips were added). On reflection I really don't think it'll be anywhere on the motherboard. Card to card is far more likely. It depends on several business pressure points.

    Intel will definitely not want it on anything Xeon-related. It's a competing IO so that's why NV are working with IBM. Certainly no consumer x86 part will see it as there's no business reason to include it (yet).

    @edzieba - After reading what you wrote it makes sense that Pascal's focus - from a risk/ROI thought - could have a modified front end thread distribution engine for parallel dispatch (and a ****-ton of driver work), along with modified uncore for HBM, and a ton of packaging technology learning for interposers but that's mostly TSMC directing. That alone is a LOT of work, esp. on front end which is done iteratively elsewhere. Maxwell cores are energy efficient enough to retain leadership through another generation.

    Nvidia had committed to a 'mobile first' development strategy, but now its only mobile is automotive. So I wonder if Nvidia has pivoted back to its original performance-first strategy. Typically it would have moved heaven and earth for 'new-DirectX' domination as it's a huge, huge selling point, and financially their biggest revenue-generating BU will be under great pressure to retain that 80% market share. Right now they're on the back foot thanks to AMD's win with Mantle/Vulkan. [I want Pascal to be interesting but for the benefit of the market I hope AMD pulls back 10-15pts of market share.]

    As for CPU-GPU and HSA: there's also the far out 'AMD splits and sells' option. This won't affect Pascal, but if AMD's tech/IP comes up for grabs.. well.. who knows what 2017 onwards will hold (actually the Chinese will bid heavily for it so you won't see it :p)

    I truly love discussions like this! Great thread.
     
    Last edited by a moderator: 31 Aug 2015
