Discussion in 'Article Discussion' started by bit-tech, 13 Aug 2018.
Ohh how the tables have turned
The AnandTech review is the one to read, in that it looks into why the chip performs the way it does. Essentially the 32-core is super niche, as the interconnect isn't good enough, so it gets throttled every time it has to access main memory. It only shines when the test fits in the CPU cache. The 16-core is great, however.
AnandTech are still writing their review. Rendering, hashing and some AI stuff seem to be doing well on it. Phoronix did a great article.
It's not really the interconnect that isn't good enough, it's that the socket can't fully utilise all the cores.
Where software can exploit it properly it is one insanely sexy CPU though...
But yeah, overall there is a lot of stuff that simply doesn't scale all that well, and that problem will only get more pronounced next year with the 64-core Epyc 2, which raises serious questions about how long the core war can go on.
Anyway, it would be interesting to see a comparison between 32-core Ripper and 32-core Epyc with matched clocks to find more details on the impact of the crippled memory access of Ripper, but the chance of a reviewer getting access to the kit required for that is probably next to zero.
There’s one guy that I know of who can... and I’m expecting a review soon.
To be fair, most 'normal' PC users struggle to use 6-8 cores, let alone 12+. Epyc and Threadripper aren't really meant for the likes of me and, I'm guessing, you.
For those who can put lots of cores to good use, though, the more and the cheaper the better. Some software makes good use of them, and with any luck those advantages will slowly trickle down to us mere mortals eventually.
There have been loads of times in the past when computer 'experts' believed X was more than enough, only for software to far exceed X a few years later.
I'm not ruling out another core war further down the line (let's just say in a decade), but currently the list of software capable of taking advantage of the cores on offer is getting shorter and shorter each time they add more cores, which means they have to bring something else in between, no?
On page 2, under Test System, did you mean 4 x 8GB instead of 2 x 8GB?
I did indeed, fixed!
To some extent yes, but what we saw today wasn't just poor software scaling; it's more to do with the 2990WX's inner workings and limitations. That's why the Core i9-7980XE and 2950X were quicker in some content creation tests. The fact that two dies/four CCXs/16 cores in the 2990WX don't have direct access to the PCIe/memory bus means there's added latency. As AnandTech mentions, this isn't too much of an issue in software that isn't particularly memory-reliant, but where it is (content creation, games etc), you'll start bottlenecking and losing performance, even performing worse than the standard 16-core. Sure, there's not perfect scaling even with far fewer cores: the 18-core 7980XE, for example, isn't 80% quicker in HandBrake than the 7900X despite having 80% more cores. However, the way it's designed still means it's a beast, the fastest HEDT CPU in a lot of tests and extremely quick in all of them.
While the inner workings of quad-module Ripper certainly have a large impact, we see scaling limitations beyond 16 cores even when comparing 16-core Epyc against 32-core Epyc in the absolute best case scenario (far away from consumer software):
And that 7351P 16-core Epyc even has a 300 MHz boost clock deficit.
I simply see no reason to believe that that problem won't get more pronounced when they double the core count next year to 64 and where is the path beyond that? Surely it can't be even more cores (at least for a couple years).
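The diminishing returns described above can be illustrated with Amdahl's law. This is only a rough sketch: the 95% parallel fraction is an assumed figure picked for illustration, not a measurement from any review, and the function name is made up for this example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over one core under Amdahl's law.

    parallel_fraction is the share of the work that can be spread
    across cores (an assumed figure here, not a measured one).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload that is 95% parallel.
    for cores in (16, 32, 64):
        print(f"{cores} cores: {amdahl_speedup(0.95, cores):.2f}x")
```

Under that assumption each doubling of the core count buys less than the previous one, and this is before any memory or interconnect limits are taken into account.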
The issue shouldn’t get worse, in theory, anyway. It will still be four CCXs, but it will of course present more scaling issues; as we all know, it doesn’t scale 100% and it never will.
As Combatus says, half the cores in the 64 core chip have to go via another die to get any memory access, and that makes them slow in any operation that involves accessing main memory. That's a problem with the interconnect method that AMD are using. Remember that memory performance is very important for max single-threaded performance; it's basically what they did going from Ryzen 1 to 2 (improve memory performance) and it had a big impact.
That's not because of the interconnect though, it's because the TR4 socket doesn't support eight-channel RAM and 128 PCIe lanes.
Yes, two of the dies have to go via another die to access main memory (or any external I/O), and that makes them slower in operations that involve I/O, but that's not because of the interconnect, as Epyc doesn't have that limitation and it uses exactly the same interconnect.
The reason two dies don't have the ability to communicate directly with external I/O is literally because the TR4 socket doesn't have the electrical connections available to be able to do that.
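As a toy illustration of the point being argued here, the effect of some cores needing an extra hop to reach memory can be sketched as a simple average. All latency figures below are made-up placeholders, not measurements of any Threadripper or Epyc part, and the function name is hypothetical.

```python
def average_latency_ns(local_ns: float, extra_hop_ns: float,
                       remote_core_fraction: float) -> float:
    """Average memory latency across all cores in a toy model.

    local_ns: latency for a core on a die with its own memory channels.
    extra_hop_ns: penalty for a core that must route via another die.
    remote_core_fraction: share of cores without direct memory access.
    """
    if not 0.0 <= remote_core_fraction <= 1.0:
        raise ValueError("remote_core_fraction must be between 0 and 1")
    return local_ns + remote_core_fraction * extra_hop_ns


if __name__ == "__main__":
    # Placeholder numbers purely for illustration.
    local, hop = 100.0, 50.0
    # 2990WX-style layout: 16 of 32 cores sit on dies with no direct memory access.
    print("half the cores remote:", average_latency_ns(local, hop, 0.5))
    # Layout where every die has its own memory channels.
    print("no cores remote:", average_latency_ns(local, hop, 0.0))
```

The model ignores bandwidth contention and which memory a thread actually touches, so it only shows the direction of the effect, not its size.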
So, if I've read it right... the top-end TRs are Epyc after a stroke... One side of the CPU can communicate directly with the system, the other side can't [and has to communicate via the side that can], whereas with Epyc both sides can communicate directly [hence the extra PCIe lanes and memory channels].
End result being, if not necessarily intentionally, something along the lines of how some [or is it most now] smartphone CPUs work - some fast cores for important stuff, and some slow[er] cores for the less important stuff.
And that is exactly what I was saying. Just because Ripper has other scaling issues on top of that, it doesn't remove the underlying problem of quickly diminishing returns in the core war.
Pretty much. TBH I'm struggling to see the point in any Threadripper with more than 16 cores, as for anything above that you'd probably be better off going with Epyc. I'm not sure if Epyc allows you to disable half or a quarter of the cores though, so maybe a 16+ core Threadripper would be useful for someone whose main priority is workstation stuff but who wants a side order of after-work gaming.
Best analogy ever.
Hopefully the Threadripper 3 will get something closer to a full Epyc socket.