bit-tech.net

Old 8th Feb 2012, 13:39   #1
brumgrunt
New GPGPU approach promises 20 per cent performance boost

In a paper co-authored by AMD...

http://www.bit-tech.net/news/hardwar...rmance-boost/1
Old 8th Feb 2012, 13:50   #2
Hustler
What's this? ...more promises of jam tomorrow from AMD hardware?

...sigh
Old 8th Feb 2012, 16:10   #3
debs3759
I like that AMD are looking into newer and better ways of doing things. Fully integrating the CPU and GPU will, IMO, be a very good thing. We might even see AMD take back the lead in high-end computing in a few short years. (Remember AMD64? You should; it's the basis for the instruction set in every 64-bit x86 CPU we see today!)
Old 8th Feb 2012, 17:00   #4
borandi
You come across as very sceptical in this news story, which is understandable. But any university that wants to work on stuff like this has to have a good partnership with some part of the manufacturer - e.g. NVIDIA have a lot of academic people around the world who act as reps for their institution, promoting GPGPU and exposing the members of their institutions to as much stuff as possible. As a result, those who collaborate with the academics on a significant level (in this case, Mike Mantor) are put in as co-authors even if they didn't directly do any research themselves; they contributed a lot to the outcome.
Old 8th Feb 2012, 17:12   #5
tonyd223
but will it run Cry...
Old 8th Feb 2012, 17:12   #6
DbD
That 20% is a picked-out-of-the-air figure, and it seems very small if they actually want to make use of it.

There's a load of overhead required to set up a bit of code to run on the GPU rather than the CPU. Equally, only some bits of code are faster - if the lump of code is too small, the overhead of setting it up for the GPU outweighs the benefit it gives. Hence that 20% figure is based on a guess for overhead and a guess for code that would work well on a GPU, all of it done without real silicon - just a simulation of some *future* silicon. Given that the research is set up to show this is possible (this is the result AMD will have wanted), 20% is a very low number to have come out with.

Then there are the other things - e.g. power usage. Done on a traditional CPU it might have been 20% slower, but it only required the CPU, whereas your 20% performance gain now requires all the extra complexity and power draw of a GPU. Would it have been more efficient, power-wise, to just have a more power-hungry CPU that ran 20% faster?

I think this sort of compute tends to work really well when you can allocate big lumps of code, or full jobs, to some specialised hardware (e.g. video decode), but for working on standard code in combination with the CPU it doesn't look so effective.
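To make the break-even point concrete, here's a minimal sketch of that overhead arithmetic. Every constant in it is an illustrative guess (launch cost, PCIe bandwidth, per-element work), not a figure from the paper:

Code:
/* Rough sketch of the break-even arithmetic for offloading work to a
 * discrete GPU; all constants are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    const double launch_s   = 20e-6;  /* assumed kernel launch / setup cost (s) */
    const double pcie_bps   = 6e9;    /* assumed PCIe transfer bandwidth (B/s)  */
    const double cpu_rate   = 1e8;    /* assumed CPU throughput (elements/s)    */
    const double gpu_rate   = 1e9;    /* assumed GPU throughput (elements/s)    */
    const double bytes_elem = 8.0;    /* data moved per element, both ways (B)  */

    for (long n = 1000; n <= 100000000L; n *= 10) {
        double t_cpu = n / cpu_rate;
        double t_gpu = launch_s + (n * bytes_elem) / pcie_bps + n / gpu_rate;
        printf("%9ld elements: CPU %9.3f ms, GPU %9.3f ms -> %s wins\n",
               n, t_cpu * 1e3, t_gpu * 1e3, t_cpu < t_gpu ? "CPU" : "GPU");
    }
    return 0;
}

With these made-up numbers the crossover sits in the low thousands of elements; make the per-element work lighter or the transfer heavier and the GPU never wins at all, which is exactly why a headline figure like 20% depends so heavily on what you assume about overheads.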
Old 8th Feb 2012, 18:03   #7
azazel1024
On what kind of workload? We already know that there are plenty of workloads that are single-threaded only, or at most a bare 2-6 threads. The only time a GPU MAY help is when the work you are doing has more threads than there are CPU cores (or, with hyperthreading, virtual cores too) to handle it. Even then, with the rather low frequency and low IPC of each individual "stream processor" in a GPU, you have to get to fairly highly threaded calculations to see a performance benefit of GPU over CPU. I have no idea of the numbers, but I'd suspect we are talking at least double the CPU core count in threads, or a lot more.
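As a very rough way of framing that guess (the per-thread throughput ratio below is an illustrative assumption, not a measured figure), the GPU's raw throughput only pulls ahead once

\[ N_{\text{threads}} \cdot r_{\text{GPU thread}} > N_{\text{cores}} \cdot r_{\text{CPU core}} \quad\Longrightarrow\quad N_{\text{threads}} > N_{\text{cores}} \cdot \frac{r_{\text{CPU core}}}{r_{\text{GPU thread}}} \]

so if a CPU core is, say, 8-16 times faster per thread than a single stream processor, a quad-core chip needs something like 32-64 threads' worth of parallel work before the GPU wins on raw throughput - and that's before any transfer overhead is counted.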

Heck, look at H.264 encoding, which can be pretty massively threaded. High-end discrete GPUs only manage something like 50-150% faster encoding than a high-end Intel CPU...and they have to take shortcuts which compromise image quality somewhat. That is with something like 800+ stream processors versus 4-6 real cores.

Sure, there are things that make sense to process on a GPU with all its many cores, but most things in general computing are still best left to the CPU, or at most might gain a little assistance from having the GPU step in to handle some of the processing (for example, rendering a webpage).

However, how much actual performance gain is there from having the CPU and GPU on-die, sharing L3 cache and workloads, as compared to, say, a discrete GPU sharing some of the processing with the CPU? I'd assume at least a small percentage gain due to significantly lower latency and the shared cache...but is it as much of a gain as the extra processing ability of a discrete GPU over an integrated one? After all, the top-of-the-line GPU in today's AMD Llano is not much better than a discrete 6450, basically the bottom of the barrel. Even with shared cache and more main memory bandwidth I doubt the iGPU performance improves that much.

Now compare that with the higher clock rates, much greater memory bandwidth and significantly higher stream processor count of something like a 550 or even a 6670, despite the much, much higher latency.

I see shared CPU/GPU processing on the die as important, but I really suspect the numbers the researchers came up with are specious at best. I doubt they compare shared CPU/GPU computing with a discrete GPU of exactly the same performance, just with the higher lag that comes from being discrete.
Old 8th Feb 2012, 19:01   #8
FelixTech
I think it will be interesting if they can reduce or remove the time spent transferring data between CPU and GPU memories. There are many things that can be done well with GPUs, but where latency is prioritised over throughput the time for a return trip to the GPU is simply too big at the moment.
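A minimal CUDA sketch of the round trip in question - timing a single host-to-device-and-back copy (it assumes a CUDA-capable card and the CUDA runtime; the 1 MiB payload is an arbitrary choice):

Code:
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    const size_t bytes = 1 << 20;            /* 1 MiB payload                */
    float *host = NULL, *dev = NULL;
    cudaMallocHost((void **)&host, bytes);   /* pinned host buffer           */
    cudaMalloc((void **)&dev, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);                  /* time the there-and-back copy */
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("1 MiB round trip: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dev);
    cudaFreeHost(host);
    return 0;
}

The telling number is less the bandwidth than the fixed cost: even a tiny copy pays tens of microseconds of API and bus latency before the GPU can start, which is why latency-sensitive code keeps losing to the CPU even where the GPU wins on raw throughput - and why a fused part with shared memory is interesting.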
Old 8th Feb 2012, 20:01   #9
schmidtbag
AMD and their partners are absolutely right in the sense that, to get the best performance, the CPU and GPU need to work together as a single unit, but as long as x86 is the dominant architecture, AMD can't expect programs to follow in their desired steps. Intel is "happy" with its current CPU+GPU setup because it plans to just keep improving performance without the need to increase clock speed or slap on more cores. Both Sandy Bridge and Ivy Bridge are perfect examples of this.

If Intel followed AMD's lead, I think we'd get a whole other world of computing, in a good way. But as long as Intel doesn't want to follow AMD, the idea won't take off. I feel like if any company wants to do the fused CPU-and-GPU idea, they might as well create an entirely new architecture from scratch, and we all know that isn't going to happen. What AMD wants to do is effectively what SPARC and PPC currently do.
Old 8th Feb 2012, 20:03   #10
velo
Quote:
Originally Posted by DbD
That 20% is a picked-out-of-the-air figure, and it seems very small if they actually want to make use of it.
"Using synthetic benchmarks, Zhou's team was able to show significant performance gains using the CPU-assisted GPU model. On average, benchmarks ran 21.4 per cent faster..."

Doesn't seem particularly air-like to me...
Old 9th Feb 2012, 07:25   #11
fluxtatic
Quote:
Originally Posted by azazel1024
Heck, look at H.264 encoding, which can be pretty massively threaded. High-end discrete GPUs only manage something like 50-150% faster encoding than a high-end Intel CPU...and they have to take shortcuts which compromise image quality somewhat. That is with something like 800+ stream processors versus 4-6 real cores.
If you mean SB, you're a bit off - SB is using dedicated hardware for Quick Sync. Not that your results are wrong, just the reasoning behind them. QS is ridiculously fast, yes, but it isn't directly the result of the magic of the SB arch itself, just that Intel saw fit to cram dedicated hardware onto the die to handle that job, and only that job. Compare it to the previous Core arch and the picture is a bit different.

I'm always a bit suspicious of this type of research, in that it isn't actual silicon they're working on. Rather, it's a model of some future arch that may never come to be. Or they tuned it, not even necessarily intentionally, to crank out results, not taking into account that what they ended up designing won't be practical as an actual CPU.

Quote:
Originally Posted by DbD
That 20% is a picked-out-of-the-air figure, and it seems very small if they actually want to make use of it.
Have you read the paper? I'm actually asking, as I haven't. If they were going to start pulling figures out of...some place, why go so low? Or, shall we put on our tinfoil hats and realize it's a cunning scheme - make it sound good, but not suspiciously good. Otherwise people won't believe it. What have they got to lose, with so many people shitting on them now?

Even here, it's starting to feel as if people want AMD to fail. You think Intel won't get lazy (and even more expensive) with zero competition? Intel, the company so dedicated to the enthusiast community they'll sell you insurance on your processor in case you blow it up. Never mind the fine print that essentially says they can wriggle out of the obligation to replace the hardware, leaving you with no recourse. If they're so dedicated to us, give me the crack dealer model: first one is always free. Kill it, they'll replace it no questions asked. But that's it. Blow up the replacement and you're back on Newegg or Scan like every other sucker. That's dedication to the community...and it isn't like they can't afford it. This market segment is such a tiny portion of their revenue, they could start a "send us a picture of the PC you're building and we'll send you the processor for free" program and you wouldn't even see a dent in their quarterly revenues.

On-topic, though, this is exciting. Rather than piss on their shoes, cheer them on and hope they're right. The last time they had something that was a great leap forward (one that succeeded, that is), the result was all of us using 64-bit processors. They were also the first in x86 with true, native multi-core processors.

Between this and Intel's Haswell announcement (http://arstechnica.com/business/news...el-haswell.ars), this is a big day in hardware news - be happy!
Old 9th Feb 2012, 11:03   #12
Snips
How long ago was AMD64 again? Why is it mentioned every time AMD releases a disappointing processor?

The word for today is "Simulated".

I'm sure every processor performs like a demon "on paper"; it's in manufacturing the idea that AMD falls down.
Old 9th Feb 2012, 11:39   #13
Nexxo
Interesting concept. Wouldn't dismiss it just because AMD is the one experimenting with it. I remember a time when AMD beat the pants off Intel and I've been around long enough to know that whatever happened can happen again.
Old 9th Feb 2012, 13:11   #14
Guinevere
Quote:
Originally Posted by DbD
That 20% is a picked-out-of-the-air figure
Ahhh, peer review is always at its very best when undertaken by someone unqualified and unwilling to read the paper.

Well done sir. Well done I say.
Old 9th Feb 2012, 14:28   #15
azazel1024
Quote:
Originally Posted by fluxtatic
If you mean SB, you're a bit off - SB is using dedicated hardware for Quick Sync. Not that your results are wrong, just the reasoning behind them. [...]
I was referring to x86 encoding of H.264 compared to GPU (580 or 5870) encoding. Quick Sync is faster than GPU H.264 encoding, and it appears to actually deliver on-par or maybe better quality than GPU encoding. x86 CPU encoding delivers by far the best quality, though at speeds roughly half or so those of the faster GPU cards. However, if you look at power use...the overall energy used for encoding might actually be better on a Sandy Bridge processor than on a GPU. If the high-end GPU can do it twice as fast, but uses three times the power...
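To put numbers on that hypothetical (the 2x-faster and 3x-the-power figures are used purely as an illustration): energy is power times time, so

\[ E_{\text{CPU}} = P \cdot t, \qquad E_{\text{GPU}} = 3P \cdot \tfrac{t}{2} = 1.5\,P t \]

i.e. the GPU finishes in half the time but still burns 50 per cent more energy for the same encode.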

Anyway, my point is that, as things stand this second, GPU encoding is nice but it isn't a panacea. As regards the APUs that AMD and Intel seem to be putting together, with both moving further and further down the path of SoCs (Haswell will pretty much be an SoC, with just about everything moved on-die), a GPU is going to be "critical", but not nearly as good as discrete cards. Heck, just look at the die area of NVIDIA and AMD high-end discrete cards right now: roughly 300mm^2. That is about 50% bigger than Sandy Bridge, which already uses a big chunk of its die for the GPU (roughly half? A third?). As process sizes shrink, I think we'll see the GPU portion of the APU/CPU get bigger and bigger; however, it is still likely to be smaller and less powerful than what you'll find in discrete GPUs.

I do think at some point in the next 1-3 CPU generations (maybe by Haswell?) we'll see a complete disappearance of the low-end and maybe even the mid-low GPU markets. Ivy Bridge looks like it may be on par with a 6550, and AMD's GPU in Llano is just about on par with that as well. Haswell sounds like it will probably improve on Ivy Bridge by anywhere from 25-100%, and Trinity is likely to be better than Llano. Integrated GPUs are certainly improving faster than discrete graphics are.

Two keys for integrated GPUs, though, are going to be closer integration with the CPU (Intel's shared L3 cache, AMD deciding to implement a real L3/shared cache) and a larger main-memory pipeline and/or dedicated VRAM slots. However, Intel at least, and to a lesser degree AMD it seems, are moving toward lower-power CPUs/APUs, in part because of portable computing but also because of the server space, and desktop CPUs are mirroring this as well. So the discrete GPU is always going to be much more powerful, so long as you don't mind coughing up the money. Your average mid-range card with a TDP in the 100-150W range is going to be much more powerful than a combined CPU/GPU with a total TDP of 65-130W.

So integrated GPUs can accelerate some things and offer significantly lower latency than a discrete GPU. However, for raw processing power, a discrete GPU is still going to be head and shoulders more powerful. It is really only in situations where low latency is required that the iGPU is going to beat a discrete card, or where there is no discrete card present at all - and the market is quickly moving that way as integrated GPUs become "good enough" for basic users, corporate computing and casual gamers. Heck, at the rate of improvement they are going to be good enough even for heavy gamers who are on a budget or have lower-resolution displays (I'd say give it 3-4 years and iGPUs will be able to handle 1080p at medium/high settings at >30FPS in basically all games, though hopefully by then the >20" monitor crowd will have standardised on something more like >1500p).