Discussion in 'Article Discussion' started by bit-tech, 11 Jan 2018 at 10:52.
Given that large companies are not always quite as scrupulously honest as we would like, I hope we are not going to hear in the near future that a researcher, independent of Nvidia, has found something that Nvidia has kept quiet about.
I get the impression Nvidia are playing with words here. Yes, their GPUs probably aren't vulnerable, but IIRC one of the reasons Nvidia GPUs are so frugal in terms of power draw (TDP) is that they offload tasks onto the CPU (the scheduler springs to mind), so while their GPUs probably aren't vulnerable, the tasks the drivers offload to the CPU probably are.
Any device driver (GPU, storage, networking, etc) that happens to talk to the kernel will be vulnerable to Spectre type attacks.
As for offloading work, CPU load is marginally lower with Nvidia's drivers than AMD's, as borne out in testing (or the driver overhead issues with Project Cars' excessive draw calls that were initially and incorrectly attributed to PhysX). For "but software scheduling!" specifically: both AMD and Nvidia have to dispatch all draw calls via the CPU (that's literally how drivers work). For DX11, AMD dispatches in a single thread (to a single command list), while Nvidia dispatches using multiple threads (and to multiple command lists). For DX12, both dispatch using a single thread (DX12 requirement) but to multiple command lists (enforcing explicit multiple-DCL rather than DX11's implicit-DCL-but-only-Nvidia-implemented-it). Actual warp scheduling (scheduling of the commands themselves) is done in hardware for both. Regardless, draw calls need to go via the CPU in both cases, so both are vulnerable to Spectre type attacks while passing through.
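For anyone trying to picture the single-list versus multiple-list point, here's a toy sketch in Python. It is not Direct3D and not either vendor's driver: `record_command_list`, `record_single_threaded`, `record_multi_threaded` and `submit` are made-up names, and the "draw calls" are just strings. All it's meant to show is that recording can be spread over several CPU threads while the same calls still pass through the CPU and reach one submission point.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(list_id, draw_calls):
    """Record draw calls into one command list (CPU-side bookkeeping only)."""
    return [(list_id, call) for call in draw_calls]


def record_single_threaded(draw_calls):
    """One thread records everything into a single command list."""
    return [record_command_list(0, draw_calls)]


def record_multi_threaded(draw_calls, workers):
    """Split the draw calls into chunks and record each chunk on its own thread."""
    chunk = max(1, -(-len(draw_calls) // workers))  # ceiling division, at least 1
    chunks = [draw_calls[i:i + chunk] for i in range(0, len(draw_calls), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the list order is stable.
        return list(pool.map(record_command_list, range(len(chunks)), chunks))


def submit(command_lists):
    """Single submission point: flatten the lists in order for the 'GPU'."""
    return [call for cl in command_lists for (_, call) in cl]


draws = [f"draw_{i}" for i in range(8)]
single = record_single_threaded(draws)
multi = record_multi_threaded(draws, workers=4)
```

In this toy, `submit(single)` and `submit(multi)` hand over exactly the same calls in the same order; only the CPU-side bookkeeping differs.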
Dispatching draw calls is not the same as the scheduler.
Dispatching draw calls is what the CPU does; the scheduler is what organises the queue of, among other things, draw calls. It's what preempts a process in the queue if a higher priority job arrives, and it's what decides how the GPU pipeline is organised.
EDIT: Anandtech linky.
Obviously things have moved on from there, but AFAIK they've stuck with software scheduling since then, so in essence the CPU (via drivers/software) decides which instructions, memory reads, etc. get done when.
The scheduling done on the CPU is dispatch of calls to the warp schedulers (taking one incoming thread and splitting it into multiple lists), but command-level scheduling - and more importantly, all command-level execution - occurs on the GPU. The GPU itself does not perform speculative execution, so it is not vulnerable to Spectre style attacks. The scheduling Nvidia do on the CPU is of the draw calls that are already being dispatched by the CPU, so it is no more or less vulnerable than any other driver handling draw calls on the CPU.
tl;dr threading DCLs on the CPU does not increase the attack surface to Spectre style attacks, as they shuffle data that was already resident rather than executing any of it.
Updated to latest drivers today.
Obviously later on I got my first ever kernel stopped and recovered error with this card. Wasn't even doing anything at the time.
GPUs do more than just process draw calls.
The warp scheduler is only there to interrupt SMM queues when data that's already been dispatched to those queues needs to be interpreted by higher priority data, and Nvidia warns against relying on the warp scheduler for normal operations as it's got a very short queue depth and can lead to stalls in the pipeline.
However, all that's sort of beside the point as it doesn't address the original point I raised: I got the impression Nvidia are playing with words because, while their GPUs may not be susceptible to these vulnerabilities, the fact that they perform many of the tasks that used to be done in hardware in software/drivers means the security flaws in the CPU have a larger effect than if they hadn't.
And my point is that because the CPU has to dispatch jobs to the GPU (regardless of if you are running a game or doing a GPGPU task) the vulnerability is identical for snooping things in-flight on the CPU. AMD, Nvidia, Intel, Matrox (yes, they still exist!), whatever GPU is present; all are vulnerable until Spectre is mitigated, and that has yet to occur (and all signs point to that needing a fundamental change to how CPUs currently operate).
Yeah, but they don't make GPUs; they make graphics cards with AMD GPUs on 'em.
And my point is that because Nvidia do more work on the CPU they're more exposed: most likely the CPU is doing more speculative execution, and more reads/writes to the CPU's local memory, on data before submitting it to the GPU.
And that when Jensen Huang said 'Our GPUs are immune. They’re not affected by these security issues,' (my emphasis) he seems to be playing with words: although Spectre affects all CPUs and any data being handled by the CPU, because of the way Nvidia has offloaded more of the GPU tasks to the CPU they're also exposed to Meltdown, unlike, I would speculate, a GPU that retained much of the scheduler in hardware.
Which is not the way those exploits work: they provide a route to arbitrarily peek at protected memory contents, the memory being peeked at does not need to actually be touched by a syscall at all!
Oh FFS, the only other way he could have phrased that would've been "our GPUs are not affected by security issues" which would be patently false.
They DO still have some of their G400-derived cards on-the-books with active driver support though, which are entirely their own chips.
I didn't say anything about a syscall.
And BTW they don't allow arbitrary peeks at protected memory contents; they target specific addresses in the physical memory that's mapped to a virtual address space location in the kernel's memory pool.
Passive aggressive much? There was I thinking we were having an interesting discussion, but obviously you've been taking what we've been talking about as some sort of personal slight.
I was simply saying that because Nvidia offloads more of the computational tasks normally associated with the GPU onto the CPU via software/drivers, data being handled by Nvidia GPUs is probably more vulnerable than if they hadn't done that.
You sure? According to this, unless I'm reading it wrong, they haven't released an M-series, G-series, or P-series driver since 2013 - the year before they launched the AMD-based C-series.
And the attacker gets to choose that specific address; that's the whole issue.
And I'm trying to explain that's not in any way correct.
They have Windows 10 drivers for e.g. the G550 though for some of them they've only put it in the release notes rather than the OS column. Some of the drivers are as recent as October 2017!
Blimey - I sit corrected!
I've a Matrox MGA in a box somewhere. Lovely little thing.
Sorry, but you're coming across as very confused and contradictory: first you say these vulnerabilities provide a route to arbitrarily (randomly, unspecified, undetermined, based on chance) peek at protected memory, and now you're saying the attacker gets to choose that specific address.
You'll have to forgive me for bowing out as I'm finding it difficult to follow your train of thought.
Like I said, you'll have to forgive me, but your explanations have been anything but clear. Maybe it's just me who thinks you're being a little inconsistent, but at this point I don't have the cognitive stamina to keep track of these twists and turns.
Ahem: "based on random choice or personal whim", i.e. the memory location peeked is up to the personal whim of the malware doing the peeking - not the rules of the system on what it should or should not be able to access.
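To make the "attacker picks the address" point concrete, here's a toy model in Python of the bounds-check-bypass idea behind Spectre variant 1. It's a sketch under heavy assumptions, not an exploit: `ToyCpu`, `victim_read` and `attacker_peek` are invented names, the "cache" is just a Python set, and the real attack infers the warm cache line from access timing rather than reading it directly. The `speculate` flag stands in for a mispredicted branch.

```python
CACHE_LINES = 256  # one probe line per possible byte value


class ToyCpu:
    """Hypothetical model: a flat memory, a bounds-checked array, a 'cache'."""

    def __init__(self, memory, array_len):
        self.memory = memory        # flat "address space": array, then secret
        self.array_len = array_len  # only indices < array_len are legal
        self.cache = set()          # which probe lines are currently "warm"

    def victim_read(self, index, speculate=True):
        """Bounds-checked read; returns None for out-of-bounds indices."""
        in_bounds = index < self.array_len
        if in_bounds or speculate:
            # With a mispredicted branch, the load and the dependent probe
            # access run before the bounds check resolves.
            value = self.memory[index]
            self.cache.add(value % CACHE_LINES)
        # The architectural result of an illegal read is discarded,
        # but the cache footprint left above is not rolled back.
        return self.memory[index] if in_bounds else None


def attacker_peek(cpu, address):
    """Flush, trigger the victim with a chosen address, then find the warm line."""
    cpu.cache.clear()
    cpu.victim_read(address)  # architecturally returns None when out of bounds
    (leaked,) = cpu.cache     # exactly one probe line was touched
    return leaked


ARRAY = b"public!!"
SECRET = b"hunter2"
cpu = ToyCpu(ARRAY + SECRET, len(ARRAY))
recovered = bytes(attacker_peek(cpu, len(ARRAY) + i) for i in range(len(SECRET)))
```

In this toy, `recovered` ends up equal to `SECRET` even though every `victim_read` on those addresses returns `None`: the address is whatever the attacker passes in, not whatever the bounds check would allow.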
Apologies, when I read "arbitrarily" my mind thinks random, not based on particular selection criteria. That's probably down to the weird way my brain works, or doesn't.
Sesquipedalian loquaciousness strikes again!