Your first EUE... gratz (Early Unit End, btw). Unfortunately this happens sometimes. I don't know the specifics of the "exception thrown" error, but EUEs occur if you have an unstable overclock on your hardware, your hardware is overheating, or there is an error in the setup of your client.
> Run: exception thrown during GuardedRun
This means there has been an error during the work unit. This does happen; some units are (unknowingly) doomed from the start, or it may be a hardware problem.
> Folding@home Core Shutdown: UNSTABLE_MACHINE
This is, rather broadly, the type of error.
> + Results successfully sent
> [19:53:50] Thank you for your contribution to Folding@Home.
This means the work done has been successfully returned to Stanford. You will get the partial points for the work completed. Keep an eye on these errors; if you get several, then you may have a hardware fault: poor cooling, over-ambitious overclocking, or just a rogue card. There is a third-party application called Fahwatch, which will look at your logs and alert you to problems. Worth having for the peace of mind.
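If you'd rather check by hand, here is a rough Python sketch of the idea of scanning a log for core shutdown lines. It is not how Fahwatch works internally (I don't know that); the function names `core_shutdowns` and `count_unstable` are made up for illustration, and the line format is assumed from the log excerpts quoted in this thread.

```python
import re

# Matches e.g. "[15:27:43] Folding@home Core Shutdown: UNSTABLE_MACHINE"
# (format assumed from the log lines quoted in this thread).
SHUTDOWN_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\]\s+Folding@home Core Shutdown:\s+(\w+)"
)


def core_shutdowns(log_text):
    """Return a list of (timestamp, status) tuples, one per core shutdown line."""
    return SHUTDOWN_RE.findall(log_text)


def count_unstable(log_text):
    """Count the shutdowns whose status is UNSTABLE_MACHINE."""
    return sum(
        1 for _, status in core_shutdowns(log_text)
        if status == "UNSTABLE_MACHINE"
    )


# Sample lines copied from the logs posted in this thread.
sample = """\
[15:27:39] Run: exception thrown during GuardedRun
[15:27:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:27:47] CoreStatus = 7A (122)
[19:53:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

print(core_shutdowns(sample))
print(count_unstable(sample))
```

To use it on a real log, read your client's log file into a string and pass it to `count_unstable`.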
AHA!!! I've been looking for someone else with this error for a while.
Code:
[14:04:29] Completed 20%
[14:09:25] Completed 21%
[14:14:21] Completed 22%
[14:19:16] Completed 23%
[14:24:08] Completed 24%
[14:29:13] Completed 25%
[14:34:09] Completed 26%
[14:35:53] + Working...
[14:39:08] Completed 27%
[14:44:06] Completed 28%
[14:49:05] Completed 29%
[14:54:06] Completed 30%
[14:59:01] Completed 31%
[15:03:59] Completed 32%
[15:08:56] Completed 33%
[15:13:51] Completed 34%
[15:18:51] Completed 35%
[15:23:47] Completed 36%
[15:27:39] SEH code: 3221225477
[15:27:39] Run: exception thrown during GuardedRun
[15:27:39] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[15:27:39] Going to send back what have done -- stepsTotalG=8000000
[15:27:39] Work fraction=0.3675 steps=8000000.
[15:27:43] logfile size=173128 infoLength=173128 edr=0 trr=23
[15:27:43] - Writing 173664 bytes of core data to disk...
[15:27:43] Done: 173152 -> 5491 (compressed to 3.1 percent)
[15:27:43] ... Done.
[15:27:43]
[15:27:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:27:47] CoreStatus = 7A (122)
[15:27:47] Sending work to server
[15:27:47] Project: 5911 (Run 5, Clone 89, Gen 1)
I'm getting the same GuardedRun failure. Every other EUE I've seen hasn't contained this... I can only think it's happening to me with the 185 drivers... thing is, I'm getting through some WUs OK but others are EUEing out, so it takes a while to test and be sure.
@Votick what hardware are you using & what drivers?
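For reference, the SEH code in that log is printed in decimal. Converted to hex it is 0xC0000005, which is the Windows access-violation status code, and the "7A (122)" on the CoreStatus line is the same value shown in hex and decimal. A quick Python sketch of the conversion follows; the helper names and the one-entry lookup table are just for illustration, and the code by itself doesn't say whether the driver, the clocks or the WU is to blame.

```python
def as_hex32(code):
    """Format an integer exception code as an unsigned 32-bit hex string."""
    return f"0x{code & 0xFFFFFFFF:08X}"


# Deliberately tiny lookup table (illustrative, not a complete list).
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def describe(code):
    """Return the hex form of the code plus its name, if it is in the table."""
    name = KNOWN_CODES.get(code & 0xFFFFFFFF, "unknown")
    return f"{as_hex32(code)} ({name})"


print(describe(3221225477))  # the SEH code from the log above
print(int("7A", 16))         # the CoreStatus value, hex to decimal
```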
I'm using an 8600GT with the latest drivers from the NVIDIA website. You say it could be cooling? It does hit about 80+ sometimes, so that could be the problem. It throws out so much heat that the case can't really cope, so I've left the lid off for the moment. With the lid on, after a few hours it hits about 75C. It even goes up past 80; 89C is the highest I have seen, using SpeedFan to measure.
I've investigated mine by removing my CPU and gfx overclock. However, I have only seen this problem since I moved to the 185.85 drivers. I have even tried adjusting my RivaTuner auto fan modulation to bring the card down to 72C under load, and even then it still throws these errors. I'm gonna try and backdate the drivers, but it takes a while to test because it doesn't fail on every WU.
I'm coming at this from a Linux viewpoint, having updated the Linux Wine CUDA wrapper with the additional error code enums and new dummy stub methods for CUDA v2.2, which the 185 drivers ship with. Although 2.1 -> 2.2 is supposed to be ABI compatible, and indeed looks to be, I was getting some very strange behaviour with the CUDA SDK examples compiled against v2.1 but using v2.2 at runtime. So much so that, knowing the Stanford code is compiled against an earlier version, I'd advocate against using any 185 version driver, regardless of the platform, until Stanford has validated it. I'm not sure that has happened yet. No doubt someone will correct me if I'm wrong, so I'll add the obligatory YMMV.
Just noticed your sig. That's quite an aggressive overclock you have there on that 260: 747/1533/1250. Maybe not the problem if you've backed it off. But from my experience with the GTX260, I'd be very surprised if that doesn't spit out an EUE on 1 out of every 4 WUs, running with core > 700MHz and shader clock > 1500MHz. Have you bumped the voltage?
You're right, JackOfAll, that is an ambitious overclock to be running F@H on with a 260. Is that air cooled as well? Slack that off a bit, run a few more WUs and see what happens, would be my advice.
Only 'could be' cooling. Every card is different, but I would not expect trouble at 75C, or even 85C. I'm told that most nVidia cards are set up to start throttling back when they reach 105C, so 30 degrees below that could be considered 'safe'. Just keep watch and see if you are getting frequent UNSTABLE_MACHINEs (more than one failure in ten units); if not, then your hardware is unlikely to be the cause.
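To put numbers on that rule of thumb, here is a small Python sketch. The one-in-ten threshold and the 105C throttle point are just the figures mentioned in this post (the throttle figure is hearsay, so treat it as an assumption), and the function names are made up for illustration.

```python
def failure_rate(failed_units, total_units):
    """Fraction of work units that ended in an EUE."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return failed_units / total_units


def hardware_suspect(failed_units, total_units, threshold=0.10):
    """True if failures exceed the rule-of-thumb threshold (one in ten)."""
    return failure_rate(failed_units, total_units) > threshold


def throttle_headroom(temp_c, throttle_c=105):
    """Degrees C between the current temperature and the assumed throttle point."""
    return throttle_c - temp_c


print(hardware_suspect(1, 10))  # exactly one in ten: not above the threshold
print(hardware_suspect(3, 10))  # three in ten: worth investigating
print(throttle_headroom(75))    # margin at 75C
```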
Apologies. To clarify, I'm not folding at those speeds. Those are just the maximum frequencies I've benchmarked my machine at. I have been folding at the stock speeds for my card (640/1363/1150) and have still had these errors. I am considering it to be possibly a hardware failure, however: if I overclock my GTX 260 at all, my machine locks to a coloured screen under heavy load (FurMark), and it can also lock up if there is a spike of activity in games like Crysis. It's hooked up to a 650W BeQuiet! Dark Power Pro, so I'm pretty sure it's not power that's the problem. I am thinking of reverting to the 182.50 drivers and seeing if that cures the problem. I'm also gonna have a look-see if it's a specific WU that my card's tripping up on. Also wondering if it's my CPU playing silly buggers with my folding client, as it isn't *technically* supported by my motherboard; it works fine as long as I don't disable Cool & Quiet.
Edit: just got an nv4_disp bluescreen while folding, so definitely thinking about reverting drivers, although my WU has apparently survived.
OK, understood. I just saw the sig, those numbers, and assumed you were folding with the card overclocked at those speeds. I'd be inclined to go ditch the 185 drivers for the earlier version, and perhaps even drop your XXX 640/1363/1150 down to 'standard 260' 576/1242/999 speeds, see if you get some stability for a WU or two, before moving your clocks back to XXX levels.
I just got another
[19:53:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
on 8600GT #1. Hmmm, it's not heat.
Memtest has come back fine. Hmm, any other ideas? Could it be a dodgy WU? The 700+ point ones seem to be the ones that screw up.
Run the memtest for 2000 goes. This should take a while to run. What's the rest of the kit? Have you tried reinstalling the client?
I ran the memtest and left it going; came back and it had finished and closed down. 0 errors. Rest of the kit is pretty basic: AMD Athlon 3500+ CPU, 1GB RAM. I haven't tried reinstalling yet. It's doing another 700+ point WU now, so I'm going to see if it dies again.