295GTX Issues

Discussion in 'bit-tech Folding Team' started by Vince_Kinslow, 28 Aug 2009.

  1. Vince_Kinslow

    Vince_Kinslow Folding Millionaire

    Joined:
    25 Apr 2009
    Posts:
    218
    Likes Received:
    2
    Moved my GTX 295 cards into my Phenom quad machine: Windows XP 32-bit, 2 x 1TB Hitachi hard disks, 1 CD/DVD/Blu-ray player/recorder.
    1 XFX 295
    1 EVGA 295, previously running fine in the other machine, even earlier on today. Its 2nd core is EUEing.

    Downloaded the new Nvidia 190.62_desktop_winxp_32bit_international_whql.exe driver, which includes CUDA 2.3. I have not used this before.

    Could it be that the power supply is not man enough? It ran 4 x 8800GT cards previously with no problems. It's an Enermax MODU 82+ 625 Watt. Perhaps I should move the Corsair HX1000W into the Phenom machine, as that's only running 2 8800GTs.

    Is there another program like RivaTuner, as it does not like the new Nvidia drivers, which are not supported? Also, some form of monitoring tool?

    [19:16:08] - Digital signature verified
    [19:16:08]
    [19:16:08] Project: 5787 (Run 11, Clone 102, Gen 18)
    [19:16:08]
    [19:16:08] Assembly optimizations on if available.
    [19:16:08] Entering M.D.
    [19:16:15] Working on Protein
    [19:16:16] Client config found, loading data.
    [19:16:16] Starting GUI Server
    [19:17:19] Completed 1%
    [19:17:19] mdrun_gpu returned
    [19:17:19] NANs detected on GPU
    [19:17:19]
    [19:17:19] Folding@home Core Shutdown: UNSTABLE_MACHINE
    [19:17:22] CoreStatus = 7A (122)
    [19:17:22] Sending work to server
    [19:17:22] Project: 5787 (Run 11, Clone 102, Gen 18)
    [19:17:22] - Read packet limit of 540015616... Set to 524286976.
    [19:17:22] - Error: Could not get length of results file work/wuresults_04.dat
    [19:17:22] - Error: Could not read unit 04 file. Removing from queue.
    [19:17:22] - Preparing to get new work unit...
    [19:17:22] + Attempting to get work packet
    [19:17:22] - Connecting to assignment server
    [19:17:23] - Successful: assigned to (171.64.65.106).
    [19:17:23] + News From Folding@Home: Welcome to Folding@Home
    [19:17:23] Loaded queue successfully.
    [19:17:25] + Closed connections
    [19:17:30]
    [19:17:30] + Processing work unit
    [19:17:30] Core required: FahCore_11.exe
    [19:17:30] Core found.
    [19:17:30] Working on queue slot 05 [August 28 19:17:30 UTC]
    [19:17:30] + Working ...
    [19:17:30]
    [19:17:30] *------------------------------*
    [19:17:30] Folding@Home GPU Core - Beta
    [19:17:30] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
    [19:17:30]
    [19:17:30] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
    [19:17:30] Build host: amoeba
    [19:17:30] Board Type: Nvidia
    [19:17:30] Core :
    [19:17:30] Preparing to commence simulation
    [19:17:30] - Looking at optimizations...
    [19:17:30] - Created dyn
    [19:17:30] - Files status OK
    [19:17:30] - Expanded 67237 -> 350744 (decompressed 521.6 percent)
    [19:17:30] Called DecompressByteArray: compressed_data_size=67237 data_size=350744, decompressed_data_size=350744 diff=0
    [19:17:30] - Digital signature verified
    [19:17:30]
    [19:17:30] Project: 5787 (Run 11, Clone 102, Gen 18)
    [19:17:30]
    [19:17:30] Assembly optimizations on if available.
    [19:17:30] Entering M.D.
    [19:17:36] Working on Protein
    [19:17:36] mdrun_gpu returned
    [19:17:36] Self-test failure
    [19:17:36]
    [19:17:36] Folding@home Core Shutdown: UNSTABLE_MACHINE
    [19:17:40] CoreStatus = 7A (122)
    [19:17:40] Sending work to server
    [19:17:40] Project: 5787 (Run 11, Clone 102, Gen 18)
    [19:17:40] - Read packet limit of 540015616... Set to 524286976.
    [19:17:40] - Error: Could not get length of results file work/wuresults_05.dat
    [19:17:40] - Error: Could not read unit 05 file. Removing from queue.
    [19:17:40] EUE limit exceeded. Pausing 24 hours.

    Folding@Home Client Shutdown.
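For anyone wanting to pull the failures out of a long client log like the one above, here is a minimal sketch in Python. It assumes the log keeps the `[HH:MM:SS] message` layout shown in this post; the function name `find_core_shutdowns` and the record fields are made up for illustration and are not part of the Folding@home client.

```python
import re
from typing import Dict, List

# Matches "[19:17:19] some message" (the message may be empty).
LINE_RE = re.compile(r"^\s*\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def find_core_shutdowns(log_text: str) -> List[Dict[str, str]]:
    """Return one record per 'Core Shutdown' line (hypothetical helper)."""
    events = []
    project = ""
    last_message = ""
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # not a timestamped log line
        timestamp, message = match.group(1), match.group(2).strip()
        if not message:
            continue  # timestamp-only spacer line
        proj = PROJECT_RE.search(message)
        if proj:
            project = "P{} (R{}, C{}, G{})".format(*proj.groups())
        shutdown = SHUTDOWN_RE.search(message)
        if shutdown:
            events.append({
                "time": timestamp,
                "status": shutdown.group(1),
                "project": project,
                # The last non-empty message before the shutdown line.
                "reason": last_message,
            })
        last_message = message
    return events


# Lines copied from the log in this post.
SAMPLE = """\
[19:16:08] Project: 5787 (Run 11, Clone 102, Gen 18)
[19:17:19] Completed 1%
[19:17:19] mdrun_gpu returned
[19:17:19] NANs detected on GPU
[19:17:19]
[19:17:19] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:17:30] Project: 5787 (Run 11, Clone 102, Gen 18)
[19:17:36] mdrun_gpu returned
[19:17:36] Self-test failure
[19:17:36]
[19:17:36] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

for event in find_core_shutdowns(SAMPLE):
    print(event["time"], event["status"], event["project"], "-", event["reason"])
```

Run against one log per GPU client, this makes it easier to see which client is failing and whether the reported reason changes between attempts.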


    Any help or suggestions gratefully received, as ever.
     
    Last edited: 29 Aug 2009
  2. Norfolk'N'Good

    Norfolk'N'Good Folding Chimp

    Joined:
    23 Apr 2009
    Posts:
    278
    Likes Received:
    4
    Hi Vince, just to let you know I have had exactly the same issues on an 8800GT. Your log file reads the same as mine, with three or four unstable machine errors in quick succession and then the 24-hour delay.
    I posted a while back when I thought it was dying, but then it was stable for a week and a bit folding under the CustomPC name, and now it won't even get through 1% of a work unit without the log file reading like yours.

    Strange one, because it's a fairly new card, three months old or so.

    What is the average lifetime for a folding card, I wonder?
     
  3. phoenicis

    phoenicis Retired Chimp

    Joined:
    26 Apr 2009
    Posts:
    493
    Likes Received:
    9
    Hi Vince,

    Not sure if it's the cause of your problems, but from what I've read there seems to be an issue with the 190.62 drivers and GTX295s. Best to stick with 190.38.

    The PSU could be an issue with respect to load balancing so switching to the Corsair would be a good idea if for no other reason than to rule it out.

    I use EVGA precision for all my GPU monitoring and OCing in Windows.

    Hope this helps.

    Bob

    Edit: I assume you're running with SLI disabled and the -forcegpu nvidia_g80 flag?
     
    Last edited: 28 Aug 2009
  4. Vince_Kinslow

    Vince_Kinslow Folding Millionaire

    Joined:
    25 Apr 2009
    Posts:
    218
    Likes Received:
    2
    :wallbash: OK, uninstalled the EVGA program as Riva has picked up the Nvidia drivers now.

    The problem with the 2nd core of the EVGA card seems to have resolved itself and it is folding OK now.

    But core one of the XFX card now has the same problem that the 2nd core of the EVGA exhibited.

    Will swap the PSU tomorrow and see how things go.
    Just downloaded the Nvidia 190.38 driver. I will install it if the PSU does not solve the issue. Will keep you posted :wallbash:
     
  5. Vince_Kinslow

    Vince_Kinslow Folding Millionaire

    Joined:
    25 Apr 2009
    Posts:
    218
    Likes Received:
    2
    OK, I have changed over the PSU and put the "-forcegpu nvidia_g80" flag in the advanced tab of the Folding@Home control panel. I am using the CUDA cudadriver_2.2_winxp_32_185.85_general.exe driver, with multi-GPU disabled, PhysX enabled and the desktop extended on all monitors shown, but the XFX GPU1 is still EUEing. Any ideas? I did try 190.38 but had issues; not sure if I extended the desktop, to be honest. I have tried and spent so much time on this today that it's doing my head in.
    Going to go and dig some trees out of the garden to let off some steam :worried:
     
  6. phoenicis

    phoenicis Retired Chimp

    Joined:
    26 Apr 2009
    Posts:
    493
    Likes Received:
    9
    Sorry Vince, I thought that you were swapping 285s with a 295, from the 1st line of the OP, and I've just clicked that you may have been running 2 x 295s on a 625W PSU. If I'm now understanding you correctly, these plus a quad core under load can pull up to 650W from the wall, and so that was pretty darn ambitious. I'm not sure that it won't have caused some harm.
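To put rough numbers on that, here is a back-of-envelope sketch in Python. The 650W wall figure and the 625W rating are from this thread; the efficiency value is purely an assumption for illustration (the real figure varies with load), and both function names are made up. Wall draw includes conversion losses, so the DC load on the PSU is lower than the wall reading, and the total rating says nothing about how the load is spread across the 12V rails.

```python
def dc_load_watts(wall_watts: float, efficiency: float) -> float:
    """Estimate the DC load on the PSU from a wall-socket reading."""
    return wall_watts * efficiency


def headroom_watts(psu_rated_watts: float, wall_watts: float,
                   efficiency: float) -> float:
    """Rated output minus estimated DC load; negative means overloaded."""
    return psu_rated_watts - dc_load_watts(wall_watts, efficiency)


WALL_WATTS = 650.0          # figure quoted in this thread
PSU_RATED_WATTS = 625.0     # rating of the original PSU
ASSUMED_EFFICIENCY = 0.82   # assumption, not a measured value

load = dc_load_watts(WALL_WATTS, ASSUMED_EFFICIENCY)
spare = headroom_watts(PSU_RATED_WATTS, WALL_WATTS, ASSUMED_EFFICIENCY)
print(f"Estimated DC load: {load:.0f} W, headroom: {spare:.0f} W")
```

Even where the estimate leaves some headroom on paper, running that close to the rating around the clock leaves little margin, and swapping in the larger PSU remains the simplest way to rule it out.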

    I've used the 190.38 drivers under XP64 but not 185.85. I'm not sure how much difference it makes, but I normally turn PhysX off and add the -forcegpu nvidia_g80 flag to all of the GPU clients, including the one attached to the display.

    How much memory is left under XP32 with the 2 x 295s installed? Are you running SMP clients at the same time?

    Sorry I'm not being much help mate. I've become a little rusty with XP since going over to the dark side.
     
  7. Vince_Kinslow

    Vince_Kinslow Folding Millionaire

    Joined:
    25 Apr 2009
    Posts:
    218
    Likes Received:
    2
    I am not running any CPU clients. The PSU is now the Corsair HX1000W. The EVGA 295 I have had a while and it has never had any issues to date, but it was the card that originally lost a GPU core to EUEs, not the XFX, which is new: I part-exchanged an RMA'd 8800GT towards it and only received it on Thursday. PhysX is now turned off, although I have never had an issue with it being on in the past. Folding@home GPU client V6.23 x 4 with the -forcegpu nvidia_g80 flag.
    Here is the full specification

    Computer:
    Computer Type ACPI Multiprocessor PC
    Operating System Microsoft Windows XP Professional
    OS Service Pack Service Pack 3
    Internet Explorer 8.0.6001.18702
    DirectX 4.09.00.0904 (DirectX 9.0c)
    Computer Name PHENOM_QUAD (Vince' machine)
    User Name Vince
    Logon Domain PHENOM_QUAD
    Date / Time 2009-08-29 / 16:19

    Motherboard:
    CPU Type QuadCore AMD Phenom 9350e, 2000 MHz (10 x 200)
    Motherboard Name Asus M3A32-MVP Deluxe (2 PCI, 4 PCI-E x16, 4 DDR2 DIMM, Audio, Gigabit LAN, IEEE-1394)
    Motherboard Chipset AMD 790FX, AMD K10
    System Memory 2304 MB (DDR2-800 DDR2 SDRAM)
    DIMM1: Corsair Dominator CM2X2048-8500C5D 2 GB DDR2-800 DDR2 SDRAM (5-5-5-18 @ 400 MHz) (4-4-4-13 @ 270 MHz)
    DIMM2: Corsair Dominator CM2X2048-8500C5D 2 GB DDR2-800 DDR2 SDRAM (5-5-5-18 @ 400 MHz) (4-4-4-13 @ 270 MHz)
    BIOS Type AMI (11/21/08)
    Communication Port Communications Port (COM1)

    Display:
    Video Adapter NVIDIA GeForce GTX 295 (896 MB)
    Video Adapter NVIDIA GeForce GTX 295 (896 MB)
    Video Adapter NVIDIA GeForce GTX 295 (896 MB)
    Video Adapter NVIDIA GeForce GTX 295 (896 MB)
    3D Accelerator nVIDIA GT200-400

    Windows My Computer properties shows 2.25GB RAM
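As a rough check on the memory figures above, here is a small Python sketch. The installed and visible amounts come from the spec listing; the 4096 MB ceiling is the usual 32-bit address space limit, and attributing the whole gap to address space reserved for devices such as the graphics cards is an assumption rather than something measured here. The function name is made up for illustration.

```python
def address_space_summary(installed_mb: int, visible_mb: int,
                          address_space_mb: int = 4096) -> dict:
    """Compare installed RAM with what a 32-bit OS reports (hypothetical helper)."""
    return {
        "installed_mb": installed_mb,
        "visible_mb": visible_mb,
        "visible_gb": visible_mb / 1024,
        # RAM that is fitted but not reported by the OS.
        "not_visible_mb": installed_mb - visible_mb,
        # Address space left over for everything that is not RAM.
        "non_ram_address_space_mb": address_space_mb - visible_mb,
    }


# 2 x 2048 MB DIMMs installed, 2304 MB reported as system memory.
summary = address_space_summary(installed_mb=2 * 2048, visible_mb=2304)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With these inputs the visible figure works out to the 2.25GB that My Computer reports, and the remaining part of the 4 GB address space is what the OS cannot use for RAM.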
     
  8. phoenicis

    phoenicis Retired Chimp

    Joined:
    26 Apr 2009
    Posts:
    493
    Likes Received:
    9
    Have you tried reseating or swapping the cards over? Does it exhibit the same behaviour if you place it back in the original PC?

    I can't fathom what else it could be other than (gulp) a faulty GPU. I almost avoided 295s completely after two of the first three I bought had one of their GPUs go south within the first few days. Subsequent ones have behaved just fine but the first week or so seems to be make or break for these cards.
     
  9. Vince_Kinslow

    Vince_Kinslow Folding Millionaire

    Joined:
    25 Apr 2009
    Posts:
    218
    Likes Received:
    2
    I will try swapping them over tomorrow. I spent too much time playing around today, much to the good lady's disgust, as I was supposed to help her out in the garden :naughty: Still, I did get a couple of hours out there, did some shopping and went down the dump, so I should be quids in again :nono:
     
  10. JackOfAll

    JackOfAll What's a Dremel?

    Joined:
    23 Apr 2009
    Posts:
    671
    Likes Received:
    6
    I would have agreed with you about the first-week infant mortality failures before last night. But it looks like I lost one of my single-PCB GTX295s. I had to reboot this morning. On checking the logs, I got a bunch of messages about disabling the interrupts on the "bad" card. It stopped folding on both GPUs, while the other cards continued for 20 mins before the machine hard locked. The "bad" card is no longer initialised by the BIOS, so it can't be seen by the OS or by nvflash from a DOS boot. Ho hum.
     
  11. phoenicis

    phoenicis Retired Chimp

    Joined:
    26 Apr 2009
    Posts:
    493
    Likes Received:
    9
    Sorry to hear the bad news Clive.

    We do give them a fair amount of punishment, and I suspect that the reliability of GTX295s just isn't good enough for our purposes (i.e. OCed folding). Other than a DOA and the infamous PNY 8800s, the 295s are the only cards that have failed on me.
     
  12. Unicorn

    Unicorn Uniform November India

    Joined:
    25 Jul 2006
    Posts:
    12,726
    Likes Received:
    456
    Sorry to hear they didn't last long, Clive. It seems that my holding off on the 295s might have paid off after all. Does anyone know what the £110 GTX 260s at Scan are capable of PPD-wise? I was thinking of getting a couple for one of my dual-card rigs before the offer ends.
     
