Graphics Nvidia SLI crashing. - Finally Fixed!

Discussion in 'Hardware' started by .//TuNdRa, 15 May 2012.

  1. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    Sounds like a power-related issue to me. See if you can borrow a good PSU for a test.

    Also, if you suspect there's something fishy with the second slot, leave only one card in the machine, in that slot. Run the tests, synthetics and games and see if there are any issues.
     
  2. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    And the end result is: the motherboard is being a little beech. Swapping the GPU into the second PCI-E socket produced the same stuttering and momentary lock-ups that I got when SLI was enabled, so it's down to a motherboard issue. I hunted around in the BIOS and upped the VCC18, HT Link and NB voltages slightly; same issue.

    I really don't want to have to RMA this board again, so a fix that avoids that would be nice. I'll see if there are any other alternatives first (like going back to my previously working F6F BIOS and seeing if that makes any difference) before I go down that route.

    EDIT: Okay, so I just tore down my PC, checking for bent pins in the PCI-E 16x 2 socket and for anything shorting on the back, and replaced the TIM on my northbridge while I was there. (The default TIM was still there, but it was rather like dried chewing gum, so I figured it wasn't doing that good a job.) In doing so, I poked the endmost pin on the PCI-E 16x 2 socket because it looked out of shape, then gave up because it looked no different. Reassembled the machine; it magically works.

    I have no clue what I did, there was nothing shorting it that I could see, it's just suddenly working. I daren't complain or fiddle, just in case it stops working again.

    Second edit: Then I restart, straight afterwards, and it goes straight to pot again: fails to boot, blue-screen on startup, so on and so forth. What I don't understand is why. I run SLI with the exact same Windows install and exact same components, take the machine apart, put it back together, and then it stops working. What the hell?
     
    Last edited: 16 May 2012
  3. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    Can you try running the motherboard out of the case? If so, don't connect any case buttons, just short the power-button pins with a screwdriver to turn it on and off.

    I'm interested to see how/if this problem gets solved, as I'm sitting on a board very similar to yours (990XA-UD3) and intend to go SLI at some point, too.

    And as I've said before, see if you can convince a friend to lend you his power supply for a brief test. This kind of behavior could very well be caused by a faulty power supply. Anything above 600W should do, if it's a good brand and you don't overclock the CPU.
     
  4. dunx

    dunx ITX is where it's at !

    Joined:
    1 Sep 2010
    Posts:
    463
    Likes Received:
    13
    Put your 2nd GPU in the x4 slot and try that...

    dunx
     
  5. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    I don't think you can run SLI on that slot.
     
  6. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    Well I tried SLI without the bridge in place and it just basically told me to jog on and put the bridge in place, refused to let me activate SLI without the bridge.

    I'll give it a spin outside of the case once I've worked out how I'd do that. I'm currently digging around in Windows, because when SLI was working, it hadn't quite installed all the drivers for most of my USB devices. After a reboot the drivers are there, and suddenly it stops working. I'm looking to see if that's the cause of the fault.
     
  7. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    Hmm, there could be an odd IRQ conflict there with your USB controllers, but it's not very likely. Try disabling anything that's not essential, like unneeded USB controllers (all of them if you don't have a USB keyboard and mouse), parallel and serial ports, FireWire, etc. Even sound and LAN for a test...
     
  8. law99

    law99 Custom User Title

    Joined:
    24 Sep 2009
    Posts:
    2,390
    Likes Received:
    63
    Isn't there a setting in the BIOS that can clear all those settings? I forget, because caring about IRQs is before my time. Plus I don't know whether Windows also controls these, as I've been lucky enough to never have a conflict.
     
  9. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    I had a dig through my drivers and disabled the eight (!) high definition audio devices I had in there, considering my Xonar DX and "Audio_Device" are all I need (soundcard and USB webcam/microphone). Then, on a whim, I dug out another set of cables for my power supply, so each GPU has two cables going to it instead of a single cable that ends in two six-pin connectors. My reasoning is that the Silverstone Strider Gold I have is a tiny unit, for a 1000W at least, so there are actually capacitors embedded within the PCI-E cables to help keep the power curve smooth. I figured trying two sets of caps per card instead of one should help it out a little.

    Started up, and it seems to have worked okay. The driver crashed when starting the Heaven benchmark, even though Furmark didn't touch it (?), so I upped GPU voltage from 1137mV to 1150mV and it appears to be stable so far. I've not restarted the machine yet, so it's entirely possible it could all go to hell still.

    Ethernet LAN is disabled in the BIOS, as are onboard audio and the onboard serial controller. USB needs to be enabled because I'm running a crapton of stuff off it (G13, keyboard, mouse, webcam, 360 controller, wireless card). I'll keep testing, but the cards seem to max out at 63 degrees apiece without crashing now, so long as I ensure the air in the room doesn't get too warm. I think, touch wood, it's mostly stable for the second.

    I just can't ever reboot my PC, ever, for fear of it not working. Hrm.

    Edit: So I've now managed to dump it to sleep mode and back up without explosions or failure, which is a good sign; usually my PC will just outright crap bricks if I use sleep mode and something's unstable.
     
    Last edited: 17 May 2012
  10. Deders

    Deders Modder

    Joined:
    14 Nov 2010
    Posts:
    4,053
    Likes Received:
    106
    I had a similar issue a couple of years back, was just after installing a beta driver, restored the old driver and it was fine again.
     
  11. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    I'm using the current 301 beta drivers because they actually let me enable SLI without instantly driver-crash looping the PC. Plus they haven't broken anything yet, so far as I can see, so it looks like I'm okay. I tried the 295, 297 and 301s while this issue was ongoing, and all had the same issue. I can't go to anything much older than those because then I lose AMD SLI on my motherboard.
     
  12. Deders

    Deders Modder

    Joined:
    14 Nov 2010
    Posts:
    4,053
    Likes Received:
    106
    What drivers were you using before all this started?
     
  13. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    So you're not going to shut down your PC ever? That's hardly a good solution to your problem.

    OFFTOPIC: How do you find your GTX550Ti's in SLI (when they work)? Are they worth it? What frequencies do you use (core/ram)?
     
  14. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    I'll shut it down, but I'll be somewhat concerned when I do.

    The GPUs are at stock clocks from KFA2. Of course, these are the "White Edition" LTD OC cards, so "stock clocks" are 1000MHz on the core, 2000MHz on the shaders, and 2300MHz (4600MHz effective) on the memory. Default voltage on the cards is 1.137V; the next step up from that is 1.150V, which it seems to need to be stable in SLI in some scenarios. I've got the BIOS on both flashed for an unlocked voltage limit, but I shan't use it in SLI. One card has memory that'll go to 2500MHz without caring, but the core doesn't go too far, while the other overclocks really damn well on the core, but the memory won't budge. The side effect is that even a 1MHz bump in memory in SLI crashes the card. Not worth it in my opinion.
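
    For anyone checking the numbers above, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and the factor of two between the reported and "effective" memory clock is an assumption taken from the 2300MHz / 4600MHz figures in this post, not from any vendor documentation.

```python
def effective_memory_clock(reported_mhz, multiplier=2):
    """Effective memory clock, assuming the 2x ratio quoted in the post."""
    return reported_mhz * multiplier


def voltage_step_percent(old_v, new_v):
    """Relative size of a voltage bump, as a percentage of the old value."""
    return (new_v - old_v) / old_v * 100


# Figures quoted in the post: 2300MHz memory, 1.137V -> 1.150V on the core
print(effective_memory_clock(2300))
print(round(voltage_step_percent(1.137, 1.150), 2))
```

    So the one-step voltage bump is 13mV, only a little over one percent above the default.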

    The actual performance is a nice bump. The cards only get a few degrees hotter than when I run them separately, and I can near enough max out most games that I play (the exceptions being Crysis and Metro 2033). At 22.5" 1080p I don't see jaggies that much, so I need, at most, 4X AA; that's mostly where the extra GPU horsepower is going, plus upped shadows sometimes. Admittedly, Skyrim can drop it down to single digits in FPS, but I think that's a game-engine issue with the amount of crap magic can use up; most of the time it stays over 60. Performance, overall, is roughly equivalent to a stock 570, although I'm slightly faster than a stock 570 without AA and slower with it. Interesting balance right there.

    Deders: The drivers in use before I tried these were the 296 WHQLs, and they had the same issue. I think it was my Windows install being a little beech. Seems to be okay for the second, though.
     
    Last edited: 17 May 2012
  15. Chicken76

    Chicken76 Minimodder

    Joined:
    10 Nov 2009
    Posts:
    952
    Likes Received:
    32
    Thanks for sharing your experience with them.
    How do you find them noise-wise?
     
  16. kenco_uk

    kenco_uk I unsuccessfully then tried again

    Joined:
    28 Nov 2003
    Posts:
    10,107
    Likes Received:
    682
    I'm assuming you've tried it with the non-essentials unplugged? I.e. just keyboard and mouse.
     
  17. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    The cards get a bit loud, but that's mostly because it's a 60mm fan on each that'll hit 3000+ RPM if needed. Mostly they get to a low whine, sometimes above that, but it depends on whether I up the case fans too.

    Finally worked up the nerve to restart. Bit nerve-wracking: bootup took three times as long as normal, and the entire screen froze up a few times during logon, but it appears to have sorted itself out now.

    I've not put any 3D load on it, however, so I'll give it a quick blast in Furmark and see what happens...

    Edit: Nope. It's all gone to crap. Five GPU crashes and counting; Furmark didn't even start.

    Still getting the random freezing, usually whenever I move the mouse.

    I think I'm going to have to disable SLI before I shut it down and enable it again after startup; if it tries to start up with SLI enabled, it all seems to go to heck.
     
  18. Deders

    Deders Modder

    Joined:
    14 Nov 2010
    Posts:
    4,053
    Likes Received:
    106
    Are they custom KFA2s, like having extra outputs or something? I think you're only supposed to use their drivers in these cases. Since this started to happen, have you tried reverting back to KFA2's drivers?
     
  19. .//TuNdRa

    .//TuNdRa Resident Bulldozer Guru

    Joined:
    12 Feb 2011
    Posts:
    4,046
    Likes Received:
    109
    These are custom PCB cards, but they're stock Nvidia hardware, just bumped up VRM phases and a bitchin' White PCB.

    Anyway, after two restarts and a deadly knife-fight, I've gotten into Windows with SLI disabled for the minute. It's going to be a pain in the ass, but I think I can live like this until I solve the issue with bootup and SLI.
     
  20. Deders

    Deders Modder

    Joined:
    14 Nov 2010
    Posts:
    4,053
    Likes Received:
    106
    One thing I did notice when looking through my IRQs was that when SLI was enabled, both cards shared the same IRQ. Am pretty sure they were different when SLI was disabled.
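
    If you want to spot that kind of sharing quickly, a small Python sketch like the one below would do it. It assumes you have copied the IRQ numbers out by hand (for example from Device Manager); the device names and IRQ values here are entirely made up, and `shared_irqs` is just an illustrative helper, not part of any driver tool.

```python
from collections import defaultdict


def shared_irqs(assignments):
    """Return {irq: [devices]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    for device, irq in assignments.items():
        by_irq[irq].append(device)
    # Keep only the IRQs that more than one device is sitting on
    return {irq: sorted(devs) for irq, devs in by_irq.items() if len(devs) > 1}


# Hypothetical values typed in by hand, not taken from the thread
example = {
    "GPU 1": 24,
    "GPU 2": 24,
    "USB controller": 18,
    "Sound card": 17,
}
print(shared_irqs(example))
```

    Running it once with SLI enabled and once with it disabled would show whether the two cards only end up on the same IRQ in SLI mode. A shared IRQ on its own is not necessarily a fault, so treat this as one data point.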
     
