I've just been running the Unigine Heaven benchmark, and twice now it's crashed. Well... I say crashed, but what actually happens is the computer just resets, as if I'd pressed the reset button. Assuming it was just a slight instability with my overclock, I reset the BIOS to default values and ran it again at stock settings. Same thing happens... BANG... reset. Any ideas? Rig specs in my sig. The power supply is a Gigabyte Odin 800W. It also does it with the Unigine Tropics bench, and after some testing I had it do it with HL2 as well, so it's not as if it only happens when something is being stressed really hard. Power problem? The PSU doesn't seem particularly stressed or hot, though. Memtest passes OK, as does Prime 95.
I've never had LinX work in Win7, but IntelBurnTest does... however, it always reports instability after the second test, no matter what I do. Even running at stock settings, it always fails at the same place, so I suspect it's a bit dodgy as well. Stressing again with Prime 95 now at my usual OC settings. I suspect it will be OK though. Will report back later.
Yeah, for some reason LinX/IBT seem to find stuff that Prime doesn't. How do you know it's your CPU and not the gfx card?
I don't... but I thought I'd rule things out one at a time. It's been stressing in Prime 95 for 3 hours now... no problems, so I'll assume it's OK. I've changed nothing since I last stressed it. My temps were 86 degrees when I just got back now... but it was also 27 degrees in this room! .. so that's probably about right. [EDIT] Just ran Heaven again with rig reset to stock... and it reset again when the GPU started to get stressed... GPU temp got to 90 degrees, then it reset. That's not THAT hot though for a 295. Somehow... it feels like a power problem to me... I wish I had a spare PSU. [edit of edit] Re-ran Heaven with the auto fan speed disabled, so it's in constant leaf blower mode... temps never got above 70 degrees... still reset. It's all pointing to a power problem I think... opinions?
Resets under stress for no obvious reason usually point to power problems, yes. Try to borrow a PSU from a friend or something to replace your old one, and test your system again. Don't go out buying a new one based on an assumption. A friend of mine did that some time ago and ended up replacing half of his PC without finding the problem.
Maybe try swapping the PCI-E connectors for ones on another rail? Here's a random one: in the BIOS, change the PCI-E frequency to 101 MHz and see if that goes stable.
I don't know anyone with a decent PSU I could borrow unfortunately. The Server PSU, while decent, is not really powerful enough. I'll try the PCI-E clock bump later... I've got loads of stuff running/need to do at the moment. I will report back later.
Has your BIOS got the option "PCI-e Payload Size"? I had to change mine from the default of 128 all the way to the maximum of 4096 to get mine to boot. Another thought: if your PSU doesn't have a different set of PCI-E power connectors, you could try using the molex -> 8 pin adapters that should have come in the graphics card's box. That will force your PSU to supply power on a different rail. Finally, you could try physically swapping the card to your motherboard's other PCI-E slot. Oh, and everyone's jumping to hardware solutions (which it probably is), but the 295 has an internal SLI bridge, and without software controlling it correctly, obviously it would go pop. I assume a simple driver reinstall has been tried? And what about IRQ conflicts? Maybe the card isn't being allocated enough resources by Windows, but the bug doesn't manifest itself until it's under load and needs them. Have you added any new USB devices to the system recently? Try stripping it of all peripherals except keyboard and mouse, and try again. Then try stripping it of all non-essential PCI cards too, so it has just the GPU, ruling out anything else conflicting with it or causing the problem. None of those sound particularly convincing, but that's what I would do in that situation.
It's already at 4096... no idea why I did that now, but I did it ages ago. I will try a different rail later. I need to strip out my wiring loom to do that, and don't have time this afternoon. Will report back about that one. There are 2 separate PCI-E sockets on the Odin, and so far as I know they use separate rails. I will try that when I remove the loom to switch rails. All drivers/software are up to date. I tried rolling back the drivers to no avail, so I put the latest one back. The only new USB devices are the ones on my G110 keyboard. If it's that, I don't see what I can do, because I'm not sure I can disable them. I need the G110's USB sound device for headphone and mic use, as it's the only practical way to cue audio with my DJ software. If it is that, I'm in trouble. Personally, I can't see it being this... but I will try it later just to be certain. No... you're right. You have to eliminate all possibilities. I think a rail switch may cure it... if not, then my money is on a PSU problem, which is annoying, as 800W should be plenty... and I bought this on a Bit Tech recommendation. It does have a 3yr warranty though, so I can still RMA it. Just to be super certain, I grabbed the latest version of IntelBurnTest, which now works in Win7, and it tested absolutely fine for hours. The CPU/memory/overclock is rock solid... it's definitely GPU/power related.
Oh, you would be amazed. A few builds back, my machine wouldn't boot from day 1. I tried everything I could think of, and after about a week of testing I pinned it down to the DVD drive. Never did find out why or what was wrong with it, but it shows that even the most inconspicuous things can bring a build down. Good luck anyway, man.
Is there an easy way to check for resource conflicts without resorting to unplugging stuff? I've not had to do this since the early Win95 days, to get my AWE32 sound card to work. Showing my age now.
I'm not getting paid for this advert, but here's a direct quote from the Tweak Guides Windows 7 System Guide, which was possibly the best couple of quid I ever spent:

ACPI is the Advanced Configuration and Power Interface standard, and is an important part of the way Windows and drivers communicate with your hardware. In versions of Windows prior to Windows 7 and Vista you could run hardware which didn't support ACPI, or even disable ACPI if you wanted to attempt manual resource allocation. However this is no longer possible as of Windows Vista - Vista and 7 require ACPI for hardware to function. That means that you cannot disable ACPI, and older hardware which is not properly ACPI-Compliant will not run on Windows 7. Only systems based on motherboards whose BIOS is ACPI Compliant and dated 1 January 1999 or newer can be used. If you're running older hardware this means you should update to the latest available BIOS for your motherboard and also ensure that any ACPI options are enabled for Windows to install and run without problems. Windows 7 does not fundamentally change the way resources are handled compared to previous versions of Windows. Since Windows 7 only accepts ACPI-compliant systems, and because most recent hardware supports Plug and Play functionality, resource allocation is handled automatically and quite efficiently and should not be a major issue. However one practical aspect of ACPI is covered below.

Interrupt Requests (IRQs) are the way in which all of your major system devices get the CPU's attention for instructions/interaction as often as necessary. There are usually 16 - 24 main hardware IRQs available in a modern PC, and these are usually assigned to individual components or hardware functions. To view your current IRQ allocation open Device Manager and under the View menu select 'Resources by Type', then expand the 'Interrupt Request (IRQ)' item.
You will see all the devices currently active on your PC, with the IRQ number showing as the number in brackets, e.g. IRQ 0 is shown as (ISA) 0x00000000 (00) System Timer. While you may see IRQs numbered up to 190 or more, all of the IRQ numbers above 24 are for legacy Industry Standard Architecture (ISA) or non-Plug and Play devices, not for your main system hardware, so the key IRQs to examine are those numbered up to 24. For an easier method of viewing IRQs and checking for potential IRQ conflicts, use the built-in System Information tool (see the System Specifications chapter). To access it go to Start>Search Box, type system information and press Enter. Expand the 'Hardware Resources' item in the left pane, and click the IRQs item to see IRQs listed in order from 0 upwards. Click the 'Conflicts/Sharing' item to see a summary of sharing conflicts. Don't panic if you see conflicts, this doesn't mean your system is unstable or configured incorrectly. In many cases some hardware will be sharing a single IRQ or resource and there's not much you can do to prevent or alter this, it is normal behavior. Windows allows several devices to share an IRQ without any major issues, and in general this should be fine. However in cases where two or more high-performance components, such as your graphics card, sound card, or Ethernet controller are sharing a single IRQ, this may be a source of potential problems. High performance hardware is best on its own IRQ, but unfortunately you can't alter the IRQ allocations from within Windows, as they are automatically handled by ACPI. Only legacy devices will have the option to attempt manual alteration of their resources under the Resources tab of the relevant device Properties in Device Manager; most other devices do not allow the 'Use automatic settings' option to be unticked.
The only ways to prevent or minimize the impact of IRQ sharing are:

Disable unused devices - Covered in more detail further below, disabling unused devices in the BIOS and in Device Manager is a way of reducing unnecessary resource usage and speeding up boot time, and also preventing IRQ sharing-related problems. This is best done first in the BIOS prior to installing Windows.

Move conflicting devices - On an existing installation of Windows 7 you can attempt to reduce IRQ sharing by moving a device. Physically move one of the items to another location on your system if possible, such as shifting a sound card from one PCI/PCI-E slot to another, or if a USB Host Controller is sharing with a major device, avoid plugging any USB device into the specific USB hub that controller relates to.

If neither of the shared devices can be physically moved then you will have to accept the situation. Remember that Windows can share IRQs without major problems in most cases. If after the above procedures you still have difficulties or reduced performance which you feel are attributable to IRQ sharing, the final option is to reformat and reinstall Windows 7, making sure of course to first correctly configure your BIOS and disable all unnecessary devices. Even then there is no guarantee that major devices won't wind up being shared again. Unlike previous versions of Windows, you cannot disable ACPI to force manual IRQ allocation, as Windows 7 must have ACPI enabled to work properly.

Haha sorry, bit wall of text'y, but it covers everything.
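For anyone who would rather not eyeball the 'Conflicts/Sharing' list, here is a minimal Python sketch of the same check: group devices by IRQ number and flag any IRQ that more than one device sits on. The device list is made up for illustration and is not read from Windows; `SAMPLE_IRQS`, `shared_irqs` and `shares_irq` are hypothetical names, not part of any tool mentioned in the guide.

```python
from collections import defaultdict

# Hypothetical (IRQ number, device name) pairs, invented for illustration.
# On a real system you would copy these from the System Information IRQ list.
SAMPLE_IRQS = [
    (16, "Graphics card (GPU 1)"),
    (17, "Graphics card (GPU 2)"),
    (18, "USB host controller A"),
    (18, "SATA controller"),
    (19, "USB host controller B"),
    (19, "Ethernet controller"),
]


def shared_irqs(entries):
    """Return {irq: [devices]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    for irq, device in entries:
        by_irq[irq].append(device)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


def shares_irq(entries, keyword):
    """True if any device whose name contains `keyword` is on a shared IRQ."""
    return any(
        keyword.lower() in dev.lower()
        for devs in shared_irqs(entries).values()
        for dev in devs
    )


if __name__ == "__main__":
    for irq, devs in sorted(shared_irqs(SAMPLE_IRQS).items()):
        print(f"IRQ {irq}: {', '.join(devs)}")
    print("GPU on a shared IRQ:", shares_irq(SAMPLE_IRQS, "graphics"))
```

As the guide says, sharing on its own is normal; the only case worth a second look is a high-performance device such as the graphics card landing on a shared IRQ.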
No IRQ conflicts for the GPUs. There are lots of shares, but that's normal, I'm assuming. Each GPU has its own IRQ though. I'll do more work on this later in the week, as I won't have time tomorrow. Thanks for the help so far.
Fixed. It was a power problem. The Odin PSU has 2x hardwired PCI-E connectors as part of the fixed loom. Obviously I used these as why have spare leads hanging around making it harder to have a clean install, right? Well.. no, they're obviously on a shared rail, as this is what was causing my problem. GPUs now powered from the dedicated red and blue PCI-E sockets, and it's absolutely fine. Thanks to everyone who tried to help with this. Appreciated. I can play games again now! Why it SUDDENLY started doing this I have no idea... but it's fixed anyway.
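As a rough illustration of why a shared rail can matter, here is a back-of-the-envelope Python sketch of the current drawn on a 12 V rail. It assumes a card fed by one 6-pin and one 8-pin lead, and uses the PCI Express connector ratings (75 W and 150 W) as a worst case rather than measured draw. The thread never gives the Odin's per-rail limits, so `HYPOTHETICAL_LIMIT_A` is an invented number purely for the example.

```python
# PCI Express power ratings per source, in watts (worst case, not measured).
SIX_PIN_W = 75
EIGHT_PIN_W = 150

RAIL_VOLTAGE = 12.0

# Made-up per-rail current limit in amps. NOT the Odin's real spec,
# which is not stated anywhere in this thread.
HYPOTHETICAL_LIMIT_A = 18.0


def rail_current(connector_watts, volts=RAIL_VOLTAGE):
    """Current in amps drawn by all connectors attached to one rail."""
    return sum(connector_watts) / volts


def rail_ok(connector_watts, limit_amps):
    """True if the combined draw stays within the rail's current limit."""
    return rail_current(connector_watts) <= limit_amps


if __name__ == "__main__":
    shared = [SIX_PIN_W, EIGHT_PIN_W]  # both leads on the same rail
    print("Shared rail:", rail_current(shared), "A, ok =",
          rail_ok(shared, HYPOTHETICAL_LIMIT_A))
    for name, watts in (("6-pin", SIX_PIN_W), ("8-pin", EIGHT_PIN_W)):
        # Each lead on its own rail
        print(f"{name} alone:", rail_current([watts]), "A, ok =",
              rail_ok([watts], HYPOTHETICAL_LIMIT_A))
```

With these assumed numbers, both leads on one rail exceed the limit while each lead on its own rail stays well inside it, which would fit the symptom of a reset only when the GPU is loaded. The real figures depend on the PSU's actual rail layout.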