Memory memtest addresses puzzle

Discussion in 'Tech Support' started by Stephen Brooks, 6 Apr 2010.

  Stephen Brooks

    Stephen Brooks New Member

    24 Apr 2005
    I'm trying to get an old Tyan server working: 8 sockets of Opteron 875, with 2GB of RAM per CPU as a single stick, in the first of the four slots available to each CPU.

    It's had some crashing problems, and Memtest86 has turned up a bunch of errors. We've got it error-free by removing the top sub-board (so now it's just a 4-socket machine) and then half the RAM on the remaining layer, leaving 4 CPUs but only 2x 2GB PC3200 DDR (ECC, registered) modules.

    Now we're trying various RAM modules as a third stick in the slots of the remaining two CPUs (still using only the first slot in each CPU's bank of four). Usually memtest finds only one error, at an address like 13FFDxxxx or 17FFxxxxx, i.e. always very near the end of a 1GB block.
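
As a rough check of the "near the end of a 1GB block" observation, here is a small Python sketch of the arithmetic. It assumes the addresses are hexadecimal physical addresses and that "1GB" means 2^30 bytes. The helper name `gap_range` is made up for illustration, and the unknown `x` digits are bracketed by filling them with 0 and with F.

```python
GIB = 1 << 30  # assumption: "1GB block" means 2**30 bytes


def gap_range(pattern: str) -> tuple[int, int]:
    """Return (smallest, largest) distance in bytes from the address
    up to the next 1 GiB boundary, given a hex pattern with unknown
    digits written as 'x'."""
    lowest = int(pattern.replace("x", "0"), 16)   # unknown digits all 0
    highest = int(pattern.replace("x", "F"), 16)  # unknown digits all F
    # The highest possible address is the one closest to the boundary.
    return GIB - (highest % GIB), GIB - (lowest % GIB)


for pattern in ("13FFDxxxx", "17FFxxxxx"):
    block = int(pattern.replace("x", "0"), 16) // GIB
    nearest, farthest = gap_range(pattern)
    print(f"{pattern}: in the {block}-{block + 1} GiB block, "
          f"{nearest} to {farthest} bytes below its top")
```

Under those assumptions, both example addresses fall within the last 1 MiB of their 1 GiB block, in the 4-5 GiB and 5-6 GiB ranges respectively.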

    My question is: is this likely to be a fault with the RAM, or does the non-randomness of these addresses suggest it's a fault in the memory controller on one of the CPUs?

    [Details: One RAM stick produced an error when tried with each of the two remaining CPUs, as did a second module in one of those positions. We're still testing other combinations and might try yet more RAM later.]

    Thanks for any help or suggestions you might have! (Don't say "buy a new server"; I know the new CPUs are faster, but this is what we've got and there's a freeze on IT expenditure.)

