While working on RAM and motherboard products in our labs, we have always relied heavily on Memtest86+ and Prime95. Memtest86+ was coded by Samuel Demeulemeester from somewhere near Paris, France. We recently exchanged thoughts about memory problems and the Memtest86+ 2.0 milestone release. It has always been painfully difficult to get good answers from RAM and motherboard companies due to PR bias and marketing BS. Even those of us working in technical labs have a hard time getting good, no-nonsense answers. Only through an independent third-party lab can we identify the weaknesses in products. The interview with Samuel was different, as it is devoid of the above-mentioned stuff-from-the-wazoo. Hope you guys (and gals) can take something away from the interview.

* * * * * * * * * * * * * * *

Take note: Even though you can run Memtest86+ on a Mac at boot time, I discovered that the firmware on the iMac, MacBook, MacBook Pro and MacBook Air is wacky and will cause Memtest86+ to report errors in the first megabyte of RAM (0 to 1MB). That is a false result. You should not worry about it unless the error zone is above 1MB.

* * * * * * * * * * * * * * *

RJL: Can you tell us a bit about yourself and what you do?

SD: I live somewhere near Paris, France. Right now, I’m the Hardware Chief Editor of a new French website called www.canardplus.com. We deal with videogames and highly technical hardware reviews. Some of your readers perhaps remember my old website, x86-secret.com. On the coding side, I’m the developer of Memtest86+ and of the online part of the well-known CPU-Z (Validator, statistics & more).

RJL: Can you briefly describe what Memtest86+ does?

SD: Memtest86+ is used to detect bad memory modules. It reads and writes every bit of your computer’s memory many times and in many different ways in order to detect failures related to the memory sub-system.

RJL: What did you try to solve by porting Memtest86 (written by Chris Brady) to Memtest86+?

SD: Chris was the original developer of the Memtest86 Core.
But, probably for lack of time, the original Memtest86 hadn’t been updated for many years when I decided to launch Memtest86+. As I had access to the latest hardware, I first added support for many new CPUs and chipsets, and then added some core enhancements and bug fixes.

RJL: When did you start working on version 2.0 and how long did it take to reach it?

SD: I started work on v2.0 in October 2007. It took about 3 months to be ready. Beta-testing is the most time-consuming part of the process because Memtest86+ is included in many Linux distributions and must remain as bug-free as possible. But as you know, it’s just impossible to prevent all bugs. That’s why Memtest86+ v2.01 was released two weeks after the launch of v2.00.

RJL: What were the major architecture changes in version 2.0?

SD: Memtest86+ v2.0 now includes rewritten tests that use random patterns instead of fixed patterns. The algorithms used by the Modulo test were improved in order to provide more thorough testing. And the architecture of Memtest86+ (i.e. the way the source code is written) has evolved to be more readable and comprehensible. That was essential for future enhancements.

RJL: Are there any major upcoming changes beyond 2.0 you would like to talk about?

SD: Not yet, but monitoring will probably be extended to another level. For example, we already have a beta version that reports FB-DIMM module temperature and DDR voltages.

RJL: Do you use a large number of modules and motherboards during software testing?

SD: Yes! Beta versions of Memtest86+ are tested on a wide range of hardware. I have about one hundred motherboards sitting here for tests and lots of memory modules from many brands. Some of them are faulty modules specially manufactured to produce an error on a specific bit. Really useful for debugging purposes! I also check many different memory sizes. For example, at this time, I can tell you that Memtest86+ is only able to recognize 56 GB of system memory on a 64 GB server.
Thanks for the support from many big companies!

RJL: What type of errors can Memtest86+ detect?

SD: Memtest86+ is able to detect standard memory errors (when a “1” becomes a “0” or vice versa due to a bad chip), but also parity errors and ECC errors. ECC is used on servers and workstations to auto-correct memory errors. Memtest86+ is able to detect that an error occurred even if it was corrected on the fly. It can also detect errors in the memory sub-system caused by overclocking.

RJL: During my work validating large numbers of memory modules and motherboards, we found that Memtest86+ 1.70 took a long time to detect errors compared to Prime95. Was this due to a limitation of the pattern analysis?

SD: Prime95 uses a workload that is able to detect failures coming from different parts of a computer, not only memory. BTW, if an error occurs with Prime95, you can’t be sure where it comes from. It may come from the link between the CPU and the memory controller (FSB), from the CPU, from the memory itself or from something else. Memtest86+ algorithms try to focus on memory only. Pure memory-related issues should be detected faster with Memtest86+, especially with v2.00.

RJL: Does Memtest86+ 2.0 stress and saturate the memory bus during testing?

SD: Not in every test. For example, Modulo X (test #8) is much more memory-bus intensive than the Moving Inversion tests. The goal is to make a distinction between FSB/chipset failures and pure memory failures. A defective memory module will be detected even if the memory bus is not saturated. Conversely, a highly stressed chipset can generate errors. So, if you only have errors on Modulo X, they may come from the chipset instead of the memory modules.

RJL: Did you see more memory errors as we moved into faster DDR speed grades?

SD: I see more memory errors, but that’s not due to faster DDR speed grades. JEDEC only ratifies a new speed grade when memory makers are able to produce it in high quantities and with optimum stability.
See the final question (#20) for more.

RJL: Based on your experience, what are the symptoms of memory instability?

SD: Many users think that memory instability always leads to catastrophic failures like BSODs or data corruption. But the symptoms are often less noticeable. It may be a crash with the famous “xxxx has encountered an error and must close”, or games that drop back to the desktop without any error message. If something like that happens, check your memory!

RJL: Should users at home test their memory for stability before using a new computer?

SD: Yes! It only takes an hour (or two) to be sure that you will not lose data later.

RJL: How many hours should we let Memtest86+ run before regarding the memory system as stable?

SD: That’s a good question. With a single pass, the confidence level is high, perhaps 95%. If you want to be 99.9% sure, let it run overnight.

RJL: Besides the default startup testing features in Memtest86+ 2.0, are there any other features users can tweak?

SD: If you’re not an expert, the Memory Error Reporting Mode (ERM) is a setting you should check. The default ERM in Memtest86+ 2.0 displays red lines with a lot of technical information such as address, offset and defective bits. You can set the ERM to “Summary” in the configuration menu in order to display much more comprehensible data.

RJL: Are there any plans to take advantage of multi-core CPUs by multi-threading Memtest86+ to speed up the tests? How simple or complex is it?

SD: It’s really, really complex, because Memtest86+ is a low-level tool and you must do everything without the help of an operating system. For example, you must initialize all the CPUs yourself with low-level assembler code. And multi-threaded code will not speed up memory accesses. The only advantage of doing that is on CPUs with an integrated memory controller, like the AMD K8 and K10 and the upcoming Nehalem from Intel. There, you’re able to check the ECC errors generated by each core.
I have alpha code for AMD K8/K10, but it still needs many improvements.

RJL: Is there any possibility of increasing the stress level during testing with Memtest86+?

SD: The stress level is already high, but as usual, you can push your memory beyond spec with lower voltage or more aggressive timings.

RJL: Can users at home use Memtest86+ to test for stability when overclocking their computers?

SD: Yes, because the memory subsystem is the first to generate instabilities on overclocked computers.

RJL: What other software can you recommend to overclockers, instead of relying solely on Memtest86+?

SD: Useful tools are: Prime95, CPU Stress MT, Intel Thermal Analysis Tool, 3DMark (for GPU load) and CPU-Z.

RJL: Go Google them.

RJL: Is there anything I haven't asked that you think is important or worth talking about?

SD: Yes! The practices of mobo makers are something I must talk about. As you may know, mobo makers now include lots of “performance” enhancements in their BIOSes in order to gain 1% or 2% on benchmarks over their competitors. It could be some kind of automatic overclocking or timing adjustments. Now, many BIOSes auto-adjust memory settings far beyond specs, even with the default profile, and without asking the user anything. For example, I found that many new motherboards are simply unstable at default settings with standard memory. As soon as you disable “Performance Mode”, “Memory Boost” or “Turbo Mode” (or whatever they call it), everything works just fine. Asus is perhaps the worst of them, as they dramatically increase the memory voltage in order to maintain relative stability. That’s the best way to reduce the lifetime of memory. It’s a real shame.

* * * * * * * * * * * * * * *
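
For readers curious what a random-pattern memory test looks like in principle, here is a minimal sketch in Python. It is not Memtest86+ code and does not touch real RAM: the `FaultyMemory` class, the `random_pattern_test` function and the stuck-at-0 fault model are all hypothetical, made up for illustration. It only mirrors the ideas Samuel describes: write a pattern, read it back, and report the address and defective bits when a "1" comes back as a "0".

```python
import random

WORD_MASK = 0xFFFFFFFF  # simulate 32-bit memory words


class FaultyMemory:
    """Simulated RAM (hypothetical model): one bit of one word may be stuck at 0."""

    def __init__(self, size, bad_addr=None, bad_bit=0):
        self.words = [0] * size
        self.bad_addr = bad_addr
        self.bad_mask = 1 << bad_bit

    def write(self, addr, value):
        value &= WORD_MASK
        if addr == self.bad_addr:
            value &= ~self.bad_mask  # the defective cell cannot hold a 1
        self.words[addr] = value

    def read(self, addr):
        return self.words[addr]


def random_pattern_test(mem, size, seed=1):
    """Write seeded random words and verify them, then repeat with complements.

    The complement pass ensures every bit is tested with both 0 and 1.
    Returns a sorted list of (address, defective_bits) tuples.
    """
    errors = {}
    for invert in (False, True):
        rng = random.Random(seed)  # same sequence in both passes
        expected = []
        for addr in range(size):
            word = rng.getrandbits(32)
            if invert:
                word ^= WORD_MASK
            expected.append(word)
            mem.write(addr, word)
        for addr in range(size):
            diff = mem.read(addr) ^ expected[addr]  # bits that read back wrong
            if diff:
                errors[addr] = errors.get(addr, 0) | diff
    return sorted(errors.items())


if __name__ == "__main__":
    mem = FaultyMemory(1024, bad_addr=300, bad_bit=5)
    for addr, bits in random_pattern_test(mem, 1024):
        print(f"error at word {addr}, defective bits {bits:#010x}")
```

A real tester works on physical addresses without an operating system, as the interview explains, and uses several different test algorithms. The sketch only shows why writing both a pattern and its complement catches a bit that is stuck at one value.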