bit-tech.net

15th Jan 2013, 13:11   #1
Scorpuk
Supermodder
 
Join Date: Jan 2012
Location: North Ayrshire, Scotland
Posts: 502
Folding problem on Opteron

Just noticed today that F@H has started to hang.

I tried restarting it, but it eventually hangs.

I then deleted the work folder and started again, but it hung again. (I've done this twice, and the second time I went for -smp 32.)

Once, when HFM said it was hung, I left it alone for over an hour and it then came back with the TPF jumping from 15m to 57m.


Here is my current log file:

Code:
--- Opening Log file [January 15 08:46:16 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/john/fah
Executable: ./fah6
Arguments: -smp -bigadv -verbosity 9 

[08:46:16] - Ask before connecting: No
[08:46:16] - User name: scorpuk (Team 35947)
[08:46:16] - User ID: 69B84E3D5AC5DB27
[08:46:16] - Machine ID: 1
[08:46:16] 
[08:46:16] Loaded queue successfully.
[Jan46:1615 
[08:46:16] - Autosending finished units... [Jan46:1615 08:46:16 UTC]
[08:46:16] + Processing work unit
[08:46:16] Trying to send all finished work units
[08:46:16] Core required: FahCore_a5.exe
[08:46:16] + No unsent completed units remaining.
[08:46:16] - Autosend completed
[08:46:16] Core found.
[08:46:17] Working on queue slot 06 [January 15 08:46:17 UTC]
[08:46:17] + Working ...
[08:46:17] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 06 -np 64 -checkpoint 15 -verbose -lifeline 2220 -version 634'

[08:46:17] 
[08:46:17] *------------------------------*
[08:46:17] Folding@Home Gromacs SMP Core
[08:46:17] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[08:46:17] 
[08:46:17] Preparing to commence simulation
[08:46:17] - Looking at optimizations...
[08:46:17] - Files status OK
[08:46:22] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
[08:46:22] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
[08:46:22] - Digital signature verified
[08:46:22] 
[08:46:22] Project: 8101 (Run 6, Clone 0, Gen 215)
[08:46:22] 
[08:46:22] Assembly optimizations on if available.
[08:46:22] Entering M.D.
[08:46:28] Using Gromacs checkpoints
[08:46:32] Mapping NT from 64 to 64 
[08:48:56] Resuming from checkpoint
[08:48:58] Verified work/wudata_06.log
[08:48:59] Verified work/wudata_06.trr
[08:48:59] Verified work/wudata_06.xtc
[08:48:59] Verified work/wudata_06.edr
[08:49:01] Completed 116240 out of 250000 steps  (46%)
[09:23:30] ***** Got an Activate signal (2)
[09:23:30] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [January 15 09:23:47 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/john/fah
Executable: ./fah6
Arguments: -smp -bigadv -verbosity 9 

[09:23:47] - Ask before connecting: No
[09:23:47] - User name: scorpuk (Team 35947)
[09:23:47] - User ID: 69B84E3D5AC5DB27
[09:23:47] - Machine ID: 1
[09:23:47] 
[09:23:47] Work directory not found. Creating...
[09:23:47] Loaded queue successfully.
[09:23:47] 
[09:23:47] + Processing work unit
[09:23:47] Core required: FahCore_a5.exe
[09:23:47] - Autosending finished units... [January 15 09:23:47 UTC]
[09:23:47] Core found.
[09:23:47] Trying to send all finished work units
[09:23:47] + No unsent completed units remaining.
[09:23:47] - Autosend completed
[09:23:47] Working on queue slot 06 [January 15 09:23:47 UTC]
[09:23:47] + Working ...
[09:23:47] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 06 -np 64 -checkpoint 15 -verbose -lifeline 4103 -version 634'

[09:23:47] 
[09:23:47] *------------------------------*
[09:23:47] Folding@Home Gromacs SMP Core
[09:23:47] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[09:23:47] 
[09:23:47] Preparing to commence simulation
[09:23:47] - Looking at optimizations...
[09:23:47] - Created dyn
[09:23:47] - Files status OK
[09:23:47] Error: Missing work file=<>
[09:23:47] 
[09:23:47] Folding@home Core Shutdown: MISSING_WORK_FILES
[09:23:48] CoreStatus = 74 (116)
[09:23:48] The core could not find the work files specified. Removing from queue
[09:23:48] Deleting current work unit & continuing...
[09:23:48] Trying to send all finished work units
[09:23:48] + No unsent completed units remaining.
[09:23:48] - Preparing to get new work unit...
[09:23:48] Cleaning up work directory
[09:23:48] + Attempting to get work packet
[09:23:48] Passkey found
[09:23:48] - Will indicate memory of 64426 MB
[09:23:48] - Connecting to assignment server
[09:23:48] Connecting to http://assign.stanford.edu:8080/
[09:23:49] Posted data.
[09:23:49] Initial: 8F80; - Successful: assigned to (128.143.231.201).
[09:23:49] + News From Folding@Home: Welcome to Folding@Home
[09:23:49] Loaded queue successfully.
[09:23:49] Sent data
[09:23:49] Connecting to http://128.143.231.201:8080/
[09:23:58] Posted data.
[09:23:58] Initial: 0000; - Receiving payload (expected size: 30302661)
[09:24:20] - Downloaded at ~1345 kB/s
[09:24:20] - Averaged speed for that direction ~1273 kB/s
[09:24:20] + Received work.
[09:24:20] + Closed connections
[09:24:25] 
[09:24:25] + Processing work unit
[09:24:25] Core required: FahCore_a5.exe
[09:24:25] Core found.
[09:24:25] Working on queue slot 07 [January 15 09:24:25 UTC]
[09:24:25] + Working ...
[09:24:25] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 64 -checkpoint 15 -verbose -lifeline 4103 -version 634'

[09:24:25] 
[09:24:25] *------------------------------*
[09:24:25] Folding@Home Gromacs SMP Core
[09:24:25] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[09:24:25] 
[09:24:25] Preparing to commence simulation
[09:24:25] - Looking at optimizations...
[09:24:25] - Created dyn
[09:24:25] - Files status OK
[09:24:29] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
[09:24:29] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
[09:24:30] - Digital signature verified
[09:24:30] 
[09:24:30] Project: 8101 (Run 6, Clone 0, Gen 215)
[09:24:30] 
[09:24:30] Assembly optimizations on if available.
[09:24:30] Entering M.D.
[09:24:37] Mapping NT from 64 to 64 
[09:24:44] Completed 0 out of 250000 steps  (0%)
[09:40:01] Completed 2500 out of 250000 steps  (1%)
[10:17:20] Completed 5000 out of 250000 steps  (2%)
[12:17:00] Completed 7500 out of 250000 steps  (3%)
[12:40:35] ***** Got an Activate signal (2)
[12:40:35] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [January 15 12:40:42 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/john/fah
Executable: ./fah6
Arguments: -smp 32 -bigadv -verbosity 9 

[12:40:42] - Ask before connecting: No
[12:40:42] - User name: scorpuk (Team 35947)
[12:40:42] - User ID: 69B84E3D5AC5DB27
[12:40:42] - Machine ID: 1
[12:40:42] 
[12:40:42] Loaded queue successfully.
[12:40:42] 
[12:40:42] + Processing work unit
[12:40:42] Core required: FahCore_a5.exe
[12:40:42] - Autosending finished units... [January 15 12:40:42 UTC]
[12:40:42] Core found.
[12:40:42] Trying to send all finished work units
[12:40:42] + No unsent completed units remaining.
[12:40:42] - Autosend completed
[12:40:42] Working on queue slot 07 [January 15 12:40:42 UTC]
[12:40:42] + Working ...
[12:40:42] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 32 -checkpoint 15 -verbose -lifeline 7580 -version 634'

[12:40:42] 
[12:40:42] *------------------------------*
[12:40:42] Folding@Home Gromacs SMP Core
[12:40:42] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[12:40:42] 
[12:40:42] Preparing to commence simulation
[12:40:42] - Looking at optimizations...
[12:40:42] - Files status OK
[12:40:46] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
[12:40:46] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
[12:40:47] - Digital signature verified
[12:40:47] 
[12:40:47] Project: 8101 (Run 6, Clone 0, Gen 215)
[12:40:47] 
[12:40:47] Assembly optimizations on if available.
[12:40:47] Entering M.D.
[12:40:53] Using Gromacs checkpoints
[12:40:56] Mapping NT from 32 to 32 
[12:41:39] Resuming from checkpoint
[12:41:40] Verified work/wudata_07.log
[12:41:40] Verified work/wudata_07.trr
[12:41:41] Verified work/wudata_07.xtc
[12:41:41] Verified work/wudata_07.edr
[12:41:41] Completed 7980 out of 250000 steps  (3%)
[12:42:42] ***** Got an Activate signal (2)
[12:42:42] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [January 15 12:42:53 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/john/fah
Executable: ./fah6
Arguments: -smp 32 -bigadv -verbosity 9 

[12:42:53] - Ask before connecting: No
[12:42:53] - User name: scorpuk (Team 35947)
[12:42:53] - User ID: 69B84E3D5AC5DB27
[12:42:53] - Machine ID: 1
[12:42:53] 
[12:42:53] Work directory not found. Creating...
[12:42:53] Loaded queue successfully.
[12:42:53] 
[12:42:53] - Autosending finished units... [January 15 12:42:53 UTC]
[12:42:53] + Processing work unit
[12:42:53] Trying to send all finished work units
[12:42:53] Core required: FahCore_a5.exe
[12:42:53] + No unsent completed units remaining.
[12:42:53] Core found.
[12:42:53] - Autosend completed
[12:42:53] Working on queue slot 07 [January 15 12:42:53 UTC]
[12:42:53] + Working ...
[12:42:53] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 32 -checkpoint 15 -verbose -lifeline 7672 -version 634'

[12:42:53] 
[12:42:53] *------------------------------*
[12:42:53] Folding@Home Gromacs SMP Core
[12:42:53] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[12:42:53] 
[12:42:53] Preparing to commence simulation
[12:42:53] - Looking at optimizations...
[12:42:53] - Created dyn
[12:42:53] - Files status OK
[12:42:53] Error: Missing work file=<>
[12:42:53] 
[12:42:53] Folding@home Core Shutdown: MISSING_WORK_FILES
[12:42:53] CoreStatus = 74 (116)
[12:42:53] The core could not find the work files specified. Removing from queue
[12:42:53] Deleting current work unit & continuing...
[12:42:53] Trying to send all finished work units
[12:42:53] + No unsent completed units remaining.
[12:42:53] - Preparing to get new work unit...
[12:42:53] Cleaning up work directory
[12:42:53] + Attempting to get work packet
[12:42:53] Passkey found
[12:42:53] - Will indicate memory of 64426 MB
[12:42:53] - Connecting to assignment server
[12:42:53] Connecting to http://assign.stanford.edu:8080/
[12:42:55] Posted data.
[12:42:55] Initial: 8F80; - Successful: assigned to (128.143.231.201).
[12:42:55] + News From Folding@Home: Welcome to Folding@Home
[12:42:55] Loaded queue successfully.
[12:42:55] Sent data
[12:42:55] Connecting to http://128.143.231.201:8080/
[12:43:04] Posted data.
[12:43:04] Initial: 0000; - Receiving payload (expected size: 30302661)
[12:43:27] - Downloaded at ~1286 kB/s
[12:43:27] - Averaged speed for that direction ~1276 kB/s
[12:43:27] + Received work.
[12:43:27] + Closed connections
[12:43:32] 
[12:43:32] + Processing work unit
[12:43:32] Core required: FahCore_a5.exe
[12:43:32] Core found.
[12:43:32] Working on queue slot 08 [January 15 12:43:32 UTC]
[12:43:32] + Working ...
[12:43:32] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 08 -np 32 -checkpoint 15 -verbose -lifeline 7672 -version 634'

[12:43:32] 
[12:43:32] *------------------------------*
[12:43:32] Folding@Home Gromacs SMP Core
[12:43:32] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[12:43:32] 
[12:43:32] Preparing to commence simulation
[12:43:32] - Looking at optimizations...
[12:43:32] - Created dyn
[12:43:32] - Files status OK
[12:43:36] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
[12:43:36] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
[12:43:36] - Digital signature verified
[12:43:36] 
[12:43:36] Project: 8101 (Run 6, Clone 0, Gen 215)
[12:43:36] 
[12:43:37] Assembly optimizations on if available.
[12:43:37] Entering M.D.
[12:43:44] Mapping NT from 32 to 32 
[12:43:50] Completed 0 out of 250000 steps  (0%)
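For anyone who wants to check the frame times without HFM, here is a minimal sketch that pulls the "Completed ... steps" lines out of a log like the one above and works out the time per 1% frame. The function name `frame_times` is made up for illustration, and the script assumes the v6 log format shown here, where timestamps carry no date (so a negative gap is treated as a midnight rollover).

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as:
# [09:40:01] Completed 2500 out of 250000 steps  (1%)
PROGRESS = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps\s+\((\d+)%\)"
)

def frame_times(lines):
    """Return (percent, time_per_frame) for each pair of consecutive progress lines."""
    result = []
    prev = None
    for line in lines:
        m = PROGRESS.match(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        percent = int(m.group(4))
        if prev is not None:
            prev_stamp, prev_percent = prev
            delta = stamp - prev_stamp
            if delta < timedelta(0):
                # Assumption: the log has no dates, so treat this as passing midnight.
                delta += timedelta(days=1)
            gained = percent - prev_percent
            # Skip restarts and checkpoint resumes, where the percentage does not advance.
            if gained > 0:
                result.append((percent, delta / gained))
        prev = (stamp, percent)
    return result

# Progress lines copied from the 09:24 run in the log above.
sample = [
    "[09:24:44] Completed 0 out of 250000 steps  (0%)",
    "[09:40:01] Completed 2500 out of 250000 steps  (1%)",
    "[10:17:20] Completed 5000 out of 250000 steps  (2%)",
    "[12:17:00] Completed 7500 out of 250000 steps  (3%)",
]

for percent, tpf in frame_times(sample):
    print(f"{percent}%: {tpf}")
```

On those four lines it gives roughly 15m, 37m and 2h for the first three frames, so the slowdown gets worse with each frame rather than staying constant. Note that the first frame after a checkpoint resume is only a partial frame, so its figure will read low.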
If you need anything else let me know and thanks. :-)
__________________
Folding stats
Folding Summary
Desktop: Corsair Carbide 400R; Asus Sabertooth X79; Intel i7-3930K; Hydro H100; 2 x XFX HD7970's; 16GB DDR3 1600MHz; Corsair HX850W; Samsung 256GB 840 Pro; 2TB HDD; Windows 7 Ult. 64bit.
16th Jan 2013, 13:59   #2
Ben Lamb
Modder
 
Join Date: Sep 2012
Posts: 65
I don't think it's your machine, Scorp. I've looked into it, and it looks like a bug in FahCore_a3 and _a5 causes this problem. The only thing you can do is try new work units, as you have been doing. One of my machines went haywire the other day for the first time, so there may be some dodgy work units out there.
16th Jan 2013, 15:05   #3
Scorpuk
Supermodder
 
Join Date: Jan 2012
Location: North Ayrshire, Scotland
Posts: 502
Cheers.

I think I've picked up a workable unit. Seems to be progressing ok at about 17m TPF. A tad slower than normal.
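
To put those TPF figures in perspective, here is a rough sketch of how long a whole unit takes at a given time per frame. It assumes the new unit has the same shape as the one in the earlier log (250000 steps, with a progress line every 2500 steps, so 100 frames); that is an assumption, as the new unit's log isn't posted. The function name `unit_duration` is just for illustration.

```python
from datetime import timedelta

STEPS_TOTAL = 250000     # from the earlier log: "out of 250000 steps"
STEPS_PER_FRAME = 2500   # from the earlier log: each 1% line advances 2500 steps

def unit_duration(tpf_minutes):
    """Estimate the wall-clock time for a full work unit at a steady TPF."""
    frames = STEPS_TOTAL // STEPS_PER_FRAME  # 100 frames under the assumption above
    return timedelta(minutes=tpf_minutes) * frames

for tpf in (15, 17, 57):
    hours = unit_duration(tpf).total_seconds() / 3600
    print(f"{tpf}m TPF -> about {hours:.1f} hours per unit")
```

So 15m TPF works out to about 25 hours per unit and 17m to a little over 28 hours, while the 57m seen during the hangs would be about 95 hours.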
17th Jan 2013, 12:23   #4
Scorpuk
Supermodder
 
Join Date: Jan 2012
Location: North Ayrshire, Scotland
Posts: 502
Well, it completed the unit, but now it's working on an A3 work unit rather than an A5 work unit.
All times are GMT +1.