Folding problem on Opteron

Discussion in 'bit-tech Folding Team' started by Scorpuk, 15 Jan 2013.

  1. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    Just noticed today that F@H has started to hang.

    I tried restarting it, but it eventually hangs again.

    I then deleted the work folder and started again, but it hung again. (I did this twice, and the second time I went for -smp 32.)

    Once, when HFM said it was hung, I left it alone for over an hour, and it then came back with the TPF jumping from 15m to 57m.


    Here is my current log file:

    Code:
    --- Opening Log file [January 15 08:46:16 UTC] 
    
    
    # Linux SMP Console Edition ###################################################
    ###############################################################################
    
                           Folding@Home Client Version 6.34
    
                              http://folding.stanford.edu
    
    ###############################################################################
    ###############################################################################
    
    Launch directory: /home/john/fah
    Executable: ./fah6
    Arguments: -smp -bigadv -verbosity 9 
    
    [08:46:16] - Ask before connecting: No
    [08:46:16] - User name: scorpuk (Team 35947)
    [08:46:16] - User ID: 69B84E3D5AC5DB27
    [08:46:16] - Machine ID: 1
    [08:46:16] 
    [08:46:16] Loaded queue successfully.
    [08:46:16] 
    [08:46:16] - Autosending finished units... [January 15 08:46:16 UTC]
    [08:46:16] + Processing work unit
    [08:46:16] Trying to send all finished work units
    [08:46:16] Core required: FahCore_a5.exe
    [08:46:16] + No unsent completed units remaining.
    [08:46:16] - Autosend completed
    [08:46:16] Core found.
    [08:46:17] Working on queue slot 06 [January 15 08:46:17 UTC]
    [08:46:17] + Working ...
    [08:46:17] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 06 -np 64 -checkpoint 15 -verbose -lifeline 2220 -version 634'
    
    [08:46:17] 
    [08:46:17] *------------------------------*
    [08:46:17] Folding@Home Gromacs SMP Core
    [08:46:17] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [08:46:17] 
    [08:46:17] Preparing to commence simulation
    [08:46:17] - Looking at optimizations...
    [08:46:17] - Files status OK
    [08:46:22] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
    [08:46:22] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
    [08:46:22] - Digital signature verified
    [08:46:22] 
    [08:46:22] Project: 8101 (Run 6, Clone 0, Gen 215)
    [08:46:22] 
    [08:46:22] Assembly optimizations on if available.
    [08:46:22] Entering M.D.
    [08:46:28] Using Gromacs checkpoints
    [08:46:32] Mapping NT from 64 to 64 
    [08:48:56] Resuming from checkpoint
    [08:48:58] Verified work/wudata_06.log
    [08:48:59] Verified work/wudata_06.trr
    [08:48:59] Verified work/wudata_06.xtc
    [08:48:59] Verified work/wudata_06.edr
    [08:49:01] Completed 116240 out of 250000 steps  (46%)
    [09:23:30] ***** Got an Activate signal (2)
    [09:23:30] Killing all core threads
    
    Folding@Home Client Shutdown.
    
    
    --- Opening Log file [January 15 09:23:47 UTC] 
    
    
    # Linux SMP Console Edition ###################################################
    ###############################################################################
    
                           Folding@Home Client Version 6.34
    
                              http://folding.stanford.edu
    
    ###############################################################################
    ###############################################################################
    
    Launch directory: /home/john/fah
    Executable: ./fah6
    Arguments: -smp -bigadv -verbosity 9 
    
    [09:23:47] - Ask before connecting: No
    [09:23:47] - User name: scorpuk (Team 35947)
    [09:23:47] - User ID: 69B84E3D5AC5DB27
    [09:23:47] - Machine ID: 1
    [09:23:47] 
    [09:23:47] Work directory not found. Creating...
    [09:23:47] Loaded queue successfully.
    [09:23:47] 
    [09:23:47] + Processing work unit
    [09:23:47] Core required: FahCore_a5.exe
    [09:23:47] - Autosending finished units... [January 15 09:23:47 UTC]
    [09:23:47] Core found.
    [09:23:47] Trying to send all finished work units
    [09:23:47] + No unsent completed units remaining.
    [09:23:47] - Autosend completed
    [09:23:47] Working on queue slot 06 [January 15 09:23:47 UTC]
    [09:23:47] + Working ...
    [09:23:47] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 06 -np 64 -checkpoint 15 -verbose -lifeline 4103 -version 634'
    
    [09:23:47] 
    [09:23:47] *------------------------------*
    [09:23:47] Folding@Home Gromacs SMP Core
    [09:23:47] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [09:23:47] 
    [09:23:47] Preparing to commence simulation
    [09:23:47] - Looking at optimizations...
    [09:23:47] - Created dyn
    [09:23:47] - Files status OK
    [09:23:47] Error: Missing work file=<>
    [09:23:47] 
    [09:23:47] Folding@home Core Shutdown: MISSING_WORK_FILES
    [09:23:48] CoreStatus = 74 (116)
    [09:23:48] The core could not find the work files specified. Removing from queue
    [09:23:48] Deleting current work unit & continuing...
    [09:23:48] Trying to send all finished work units
    [09:23:48] + No unsent completed units remaining.
    [09:23:48] - Preparing to get new work unit...
    [09:23:48] Cleaning up work directory
    [09:23:48] + Attempting to get work packet
    [09:23:48] Passkey found
    [09:23:48] - Will indicate memory of 64426 MB
    [09:23:48] - Connecting to assignment server
    [09:23:48] Connecting to http://assign.stanford.edu:8080/
    [09:23:49] Posted data.
    [09:23:49] Initial: 8F80; - Successful: assigned to (128.143.231.201).
    [09:23:49] + News From Folding@Home: Welcome to Folding@Home
    [09:23:49] Loaded queue successfully.
    [09:23:49] Sent data
    [09:23:49] Connecting to http://128.143.231.201:8080/
    [09:23:58] Posted data.
    [09:23:58] Initial: 0000; - Receiving payload (expected size: 30302661)
    [09:24:20] - Downloaded at ~1345 kB/s
    [09:24:20] - Averaged speed for that direction ~1273 kB/s
    [09:24:20] + Received work.
    [09:24:20] + Closed connections
    [09:24:25] 
    [09:24:25] + Processing work unit
    [09:24:25] Core required: FahCore_a5.exe
    [09:24:25] Core found.
    [09:24:25] Working on queue slot 07 [January 15 09:24:25 UTC]
    [09:24:25] + Working ...
    [09:24:25] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 64 -checkpoint 15 -verbose -lifeline 4103 -version 634'
    
    [09:24:25] 
    [09:24:25] *------------------------------*
    [09:24:25] Folding@Home Gromacs SMP Core
    [09:24:25] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [09:24:25] 
    [09:24:25] Preparing to commence simulation
    [09:24:25] - Looking at optimizations...
    [09:24:25] - Created dyn
    [09:24:25] - Files status OK
    [09:24:29] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
    [09:24:29] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
    [09:24:30] - Digital signature verified
    [09:24:30] 
    [09:24:30] Project: 8101 (Run 6, Clone 0, Gen 215)
    [09:24:30] 
    [09:24:30] Assembly optimizations on if available.
    [09:24:30] Entering M.D.
    [09:24:37] Mapping NT from 64 to 64 
    [09:24:44] Completed 0 out of 250000 steps  (0%)
    [09:40:01] Completed 2500 out of 250000 steps  (1%)
    [10:17:20] Completed 5000 out of 250000 steps  (2%)
    [12:17:00] Completed 7500 out of 250000 steps  (3%)
    [12:40:35] ***** Got an Activate signal (2)
    [12:40:35] Killing all core threads
    
    Folding@Home Client Shutdown.
    
    
    --- Opening Log file [January 15 12:40:42 UTC] 
    
    
    # Linux SMP Console Edition ###################################################
    ###############################################################################
    
                           Folding@Home Client Version 6.34
    
                              http://folding.stanford.edu
    
    ###############################################################################
    ###############################################################################
    
    Launch directory: /home/john/fah
    Executable: ./fah6
    Arguments: -smp 32 -bigadv -verbosity 9 
    
    [12:40:42] - Ask before connecting: No
    [12:40:42] - User name: scorpuk (Team 35947)
    [12:40:42] - User ID: 69B84E3D5AC5DB27
    [12:40:42] - Machine ID: 1
    [12:40:42] 
    [12:40:42] Loaded queue successfully.
    [12:40:42] 
    [12:40:42] + Processing work unit
    [12:40:42] Core required: FahCore_a5.exe
    [12:40:42] - Autosending finished units... [January 15 12:40:42 UTC]
    [12:40:42] Core found.
    [12:40:42] Trying to send all finished work units
    [12:40:42] + No unsent completed units remaining.
    [12:40:42] - Autosend completed
    [12:40:42] Working on queue slot 07 [January 15 12:40:42 UTC]
    [12:40:42] + Working ...
    [12:40:42] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 32 -checkpoint 15 -verbose -lifeline 7580 -version 634'
    
    [12:40:42] 
    [12:40:42] *------------------------------*
    [12:40:42] Folding@Home Gromacs SMP Core
    [12:40:42] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [12:40:42] 
    [12:40:42] Preparing to commence simulation
    [12:40:42] - Looking at optimizations...
    [12:40:42] - Files status OK
    [12:40:46] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
    [12:40:46] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
    [12:40:47] - Digital signature verified
    [12:40:47] 
    [12:40:47] Project: 8101 (Run 6, Clone 0, Gen 215)
    [12:40:47] 
    [12:40:47] Assembly optimizations on if available.
    [12:40:47] Entering M.D.
    [12:40:53] Using Gromacs checkpoints
    [12:40:56] Mapping NT from 32 to 32 
    [12:41:39] Resuming from checkpoint
    [12:41:40] Verified work/wudata_07.log
    [12:41:40] Verified work/wudata_07.trr
    [12:41:41] Verified work/wudata_07.xtc
    [12:41:41] Verified work/wudata_07.edr
    [12:41:41] Completed 7980 out of 250000 steps  (3%)
    [12:42:42] ***** Got an Activate signal (2)
    [12:42:42] Killing all core threads
    
    Folding@Home Client Shutdown.
    
    
    --- Opening Log file [January 15 12:42:53 UTC] 
    
    
    # Linux SMP Console Edition ###################################################
    ###############################################################################
    
                           Folding@Home Client Version 6.34
    
                              http://folding.stanford.edu
    
    ###############################################################################
    ###############################################################################
    
    Launch directory: /home/john/fah
    Executable: ./fah6
    Arguments: -smp 32 -bigadv -verbosity 9 
    
    [12:42:53] - Ask before connecting: No
    [12:42:53] - User name: scorpuk (Team 35947)
    [12:42:53] - User ID: 69B84E3D5AC5DB27
    [12:42:53] - Machine ID: 1
    [12:42:53] 
    [12:42:53] Work directory not found. Creating...
    [12:42:53] Loaded queue successfully.
    [12:42:53] 
    [12:42:53] - Autosending finished units... [January 15 12:42:53 UTC]
    [12:42:53] + Processing work unit
    [12:42:53] Trying to send all finished work units
    [12:42:53] Core required: FahCore_a5.exe
    [12:42:53] + No unsent completed units remaining.
    [12:42:53] Core found.
    [12:42:53] - Autosend completed
    [12:42:53] Working on queue slot 07 [January 15 12:42:53 UTC]
    [12:42:53] + Working ...
    [12:42:53] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 07 -np 32 -checkpoint 15 -verbose -lifeline 7672 -version 634'
    
    [12:42:53] 
    [12:42:53] *------------------------------*
    [12:42:53] Folding@Home Gromacs SMP Core
    [12:42:53] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [12:42:53] 
    [12:42:53] Preparing to commence simulation
    [12:42:53] - Looking at optimizations...
    [12:42:53] - Created dyn
    [12:42:53] - Files status OK
    [12:42:53] Error: Missing work file=<>
    [12:42:53] 
    [12:42:53] Folding@home Core Shutdown: MISSING_WORK_FILES
    [12:42:53] CoreStatus = 74 (116)
    [12:42:53] The core could not find the work files specified. Removing from queue
    [12:42:53] Deleting current work unit & continuing...
    [12:42:53] Trying to send all finished work units
    [12:42:53] + No unsent completed units remaining.
    [12:42:53] - Preparing to get new work unit...
    [12:42:53] Cleaning up work directory
    [12:42:53] + Attempting to get work packet
    [12:42:53] Passkey found
    [12:42:53] - Will indicate memory of 64426 MB
    [12:42:53] - Connecting to assignment server
    [12:42:53] Connecting to http://assign.stanford.edu:8080/
    [12:42:55] Posted data.
    [12:42:55] Initial: 8F80; - Successful: assigned to (128.143.231.201).
    [12:42:55] + News From Folding@Home: Welcome to Folding@Home
    [12:42:55] Loaded queue successfully.
    [12:42:55] Sent data
    [12:42:55] Connecting to http://128.143.231.201:8080/
    [12:43:04] Posted data.
    [12:43:04] Initial: 0000; - Receiving payload (expected size: 30302661)
    [12:43:27] - Downloaded at ~1286 kB/s
    [12:43:27] - Averaged speed for that direction ~1276 kB/s
    [12:43:27] + Received work.
    [12:43:27] + Closed connections
    [12:43:32] 
    [12:43:32] + Processing work unit
    [12:43:32] Core required: FahCore_a5.exe
    [12:43:32] Core found.
    [12:43:32] Working on queue slot 08 [January 15 12:43:32 UTC]
    [12:43:32] + Working ...
    [12:43:32] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 08 -np 32 -checkpoint 15 -verbose -lifeline 7672 -version 634'
    
    [12:43:32] 
    [12:43:32] *------------------------------*
    [12:43:32] Folding@Home Gromacs SMP Core
    [12:43:32] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
    [12:43:32] 
    [12:43:32] Preparing to commence simulation
    [12:43:32] - Looking at optimizations...
    [12:43:32] - Created dyn
    [12:43:32] - Files status OK
    [12:43:36] - Expanded 30302149 -> 33158020 (decompressed 109.4 percent)
    [12:43:36] Called DecompressByteArray: compressed_data_size=30302149 data_size=33158020, decompressed_data_size=33158020 diff=0
    [12:43:36] - Digital signature verified
    [12:43:36] 
    [12:43:36] Project: 8101 (Run 6, Clone 0, Gen 215)
    [12:43:36] 
    [12:43:37] Assembly optimizations on if available.
    [12:43:37] Entering M.D.
    [12:43:44] Mapping NT from 32 to 32 
    [12:43:50] Completed 0 out of 250000 steps  (0%)
    If you need anything else let me know and thanks. :)
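
    For anyone who wants to check the slowdown without HFM, the time per frame can be read straight off the "Completed ... steps" lines in the log. Below is a minimal sketch in Python, not anything the client or HFM provides: the function name frame_times is made up for illustration, and it assumes every "Completed" line marks one 1% frame from a single uninterrupted run (a "Completed" line printed right after "Resuming from checkpoint" would skew the first interval).

```python
import re
from datetime import datetime, timedelta

# Matches client log lines such as:
#   [09:40:01] Completed 2500 out of 250000 steps  (1%)
FRAME_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)

def frame_times(log_text):
    """Return (percent, time since previous frame) pairs for one client run.

    Hypothetical helper: assumes consecutive 'Completed' lines are one frame apart.
    """
    results = []
    prev = None
    for stamp, pct in FRAME_RE.findall(log_text):
        now = datetime.strptime(stamp, "%H:%M:%S")
        if prev is not None:
            delta = now - prev
            if delta < timedelta(0):
                # Log timestamps carry no date, so assume a midnight rollover.
                delta += timedelta(days=1)
            results.append((int(pct), delta))
        prev = now
    return results

# The four progress lines from the second run in the log above.
sample = """\
[09:24:44] Completed 0 out of 250000 steps  (0%)
[09:40:01] Completed 2500 out of 250000 steps  (1%)
[10:17:20] Completed 5000 out of 250000 steps  (2%)
[12:17:00] Completed 7500 out of 250000 steps  (3%)
"""

for pct, tpf in frame_times(sample):
    print(f"{pct}%: {tpf}")
```

    On those four lines it gives roughly 15 minutes for the first frame, 37 minutes for the second and just under 2 hours for the third, so the unit slows down steadily rather than stopping outright.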
     
  2. Ben Lamb

    Ben Lamb What's a Dremel?

    Joined:
    2 Sep 2012
    Posts:
    65
    Likes Received:
    1
    I don't think it is your machine, Scorp. I have looked into it, and it looks like a bug in FahCore_a3 and _a5 causes this problem. The only thing you can do is try new work units, as you have been doing. One of my machines went haywire the other day for the first time, so there may be some dodgy work units out there.
     
  3. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    Cheers.

    I think I've picked up a workable unit. Seems to be progressing ok at about 17m TPF. A tad slower than normal.
     
  4. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    Well, it completed the unit, but now it's working on an A3 work unit rather than an A5 work unit. :sigh:
     
