
P6900 Hangs

Discussion in 'bit-tech Folding Team' started by Scorpuk, 21 Jan 2012.

  1. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    My SMP console has hung three times doing this unit.

    Each time I need to terminate the process before running it again, and each time it only gets another 1% done before hanging again.


    I've dropped it down to 11 threads to make sure that me using the computer isn't causing it.


    Any suggestions?
     
  2. TaRkA DaHl

    TaRkA DaHl Modder

    Joined:
    15 Mar 2011
    Posts:
    1,702
    Likes Received:
    175
    What are the temps and oc?

    Is it a BSOD or just the console that hangs?
     
  3. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    Just the console is hanging.

    No overclocking. Back to stock speeds at the moment.

    Core temps in the range of 52°C to 58°C.


    It has done 2% more this time. Will find out in a few minutes if it's crashed again or if I get +3%.


    Edit: Now +3%. Might be ok now. *shrug*
     
    Last edited: 21 Jan 2012
  4. Tattysnuc

    Tattysnuc Thinking about which mod to do 1st.

    Joined:
    19 Jul 2009
    Posts:
    1,620
    Likes Received:
    60
    I've never heard of problems with the one I use, 6.34, so I'm completely flummoxed.

    Which version of the console are you using?

    How do you know it's crashed? Can you post the log up?
     
  5. Scorpuk

    Scorpuk Minimodder

    Joined:
    10 Jan 2012
    Posts:
    725
    Likes Received:
    10
    It's working OK now with the 11 threads.

    The reason I thought it had crashed was that HFM status said "hung" for the client and it had not progressed for over an hour with a tpf of ~27 mins.


    Copy of the log for the first two times I had to kill it:


    Code:
    Arguments: -smp -bigadv -verbosity 9 -smp 12 -bigadv 
    
    [23:46:53] - Ask before connecting: No
    [23:46:53] - User name: Scorpuk (Team 35947)
    [23:46:53] - User ID: 32F3E3910B26857F
    [23:46:53] - Machine ID: 11
    [23:46:53] 
    [23:46:53] Loaded queue successfully.
    [23:46:53] 
    [23:46:53] - Autosending finished units... [January 20 23:46:53 UTC]
    [23:46:53] + Processing work unit
    [23:46:53] Trying to send all finished work units
    [23:46:53] Core required: FahCore_a5.exe
    [23:46:53] + No unsent completed units remaining.
    [23:46:53] Core found.
    [23:46:53] - Autosend completed
    [23:46:53] Working on queue slot 03 [January 20 23:46:53 UTC]
    [23:46:53] + Working ...
    [23:46:53] - Calling '.\FahCore_a5.exe -dir work/ -nice 19 -suffix 03 -np 12 -checkpoint 15 -verbose -lifeline 3948 -version 634'
    
    [23:46:53] 
    [23:46:53] *------------------------------*
    [23:46:53] Folding@Home Gromacs SMP Core
    [23:46:53] Version 2.27 (Mar 12, 2010)
    [23:46:53] 
    [23:46:53] Preparing to commence simulation
    [23:46:53] - Ensuring status. Please wait.
    [23:47:03] - Looking at optimizations...
    [23:47:03] - Working with standard loops on this execution.
    [23:47:03] - Previous termination of core was improper.
    [23:47:03] - Files status OK
    [23:47:07] - Expanded 24867828 -> 30796292 (decompressed 123.8 percent)
    [23:47:07] Called DecompressByteArray: compressed_data_size=24867828 data_size=30796292, decompressed_data_size=30796292 diff=0
    [23:47:07] - Digital signature verified
    [23:47:07] 
    [23:47:07] Project: 6900 (Run 44, Clone 4, Gen 102)
    [23:47:07] 
    [23:47:07] Entering M.D.
    [23:47:13] Using Gromacs checkpoints
    [23:47:14] Mapping NT from 12 to 12 
    [23:47:19] Resuming from checkpoint
    [23:47:20] Verified work/wudata_03.log
    [23:47:20] Verified work/wudata_03.trr
    [23:47:20] Verified work/wudata_03.xtc
    [23:47:20] Verified work/wudata_03.edr
    [23:47:21] Completed 69840 out of 250000 steps  (27%)
    [23:48:57] Completed 70000 out of 250000 steps  (28%)
    [00:11:16] Completed 72500 out of 250000 steps  (29%)
    [01:00:08] Completed 75000 out of 250000 steps  (30%)
    [02:18:13] Completed 77500 out of 250000 steps  (31%)
    [02:49:32] Completed 80000 out of 250000 steps  (32%)
    [03:18:35] Completed 82500 out of 250000 steps  (33%)
    [03:47:05] Completed 85000 out of 250000 steps  (34%)
    [04:15:40] Completed 87500 out of 250000 steps  (35%)
    [04:44:11] Completed 90000 out of 250000 steps  (36%)
    [05:12:10] Completed 92500 out of 250000 steps  (37%)
    [05:41:55] Completed 95000 out of 250000 steps  (38%)
    [05:46:53] - Autosending finished units... [January 21 05:46:53 UTC]
    [05:46:53] Trying to send all finished work units
    [05:46:53] + No unsent completed units remaining.
    [05:46:53] - Autosend completed
    [06:11:33] Completed 97500 out of 250000 steps  (39%)
    [06:41:10] Completed 100000 out of 250000 steps  (40%)
    [07:10:01] Completed 102500 out of 250000 steps  (41%)
    [07:39:38] Completed 105000 out of 250000 steps  (42%)
    [08:09:31] Completed 107500 out of 250000 steps  (43%)
    [08:39:08] Completed 110000 out of 250000 steps  (44%)
    [09:08:57] Completed 112500 out of 250000 steps  (45%)
    [09:39:00] Completed 115000 out of 250000 steps  (46%)
    [10:09:17] Completed 117500 out of 250000 steps  (47%)
    [10:39:27] Completed 120000 out of 250000 steps  (48%)
    [11:08:32] Completed 122500 out of 250000 steps  (49%)
    [11:38:22] Completed 125000 out of 250000 steps  (50%)
    [11:46:53] - Autosending finished units... [January 21 11:46:53 UTC]
    [11:46:53] Trying to send all finished work units
    [11:46:53] + No unsent completed units remaining.
    [11:46:53] - Autosend completed
    [12:08:16] Completed 127500 out of 250000 steps  (51%)
    [12:38:26] Completed 130000 out of 250000 steps  (52%)
    [13:07:48] Completed 132500 out of 250000 steps  (53%)
    [14:12:22] Killing all core threads
    [14:12:22] Could not get process id information.  Please kill core process manually
    
    Folding@Home Client Shutdown at user request.
    [14:12:22] ***** Got a SIGTERM signal (2)
    [14:12:22] Killing all core threads
    [14:12:22] Could not get process id information.  Please kill core process manually
    
    Folding@Home Client Shutdown.
    
    
    --- Opening Log file [January 21 14:12:49 UTC] 
    
    
    # Windows SMP Console Edition #################################################
    ###############################################################################
    
                           Folding@Home Client Version 6.34
    
                              http://folding.stanford.edu
    
    ###############################################################################
    ###############################################################################
    
    Launch directory: C:\Users\John\FAH-SMP
    Executable: fah6
    Arguments: -verbosity 9 -smp 12 -bigadv 
    
    [14:12:49] - Ask before connecting: No
    [14:12:49] - User name: Scorpuk (Team 35947)
    [14:12:49] - User ID: 32F3E3910B26857F
    [14:12:49] - Machine ID: 11
    [14:12:49] 
    [14:12:49] Loaded queue successfully.
    [14:12:49] 
    [14:12:49] - Autosending finished units... [January 21 14:12:49 UTC]
    [14:12:49] + Processing work unit
    [14:12:49] Trying to send all finished work units
    [14:12:49] Core required: FahCore_a5.exe
    [14:12:49] + No unsent completed units remaining.
    [14:12:49] Core found.
    [14:12:49] - Autosend completed
    [14:12:49] Working on queue slot 03 [January 21 14:12:49 UTC]
    [14:12:49] + Working ...
    [14:12:49] - Calling '.\FahCore_a5.exe -dir work/ -nice 19 -suffix 03 -np 12 -checkpoint 15 -verbose -lifeline 3596 -version 634'
    
    [14:12:49] 
    [14:12:49] *------------------------------*
    [14:12:49] Folding@Home Gromacs SMP Core
    [14:12:49] Version 2.27 (Mar 12, 2010)
    [14:12:49] 
    [14:12:49] Preparing to commence simulation
    [14:12:49] - Ensuring status. Please wait.
    [14:12:59] - Looking at optimizations...
    [14:12:59] - Working with standard loops on this execution.
    [14:12:59] - Previous termination of core was improper.
    [14:12:59] - Going to use standard loops.
    [14:12:59] - Files status OK
    [14:13:04] - Expanded 24867828 -> 30796292 (decompressed 123.8 percent)
    [14:13:04] Called DecompressByteArray: compressed_data_size=24867828 data_size=30796292, decompressed_data_size=30796292 diff=0
    [14:13:04] - Digital signature verified
    [14:13:04] 
    [14:13:04] Project: 6900 (Run 44, Clone 4, Gen 102)
    [14:13:04] 
    [14:13:04] Entering M.D.
    [14:13:10] Using Gromacs checkpoints
    [14:13:11] Mapping NT from 12 to 12 
    [14:13:16] Resuming from checkpoint
    [14:13:17] Verified work/wudata_03.log
    [14:13:17] Verified work/wudata_03.trr
    [14:13:18] Verified work/wudata_03.xtc
    [14:13:18] Verified work/wudata_03.edr
    [14:13:18] Completed 134070 out of 250000 steps  (53%)
    [14:34:14] Completed 135000 out of 250000 steps  (54%)
    [15:44:12] Killing all core threads
    [15:44:12] Could not get process id information.  Please kill core process manually
    
    Folding@Home Client Shutdown at user request.
    [15:44:12] ***** Got a SIGTERM signal (2)
    [15:44:12] Killing all core threads
    [15:44:12] Could not get process id information.  Please kill core process manually
    
    Folding@Home Client Shutdown.
    As you can see I also had to manually terminate the process. :eyebrow:
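The "hung" judgement above rests on comparing the gap between progress lines with the usual time per frame (about 27 minutes here). A minimal sketch of that check in Python follows, assuming the log uses the `[HH:MM:SS] Completed N out of M steps` format shown in the post. The function names `frame_times` and `slow_frames` and the 2x threshold are illustrative choices, not part of the Folding@Home client or HFM.

```python
import re

# Matches progress lines such as:
# [00:11:16] Completed 72500 out of 250000 steps  (29%)
FRAME_RE = re.compile(
    r"\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps"
)


def frame_times(log_text):
    """Return (steps_completed, seconds_since_previous_progress_line) pairs."""
    frames = []
    prev = None
    for match in FRAME_RE.finditer(log_text):
        h, m, s, steps, _total = map(int, match.groups())
        stamp = h * 3600 + m * 60 + s
        if prev is not None:
            # The log carries no date, so a smaller timestamp is assumed to
            # mean midnight passed exactly once between the two lines.
            delta = (stamp - prev) % 86400
            frames.append((steps, delta))
        prev = stamp
    return frames


def slow_frames(frames, typical_tpf_s, factor=2.0):
    """Return frames that took longer than factor * typical_tpf_s seconds."""
    limit = factor * typical_tpf_s
    return [(steps, secs) for steps, secs in frames if secs > limit]
```

Run against the first session in the log, with `typical_tpf_s=27 * 60`, this would flag the frame ending at 77500 steps, which took well over an hour. The first interval after a restart covers only a partial frame, and gaps that span a client restart are not meaningful, so each session is best checked separately.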
     
