Changes between Version 1 and Version 2 of Documentation/Short/LoadImage


Timestamp: Jan 29, 2013, 11:04:29 PM (11 years ago)
Author: seskar
  • Documentation/Short/LoadImage

    v1 v2  
    1 1 == Load an Image ==
    2 2
    3  1. Before we begin using the nodes, it's a good idea to check their status first. This is done with the omf stat command. This will typically produce a result like:
    4     {{{
    5  user@console.sb7:~$ omf stat
     3 1. Before we begin using the nodes, it's a good idea to check their status first. This is done with the omf stat command.
     4    [[Include(/Software/cOMF/eStat)]]
    6 5
    7  INFO NodeHandler: OMF Experiment Controller 5.4 (git c005675)
    8  INFO NodeHandler: Slice ID: default_slice (default)
    9  INFO NodeHandler: Experiment ID: default_slice-2013-01-16t15.28.15-05.00
    10  INFO NodeHandler: Message authentication is disabled
    11  INFO Experiment: load system:exp:stdlib
    12  INFO property.resetDelay: resetDelay = 230 (Fixnum)
    13  INFO property.resetTries: resetTries = 1 (Fixnum)
    14  INFO Experiment: load system:exp:eventlib
    15  INFO Experiment: load system:exp:stat
    16  INFO Topology: Loading topology ''.
    17  INFO property.nodes: nodes = "system:topo:all" (String)
    18  INFO property.summary: summary = false (FalseClass)
    19  INFO Topology: Loading topology 'system:topo:all'.
    20  Talking to the CMC service, please wait
    21  -----------------------------------------------
    22  Domain: sb7.orbit-lab.org
    23  Node: node1-1.sb7.orbit-lab.org         State: POWEROFF
    24  Node: node1-2.sb7.orbit-lab.org         State: POWEROFF
    25  -----------------------------------------------
    26  INFO EXPERIMENT_DONE: Event triggered. Starting the associated tasks.
    27  INFO NodeHandler:
    28  INFO NodeHandler: Shutting down experiment, please wait...
    29  INFO NodeHandler:
    30  INFO run: Experiment default_slice-2013-01-16t15.28.15-05.00 finished after 0:6
    31     }}}
    32     Individual nodes are identified by their fully qualified domain name (FQDN). This establishes their "coordinates" and the "domain" to which they belong. Nodes in
    33     different domains can NOT see each other.
     6 2. It is recommended that the node be in the POWEROFF state prior to any experiment process. If the node is in the POWERON state you can use the omf tell command
     7    to get the node into the off state.
     8    [[Include(/Software/cOMF/eStatus)]]
    34 9
    35  2. A node can be in one of three states:
    36 
    37     || POWEROFF       || Node is available for use but turned off ||
    38     || POWERON        || Node is available and turned on ||
    39     || NOT REGISTERED || Node is not available for use ||
    40 
    41  3. It is recommended that the node be in the POWEROFF state prior to any experiment process. If the node is in the POWERON state you can use the omf tell command
    42     to get the node into the off state.
    43     {{{
    44     username@console.domain:~$ omf tell -a offh -t TOPOLOGY
    45     }}}
    46     The ''TOPOLOGY'' can take many forms, the simplest being a comma-separated list of FQDNs. There are also special predefined topologies such as all, system:topo:circle, ...
    47 10    For more details, see the [wiki:/Software/cOMF OMF documentation].
    48     If the node is in the NOT REGISTERED state, you may need to wait for it to return to the POWEROFF state (it sometimes takes a few moments for the services to sync up). If
    49     the node never comes out of the NODE NOT AVAILABLE state, please contact an administrator.
     11    If the node is in the NOT REGISTERED state, you may need to wait for it to return to the POWEROFF state (it sometimes takes a few moments for the services to sync up). If the node takes more than 60 seconds to come out of the NODE NOT AVAILABLE state, please report it to an administrator.
    50 12
    51 13 4. Prior to the experiment, users need to install an image on the hard disks of the nodes. If you have not created a custom image, use the default starting image:
    52 14    '''baseline.ndz'''. This image is built on top of '''Ubuntu 12.04''' and is pre-configured with the proper modules and startup scripts to take advantage of the rest of
    53     the Orbit services / hardware.  Loading an image is done with the [wiki:/Software/cOMF#load omf load command].
    54     {{{
    55     username@console.domain:~$ omf load -t TOPOLOGY -i IMAGENAME
    56     }}}
    57     Where ''TOPOLOGY'' is the set of nodes you wish to image, and !IMAGENAME is the name of the image you wish to load. The most common sandbox starting image command
    58     would look like
    59     {{{
    60     username@console.domain:~$ omf load -t all -i baseline.ndz
    61     }}}
    62     which will load all the nodes of sandbox 1 (totaling 1) with the [wiki:Documentation/SupportedImages baseline] image. An example run on sandbox 7 looks like:
    63     {{{
    64 user@console.sb7:~$ omf load -t all -i baseline.ndz
    65 
    66  INFO NodeHandler: OMF Experiment Controller 5.4 (git c005675)
    67  INFO NodeHandler: Slice ID: pxe_slice
    68  INFO NodeHandler: Experiment ID: pxe_slice-2013-01-16t14.56.02-05.00
    69  INFO NodeHandler: Message authentication is disabled
    70  INFO Experiment: load system:exp:stdlib
    71  INFO property.resetDelay: resetDelay = 230 (Fixnum)
    72  INFO property.resetTries: resetTries = 1 (Fixnum)
    73  INFO Experiment: load system:exp:eventlib
    74  INFO Experiment: load system:exp:imageNode
    75  INFO property.nodes: nodes = "system:topo:all" (String)
    76  INFO property.image: image = "baseline.ndz" (String)
    77  INFO property.domain: domain = "sb7.orbit-lab.org" (String)
    78  INFO property.outpath: outpath = "/tmp" (String)
    79  INFO property.outprefix: outprefix = "pxe_slice-2013-01-16t14.56.02-05.00" (String)
    80  INFO property.timeout: timeout = 800 (Fixnum)
    81  INFO property.resize: resize = nil (NilClass)
    82  INFO Topology: Loading topology 'system:topo:all'.
    83  INFO Experiment: Resetting resources
    84  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [0 sec.]
    85  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [10 sec.]
    86  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [20 sec.]
    87  INFO stdlib: Waiting for nodes (Up/Down/Total): 0/2/2 - (still down: node1-2.sb7.orbit-lab.org,node1-1.sb7.orbit-lab.org) [30 sec.]
    88  INFO ALL_UP: Event triggered. Starting the associated tasks.
    89  INFO exp: Progress(0/0/2): 0/0/0 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 760 sec.
    90  INFO exp: Progress(0/0/2): 10/10/10 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 750 sec.
    91  INFO exp: Progress(0/0/2): 10/15/20 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 740 sec.
    92  INFO exp: Progress(0/0/2): 20/25/30 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 730 sec.
    93  INFO exp: Progress(0/0/2): 30/35/40 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 720 sec.
    94  INFO exp: Progress(0/0/2): 40/40/40 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 710 sec.
    95  INFO exp: Progress(0/0/2): 40/45/50 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 700 sec.
    96  INFO exp: Progress(0/0/2): 50/55/60 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 690 sec.
    97  INFO exp: Progress(0/0/2): 60/65/70 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 680 sec.
    98  INFO exp: Progress(0/0/2): 60/65/70 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 670 sec.
    99  INFO exp: Progress(0/0/2): 70/75/80 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 660 sec.
    100  INFO exp: Progress(0/0/2): 90/90/90 min(node1-2.sb7.orbit-lab.org)/avg/max (30) - Timeout: 650 sec.
    101  INFO exp: Progress(1/0/2): 90/95/100 min(node1-1.sb7.orbit-lab.org)/avg/max (30) - Timeout: 640 sec.
    102  INFO exp: Progress(2/0/2): 100/100/100 min()/avg/max (30) - Timeout: 630 sec.
    103  INFO exp:  -----------------------------
    104  INFO exp:  Imaging Process Done
    105  INFO exp:  2 nodes successfully imaged - Topology saved in '/tmp/pxe_slice-2013-01-16t14.56.02-05.00-topo-success.rb'
    106  INFO exp:  -----------------------------
    107  INFO EXPERIMENT_DONE: Event triggered. Starting the associated tasks.
    108  INFO NodeHandler:
    109  INFO NodeHandler: Shutting down experiment, please wait...
    110  INFO NodeHandler:
    111  INFO NodeHandler: Shutdown flag is set - Turning Off the resources
    112  INFO run: Experiment pxe_slice-2013-01-16t14.56.02-05.00 finished after 3:13
    113     }}}
     15    the Orbit services / hardware.
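
The status check in steps 1 to 3 can be scripted. The following is a minimal sketch, not part of the OMF tooling: it assumes that `omf stat` prints one line per node in the form `Node: <FQDN>   State: <STATE>`, as in the sample run shown in version 1 of this page. The function name `nodes_powered_on` is hypothetical, and the canned input below deliberately changes one state to POWERON for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `omf stat` output on standard input and print a
# comma-separated list of the nodes whose reported state is POWERON.
# Assumes status lines of the form "Node: <FQDN>   State: <STATE>".
nodes_powered_on() {
    awk '$1 == "Node:" && $3 == "State:" && $4 == "POWERON" {
             printf "%s%s", sep, $2   # join FQDNs with commas
             sep = ","
         }
         END { if (sep != "") print "" }'
}

# Canned input for illustration only (states altered from the sample run).
sample='Domain: sb7.orbit-lab.org
Node: node1-1.sb7.orbit-lab.org         State: POWERON
Node: node1-2.sb7.orbit-lab.org         State: POWEROFF'

printf '%s\n' "$sample" | nodes_powered_on
# prints: node1-1.sb7.orbit-lab.org
```

On a console, the resulting list could be passed as the TOPOLOGY argument, for example `omf tell -a offh -t "$(omf stat | nodes_powered_on)"`, provided the list is not empty. Nodes reported as NOT REGISTERED are intentionally skipped, since the text above recommends waiting for them rather than sending a command.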