OpenMPI  0.1.1
pls_submit.h File Reference

Part of the submit launcher.

#include "orte_config.h"
#include "opal/mca/mca.h"
#include "orte/mca/plm/plm.h"
#include "opal/threads/condition.h"

Data Structures

struct  orte_plm_submit_component_t
 PLM Component.
 

Typedefs

typedef struct orte_plm_submit_component_t orte_plm_submit_component_t
 

Functions

int orte_plm_submit_component_open (void)
 
int orte_plm_submit_component_close (void)
 
int orte_plm_submit_component_query (mca_base_module_t **module, int *priority)
 
int orte_plm_submit_finalize (void)
 
int orte_plm_submit_launch (orte_job_t *)
 Launch a daemon (bootproxy) on each node.
 
int orte_plm_submit_terminate_orteds (void)
 Terminate the orteds for a given job.
 
int orte_plm_submit_signal_job (orte_jobid_t, int32_t)
 

Variables

ORTE_MODULE_DECLSPEC orte_plm_submit_component_t mca_plm_submit_component
 
orte_plm_base_module_t orte_plm_submit_module
 

Detailed Description

Part of the submit launcher.

See plm_submit.h for an overview of how it works.

Function Documentation

int orte_plm_submit_launch (orte_job_t *jdata)

Launch a daemon (bootproxy) on each node.

The daemon will be responsible for launching the application.

If we are run with '--debug-daemons', we keep the ssh connection alive for the span of the run. If we use this option AND we launch on more than "num_concurrent" machines, then we will deadlock: no connections are terminated until the job is complete, no job is started since all the orteds are waiting for all the others to come online, and the others are not launched because we are waiting on those that have started to terminate their ssh tunnels. :( As we cannot run in this situation, pretty-print the error and return an error code.
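
The deadlock condition described above can be sketched as a simple guard evaluated before any daemon is launched. This is a minimal illustration only, not the code in the submit launcher: the function name check_debug_daemons_limit, its parameters, and the SKETCH_* return codes are hypothetical stand-ins, and the real implementation reports the problem through orte_show_help() rather than fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes for this sketch; these are not ORTE constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_WOULD_DEADLOCK = -1 };

/*
 * Hypothetical guard: with debug daemons enabled, every ssh session stays
 * open for the whole run, so each launched node permanently occupies one of
 * the num_concurrent slots. If there are more nodes than slots, the
 * remaining daemons can never start and the job can never begin.
 */
static int check_debug_daemons_limit(bool debug_daemons,
                                     int num_nodes,
                                     int num_concurrent)
{
    if (debug_daemons && num_nodes > num_concurrent) {
        fprintf(stderr,
                "cannot launch on %d nodes with debug daemons enabled: "
                "only %d concurrent sessions are allowed\n",
                num_nodes, num_concurrent);
        return SKETCH_ERR_WOULD_DEADLOCK;
    }
    return SKETCH_SUCCESS;
}
```

A caller would run this check first and return the error code immediately on failure, so that no ssh session is opened for a launch that cannot complete. Launching on exactly "num_concurrent" machines passes the check, since every daemon can then come online at once.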

References opal_pointer_array_t::addr, mca_base_param_environ_variable(), orte_node_t::name, orte_proc_info_t::nodename, opal_argv_append(), opal_argv_copy(), opal_argv_free(), opal_argv_join(), opal_basename(), opal_os_path(), opal_output(), OPAL_OUTPUT_VERBOSE, opal_setenv(), OPAL_THREAD_LOCK, OPAL_THREAD_UNLOCK, orte_plm_base_orted_append_basic_args(), orte_plm_globals, ORTE_PROC_MY_NAME, orte_process_info, orte_show_help(), orte_wait_cb(), orte_plm_globals_t::output, orte_app_context_t::prefix_dir, and orte_errmgr_base_module_2_3_0_t::update_state.