OpenMPI
0.1.1
Part of the submit launcher.
#include "orte_config.h"
#include "opal/mca/mca.h"
#include "orte/mca/plm/plm.h"
#include "opal/threads/condition.h"
Data Structures
struct orte_plm_submit_component_t
    PLM Component.

Typedefs
typedef struct orte_plm_submit_component_t orte_plm_submit_component_t

Functions
BEGIN_C_DECLS int orte_plm_submit_component_open (void)
int orte_plm_submit_component_close (void)
int orte_plm_submit_component_query (mca_base_module_t **module, int *priority)
int orte_plm_submit_finalize (void)
int orte_plm_submit_launch (orte_job_t *)
    Launch a daemon (bootproxy) on each node.
int orte_plm_submit_terminate_orteds (void)
    Terminate the orteds for a given job.
int orte_plm_submit_signal_job (orte_jobid_t, int32_t)

Variables
ORTE_MODULE_DECLSPEC orte_plm_submit_component_t mca_plm_submit_component
orte_plm_base_module_t orte_plm_submit_module

Part of the submit launcher.
See plm_submit.h for an overview of how it works.
int orte_plm_submit_launch (orte_job_t *jdata)
Launch a daemon (bootproxy) on each node.
The daemon will be responsible for launching the application.
If we are in '--debug-daemons' mode, we keep the ssh connection alive for the span of the run. If we use this option AND we launch on more than "num_concurrent" machines, then we will deadlock: no connections are terminated until the job is complete, no job is started since all the orteds are waiting for all the others to come online, and the others are not launched because we are waiting on those that have started to terminate their ssh tunnels. As we cannot run in this situation, pretty-print the error and return an error code.
References opal_pointer_array_t::addr, mca_base_param_environ_variable(), orte_node_t::name, orte_proc_info_t::nodename, opal_argv_append(), opal_argv_copy(), opal_argv_free(), opal_argv_join(), opal_basename(), opal_os_path(), opal_output(), OPAL_OUTPUT_VERBOSE, opal_setenv(), OPAL_THREAD_LOCK, OPAL_THREAD_UNLOCK, orte_plm_base_orted_append_basic_args(), orte_plm_globals, ORTE_PROC_MY_NAME, orte_process_info, orte_show_help(), orte_wait_cb(), orte_plm_globals_t::output, orte_app_context_t::prefix_dir, and orte_errmgr_base_module_2_3_0_t::update_state.
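
The deadlock condition described for orte_plm_submit_launch() can be illustrated with a minimal, standalone sketch. This is not the launcher's actual code: the function name check_launch_is_possible, its parameters, and the SKETCH_* return codes are hypothetical names introduced here for illustration, and the real implementation reports the problem through orte_show_help() rather than fprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes for this sketch only (not ORTE constants). */
#define SKETCH_SUCCESS      0
#define SKETCH_ERR_DEADLOCK (-1)

/*
 * Hypothetical pre-launch check.
 *
 * debug_daemons  : true when every ssh connection is held open for the
 *                  whole run
 * num_nodes      : number of machines a daemon must be launched on
 * num_concurrent : maximum number of ssh connections open at once
 *
 * With debug_daemons set, no connection is closed until the job ends,
 * so at most num_concurrent daemons can ever start. If more nodes are
 * needed, the started daemons wait forever for the remaining ones.
 */
int check_launch_is_possible(bool debug_daemons,
                             int num_nodes,
                             int num_concurrent)
{
    /* The sketch assumes a positive concurrency limit. */
    assert(num_concurrent > 0);

    if (debug_daemons && num_nodes > num_concurrent) {
        fprintf(stderr,
                "cannot launch on %d nodes with debug daemons: "
                "only %d concurrent connections are allowed\n",
                num_nodes, num_concurrent);
        return SKETCH_ERR_DEADLOCK;
    }
    return SKETCH_SUCCESS;
}
```

Under these assumptions, launching on exactly num_concurrent machines is still allowed; only a strictly larger node count combined with the debug option is rejected before any daemon is started.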