OpenMPI
0.1.1
Interface for waitpid / async notification of child death with the libevent runtime system.
#include "orte_config.h"
#include "opal/dss/dss.h"
#include "opal/util/output.h"
#include "opal/sys/atomic.h"
#include "opal/mca/event/event.h"
#include "orte/types.h"
#include "orte/mca/rml/rml_types.h"
#include "opal/runtime/opal_progress.h"
Data Structures | |
struct | orte_trigger_event_t |
struct | orte_message_event_t |
Set up an event to process a message. | |
struct | orte_notify_event_t |
Macros | |
#define | ORTE_PROGRESSED_WAIT(failed, counter, limit) |
In a number of places in the code, we need to wait for something to complete - for example, waiting for all launched procs to report into the HNP. | |
#define | ORTE_MESSAGE_EVENT_DELAY(delay, mev) |
#define | ORTE_MESSAGE_EVENT(sndr, buf, tg, cbfunc) |
#define | ORTE_NOTIFY_EVENT(cbfunc, data) |
#define | ORTE_DETECT_TIMEOUT(event, n, deltat, maxwait, cbfunc) |
In a number of places within the code, we want to set up a timer to detect when some procedure failed to complete. | |
#define | ORTE_TIMER_EVENT(sec, usec, cbfunc) |
There are places in the code where we just want to periodically wake up to do something, and then go back to sleep again. | |
Typedefs | |
typedef void(* | orte_wait_fn_t )(pid_t wpid, int status, void *data) |
typedef for callback function used in orte_wait_cb | |
Functions | |
ORTE_DECLSPEC | OBJ_CLASS_DECLARATION (orte_trigger_event_t) |
ORTE_DECLSPEC void | orte_wait_enable (void) |
Disable / re-enable SIGCHLD handler. | |
ORTE_DECLSPEC void | orte_wait_disable (void) |
ORTE_DECLSPEC pid_t | orte_waitpid (pid_t wpid, int *status, int options) |
Wait for process termination. | |
ORTE_DECLSPEC int | orte_wait_cb (pid_t wpid, orte_wait_fn_t callback, void *data) |
Register a callback for process termination. | |
ORTE_DECLSPEC int | orte_wait_cb_cancel (pid_t wpid) |
ORTE_DECLSPEC int | orte_wait_cb_disable (void) |
ORTE_DECLSPEC int | orte_wait_cb_enable (void) |
ORTE_DECLSPEC int | orte_wait_event (opal_event_t **event, orte_trigger_event_t *trig, char *trigger_name, void(*cbfunc)(int, short, void *)) |
Set up to wait for an event. | |
ORTE_DECLSPEC void | orte_trigger_event (orte_trigger_event_t *trig) |
Trigger a defined event. | |
ORTE_DECLSPEC | OBJ_CLASS_DECLARATION (orte_message_event_t) |
ORTE_DECLSPEC | OBJ_CLASS_DECLARATION (orte_notify_event_t) |
ORTE_DECLSPEC int | orte_wait_init (void) |
ORTE_DECLSPEC int | orte_wait_kill (int sig) |
Kill all processes we are waiting on. | |
ORTE_DECLSPEC int | orte_wait_finalize (void) |
Interface for waitpid / async notification of child death with the libevent runtime system.
#define ORTE_DETECT_TIMEOUT(event, n, deltat, maxwait, cbfunc)
In a number of places within the code, we want to set up a timer to detect when some procedure failed to complete.
For example, when we launch the daemons, we frequently have no way to directly detect that a daemon failed to launch. Setting a timer allows us to automatically fail out of the launch if we don't hear from a daemon in some specified time window.
Computing the amount of time to wait takes a few lines of code; this macro encapsulates those lines along with the timer event definition as a convenience. It also centralizes the checks that keep the microsecond field below 1,000,000 (some systems require this) and that keep the computed wait time from exceeding the desired maximum wait.
Referenced by orte_plm_base_orted_exit().
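The two checks described above can be sketched in plain C. This is an illustration only, not the macro body: the helper name `sketch_detect_timeout`, the use of microseconds for `deltat` and `maxwait`, and the treatment of a non-positive cap as "no limit" are all assumptions.

```c
#include <sys/time.h>

/* Hypothetical helper, not part of ORTE: computes the wait for n items at
 * deltat_usec microseconds each, capped at maxwait_usec when that cap is
 * positive (treating a non-positive cap as "no limit" is an assumption). */
static struct timeval sketch_detect_timeout(long n, long deltat_usec,
                                            long maxwait_usec)
{
    struct timeval tv;
    long total = n * deltat_usec;

    /* Keep the computed wait from exceeding the desired maximum. */
    if (maxwait_usec > 0 && total > maxwait_usec) {
        total = maxwait_usec;
    }

    /* Split so that the microsecond field is always below 1,000,000. */
    tv.tv_sec = total / 1000000L;
    tv.tv_usec = total % 1000000L;
    return tv;
}
```

With these assumptions, three daemons at 700,000 microseconds each give a wait of 2 seconds and 100,000 microseconds.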
#define ORTE_MESSAGE_EVENT(sndr, buf, tg, cbfunc)
#define ORTE_MESSAGE_EVENT_DELAY(delay, mev)
#define ORTE_NOTIFY_EVENT(cbfunc, data)
#define ORTE_PROGRESSED_WAIT(failed, counter, limit)
In a number of places in the code, we need to wait for something to complete - for example, waiting for all launched procs to report into the HNP.
In such cases, we want to just call progress so that any messages get processed, but otherwise "hold" the program at this spot until the counter reaches the specified value. We also want to provide a boolean flag, though, so that we break out of the loop should something go wrong.
Referenced by orte_plm_base_orted_exit().
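The loop shape this describes can be sketched as follows. The names `sketch_launch`, `sketch_progress`, and `sketch_progressed_wait` are hypothetical, and `sketch_progress` is only a stand-in for the runtime's progress call: here it simulates one proc reporting in per call, with an optional simulated failure.

```c
#include <stdbool.h>

/* Hypothetical state for the sketch; none of these names come from ORTE. */
struct sketch_launch {
    bool failed;    /* set when something goes wrong */
    int  reported;  /* number of procs that have reported in */
    int  fail_at;   /* report count at which to simulate a failure, 0 = never */
};

/* Stand-in for the progress call: one proc reports in per invocation. */
static void sketch_progress(struct sketch_launch *s)
{
    s->reported++;
    if (s->fail_at > 0 && s->reported == s->fail_at) {
        s->failed = true;
    }
}

/* Hold here until the counter reaches limit, or break out on failure.
 * Returns true when the failure flag is still clear on exit. */
static bool sketch_progressed_wait(struct sketch_launch *s, int limit)
{
    while (!s->failed && s->reported < limit) {
        sketch_progress(s);
    }
    return !s->failed;
}
```

Checking the flag before the counter on every iteration is what lets the caller escape the wait when a failure is reported before the limit is reached.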
#define ORTE_TIMER_EVENT(sec, usec, cbfunc)
There are places in the code where we just want to periodically wake up to do something, and then go back to sleep again.
Setting a timer allows us to do this.
ORTE_DECLSPEC void orte_trigger_event(orte_trigger_event_t *trig)
Trigger a defined event.
This function triggers a previously defined event, as set up by orte_wait_event, by firing the provided trigger.
ORTE_DECLSPEC int orte_wait_cb(pid_t wpid, orte_wait_fn_t callback, void *data)
Register a callback for process termination.
Register a callback for notification when wpid causes a SIGCHLD. waitpid() will have already been called on the process at this time.
If a thread is already blocked in orte_waitpid for wpid, this function will return ORTE_ERR_EXISTS. It is illegal for multiple callbacks to be registered for a single wpid (OMPI_EXISTS will be returned in this case).
It is not legal for wpid to be -1 when registering a callback.
Referenced by orte_plm_submit_launch().
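The callback contract (the child has already been reaped when the callback runs, and the callback receives the pid, the wait status, and the user data) can be illustrated with plain POSIX calls. This sketch does not use ORTE and is synchronous, whereas orte_wait_cb delivers the notification asynchronously; `sketch_wait_fn_t`, `sketch_record_exit`, and `sketch_spawn_and_notify` are hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same shape as orte_wait_fn_t from this header. */
typedef void (*sketch_wait_fn_t)(pid_t wpid, int status, void *data);

/* Example callback: stores the child's exit code in the int that data
 * points to, or -1 if the child did not exit normally. */
static void sketch_record_exit(pid_t wpid, int status, void *data)
{
    (void)wpid;
    *(int *)data = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: forks a child that exits with exit_code, reaps it
 * with waitpid(), and only then runs the callback. Returns 0 on success
 * and -1 on error. */
static int sketch_spawn_and_notify(int exit_code, sketch_wait_fn_t cb,
                                   void *data)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(exit_code);       /* child: terminate immediately */
    }
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    cb(pid, status, data);      /* child is already reaped at this point */
    return 0;
}
```

Because the status is passed to the callback, the callback itself never needs to call waitpid() on the child.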
ORTE_DECLSPEC void orte_wait_enable(void)
Disable / re-enable SIGCHLD handler.
These functions may only be used after orte_wait_init has been called.
ORTE_DECLSPEC int orte_wait_event(opal_event_t **event, orte_trigger_event_t *trig, char *trigger_name, void(*cbfunc)(int, short, void *))
Set up to wait for an event.
This function is used to set up a trigger event that can be used elsewhere in the code base where we want to wait for some event to happen. For example, orterun uses this function to set up an event that notifies orterun of abnormal and normal termination so it can wake up and exit cleanly.
The event is defined so that firing the provided trigger causes the event to trigger and call back to the provided function.
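One common way to build a define-then-fire pair like this is a pipe: the waiter watches the read end and the trigger writes a byte to the write end. The sketch below shows that general pattern with plain POSIX calls. It is not taken from the ORTE sources, and whether ORTE implements triggers this way is an assumption; every `sketch_` name is hypothetical, and `sketch_dispatch` stands in for the event loop.

```c
#include <unistd.h>

/* Same shape as the cbfunc parameter of orte_wait_event. */
typedef void (*sketch_cb_t)(int fd, short flags, void *arg);

/* Hypothetical trigger: a pipe plus the callback to run when it fires. */
struct sketch_trigger {
    int read_fd;
    int write_fd;
    sketch_cb_t cb;
    void *arg;
};

/* Define the event: create the pipe and remember the callback. */
static int sketch_wait_event(struct sketch_trigger *t, sketch_cb_t cb,
                             void *arg)
{
    int fds[2];

    if (pipe(fds) != 0) {
        return -1;
    }
    t->read_fd = fds[0];
    t->write_fd = fds[1];
    t->cb = cb;
    t->arg = arg;
    return 0;
}

/* Fire the trigger by writing one byte to the pipe. */
static int sketch_trigger_event(struct sketch_trigger *t)
{
    char byte = 1;
    return write(t->write_fd, &byte, 1) == 1 ? 0 : -1;
}

/* Stand-in for the event loop: block until fired, then call back once. */
static int sketch_dispatch(struct sketch_trigger *t)
{
    char byte;

    if (read(t->read_fd, &byte, 1) != 1) {
        return -1;
    }
    t->cb(t->read_fd, 0, t->arg);
    return 0;
}

/* Example callback: counts how many times the trigger fired. */
static void sketch_on_fire(int fd, short flags, void *arg)
{
    (void)fd;
    (void)flags;
    *(int *)arg += 1;
}
```

A pipe is a convenient choice for this pattern because writing a single byte is safe to do from a signal handler or another thread, while the callback still runs in the waiter's own context.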
ORTE_DECLSPEC pid_t orte_waitpid(pid_t wpid, int *status, int options)
Wait for process termination.
Similar to waitpid, orte_waitpid utilizes the run-time event library for process termination notification. The WUNTRACED option is not supported, but the WNOHANG option is supported.
A wpid value of -1 is not currently supported and will return an error.
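For readers unfamiliar with WNOHANG, the non-blocking behaviour looks like this with the standard POSIX waitpid(). The sketch does not call orte_waitpid; `sketch_try_reap` is a hypothetical helper, and its rejection of -1 only mirrors the restriction documented above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper built on plain POSIX waitpid(), not on ORTE.
 * Returns 1 and stores the exit code if the child was reaped,
 * 0 if it is still running, and -1 on error. */
static int sketch_try_reap(pid_t wpid, int *exit_code)
{
    int status = 0;
    pid_t r;

    if (wpid == -1) {
        return -1;              /* mirrors the documented restriction on -1 */
    }
    r = waitpid(wpid, &status, WNOHANG);   /* returns 0 while the child runs */
    if (r == 0) {
        return 0;
    }
    if (r != wpid) {
        return -1;
    }
    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return 1;
}
```

A caller can invoke such a helper from a periodic timer or between other work, rather than blocking until the child exits.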