OpenMPI  0.1.1
coll_sm_bcast.c File Reference
#include "ompi_config.h"
#include "opal/datatype/opal_convertor.h"
#include "ompi/constants.h"
#include "ompi/communicator/communicator.h"
#include "ompi/datatype/ompi_datatype.h"
#include "ompi/mca/coll/coll.h"
#include "opal/sys/atomic.h"
#include "coll_sm.h"

Functions

int mca_coll_sm_bcast_intra (void *buff, int count, struct ompi_datatype_t *datatype, int root, struct ompi_communicator_t *comm, mca_coll_base_module_t *module)
 Shared memory broadcast.
 

Function Documentation

int mca_coll_sm_bcast_intra ( void *  buff,
int  count,
struct ompi_datatype_t *  datatype,
int  root,
struct ompi_communicator_t *  comm,
mca_coll_base_module_t *  module 
)

Shared memory broadcast.

For the root, the general algorithm is to wait for a set of segments to become available. Once the set is available, the root claims it by writing the current operation number and the number of processes using the set to the set's in-use flag. The root then loops over the segments in the set; for each segment, it copies a fragment of the user's buffer into the shared data segment and then writes the data size into its children's control buffers. The process is repeated until all fragments have been written.
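The root-side steps can be illustrated with a small single-process sketch. It is not the code in coll_sm_bcast.c: every name prefixed with sketch_, the constants FRAG_SIZE, NUM_SEGMENTS and MAX_CHILDREN, and the layout of the structures are assumptions made for illustration. The sketch also omits the waiting and memory barriers a real shared-memory implementation needs (the reference list for this function includes opal_atomic_wmb()), since nothing here runs concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAG_SIZE    8  /* hypothetical; stands in for a fragment size setting */
#define NUM_SEGMENTS 4  /* hypothetical number of segments in one set */
#define MAX_CHILDREN 2  /* hypothetical fan-out of the root */

/* Hypothetical in-use flag guarding one set of segments. */
typedef struct {
    unsigned op_count;        /* operation number that owns the set */
    int      num_procs_using; /* 0 means the set is idle */
} sketch_inuse_flag_t;

/* One shared data segment plus the control slots of the root's children. */
typedef struct {
    char   data[FRAG_SIZE];
    size_t child_control[MAX_CHILDREN]; /* 0 = empty, else fragment size */
} sketch_segment_t;

/* Claim an idle set for operation `op`. Returns 1 on success, 0 if the
 * set is still in use (real code would wait instead of returning). */
static int sketch_root_claim_set(sketch_inuse_flag_t *flag, unsigned op,
                                 int num_procs)
{
    if (flag->num_procs_using != 0) {
        return 0;
    }
    flag->op_count = op;
    flag->num_procs_using = num_procs;
    return 1;
}

/* Copy up to NUM_SEGMENTS fragments of user_buf into the set and publish
 * each fragment's size to every child. Returns the bytes written. */
static size_t sketch_root_fill_set(const char *user_buf, size_t total_bytes,
                                   sketch_segment_t *set, int num_children)
{
    size_t offset = 0;
    assert(num_children <= MAX_CHILDREN);
    for (int seg = 0; seg < NUM_SEGMENTS && offset < total_bytes; ++seg) {
        size_t frag = total_bytes - offset;
        if (frag > FRAG_SIZE) {
            frag = FRAG_SIZE;
        }
        /* Copy the data first, then publish its size to the children. */
        memcpy(set[seg].data, user_buf + offset, frag);
        for (int c = 0; c < num_children; ++c) {
            set[seg].child_control[c] = frag;
        }
        offset += frag;
    }
    return offset;
}
```

With a 20-byte message and the constants above, one call to sketch_root_fill_set fills three segments (8, 8, and 4 bytes) and leaves the fourth untouched. A longer message would require the caller to claim and fill further sets until every fragment has been written.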

Non-roots, for each set of segments, wait until the current operation number appears in the in-use flag (i.e., it has been written by the root). Then, for each segment, they wait for a nonzero value to appear in their control buffers. If they have children, they copy the data from their parent's shared data segment into their own shared data segment and write the data size into each of their children's control buffers; they then copy the data from their own (local) shared data segment into the user's output buffer. If they do not have children, they copy the data directly from the parent's shared data segment into the user's output buffer. The process is repeated until all fragments have been received.
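The non-root path for a single fragment can be sketched in the same spirit. This is again a single-process illustration rather than the actual implementation: sketch_node_t, sketch_nonroot_receive, FRAG_SIZE and MAX_CHILDREN are hypothetical names, a real process would spin on its control buffer instead of returning 0, and resetting the control value after use is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAG_SIZE    8  /* hypothetical fragment size in bytes */
#define MAX_CHILDREN 2  /* hypothetical tree fan-out */

/* Hypothetical view of one process in the broadcast tree. */
typedef struct sketch_node {
    char   data[FRAG_SIZE];  /* this process's shared data segment */
    size_t control;          /* 0 until the parent publishes a size */
    struct sketch_node *parent;
    struct sketch_node *children[MAX_CHILDREN];
    int    num_children;
} sketch_node_t;

/* Receive one fragment into user_out. Returns the fragment size, or 0 if
 * the parent has not published anything yet. */
static size_t sketch_nonroot_receive(sketch_node_t *me, char *user_out)
{
    size_t frag = me->control;
    if (frag == 0) {
        return 0; /* nothing published yet */
    }
    assert(frag <= FRAG_SIZE);

    if (me->num_children > 0) {
        /* Interior node: stage the data locally, notify the children,
         * then copy from the local segment to the user's buffer. */
        memcpy(me->data, me->parent->data, frag);
        for (int c = 0; c < me->num_children; ++c) {
            me->children[c]->control = frag;
        }
        memcpy(user_out, me->data, frag);
    } else {
        /* Leaf: copy straight from the parent's segment. */
        memcpy(user_out, me->parent->data, frag);
    }

    me->control = 0; /* assumption: mark the fragment as consumed */
    return frag;
}
```

In a three-node chain (root, interior node, leaf), the leaf's call returns 0 until the interior node has processed its own fragment, because only then is the leaf's control value set. This mirrors the ordering described above: the data size travels down the tree one level at a time.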

References COPY_FRAGMENT_IN, FLAG_RETAIN, FLAG_SETUP, FLAG_WAIT_FOR_IDLE, mca_coll_sm_component, mca_coll_sm_comm_t::mcb_data_index, mca_coll_sm_comm_t::mcb_operation_count, mca_coll_sm_comm_t::mcb_tree, mca_coll_sm_tree_node_t::mcstn_children, mca_coll_sm_tree_node_t::mcstn_num_children, mca_coll_sm_tree_node_t::mcstn_parent, OBJ_CONSTRUCT, opal_atomic_wmb(), PARENT_NOTIFY_CHILDREN, mca_coll_sm_component_t::sm_comm_num_in_use_flags, mca_coll_sm_component_t::sm_fragment_size, mca_coll_sm_component_t::sm_segs_per_inuse_flag, and ompi_datatype_t::super.

Referenced by mca_coll_sm_allreduce_intra().