Quegel Communication Primitives

All the communication primitives of our system are defined in utils/communication.h. These functions are implemented using MPI, and are called by the system (e.g., in worker's run() function). If you want to change/extend the logic of our system, you may also need to call these functions.

The first set of functions aggregate the value of each worker, and returns the aggregated value.

Every worker calls: int all_sum(int my_copy)

Every worker calls: long long master_sum_LL(long long my_copy)

Every worker calls: long long all_sum_LL(long long my_copy)

Every worker calls: all_bor(char my_copy)

For example, in function WorkerOL::run() of ol/WorkerOL.h, evey worker executes all_sum(vertexes.size()) to sum up the number of vertices in every worker, which returns the total number of vertices. For master_sum_LL(.), the return value is only valid for the master process.

The second set of functions are called by the workers to exchange contents:

Every worker calls: all_to_all(vector& to_exchange)

Every worker calls: all_to_all(vector& to_exchange1, vector& to_exchange2)

Every worker calls: all_to_all(vector& to_exchange1, vector& to_exchange2, vector& to_exchange3)

Every worker calls: all_to_all(vector& to_send, vector& to_get)

Before calling the first function, one should make sure that to_exchange contains N elements of type T (N is the number of workers), where to_exchange[i] stores the object to be sent to worker i. After the function returns, to_exchange[i] stores the object received from worker i. The next two functions are similar: the content in to_exchange1, to_exchange2 and to_exchange3 are all exchanged.

The last function is similar, except that the contents to send are stored in to_send, and the contents received are stored in to_get. However, for worker i, to_get[i] is empty and one needs to get the content from to_send[i]. If T is different from T1, we require that the serial representation of a T-typed object be the same as that of a T1-typed object (e.g., vector and hash_set have the same serial representation according to utils/serialization.h).

These functions are used by the system to implement the message passing logic, see ol/MessageBufferOL.h for more details.

The final set of functions are called between master and the slaves to gather/scatter information:

Master calls: masterScatter(vector<T>& to_send);        Slave calls: slaveScatter(T& to_get)

Master calls: masterBcast(T& to_send);        Slave calls: slaveBcast(T& to_get)

Master calls: masterGather(vector<T>& to_get);        Slave calls: slaveGather(T& to_send)

The first pair of functions are called to let the master distribute contents to each worker. For the master, before calling masterScatter(.), to_send[i] stores the object to be sent to worker i. For each slave, after calling slaveScatter(.), to_get stores the object received from the master.

The second pair of functions are similar, except that the master sends the same content to_send to all slaves.

The third pair of functions are called to let the master gather contents from each worker. For each slave, before calling slaveGather(.), to_send stores the object to be sent to the master. For the master, after calling masterGather(.), to_get[i] stores the object received from worker i.

These functions are used by the system to implement the aggregator logic (and query synchronization), see WorkerOL::compute_dump() (and WorkerOL::update_tasks()) in ol/WorkerOL.h for more details.