OpenMPI  0.1.1
connect.h
/*
 * Copyright (c) 2007-2008 Cisco Systems, Inc. All rights reserved.
 *
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */

/**
 * @file
 *
 * This interface is designed to hide the back-end details of how IB
 * RC connections are made from the rest of the openib BTL. There are
 * module-like instances of the implemented functionality (dlopen and
 * friends are not used, but all the functionality is accessed through
 * structs of function pointers, so you can swap between multiple
 * different implementations at run time, just like real components).
 * Hence, these entities are referred to as "Connect
 * Pseudo-Components" (CPCs).
 *
 * The CPCs are referenced by their names (e.g., "oob", "rdma_cm").
 *
 * CPCs are split into components and modules, similar to all other
 * MCA frameworks in this code base.
 *
 * Before diving into the CPC interface, let's discuss some
 * terminology and mappings of data structures:
 *
 * - a BTL module represents a network port (in the case of the openib
 *   BTL, a LID)
 * - a CPC module represents one way to make connections to a BTL module
 * - hence, a BTL module has potentially multiple CPC modules
 *   associated with it
 * - an endpoint represents a connection between a local BTL module and
 *   a remote BTL module (in the openib BTL, because of BSRQ, an
 *   endpoint can contain multiple QPs)
 * - when an endpoint is created, one of the CPC modules associated
 *   with the local BTL is selected and associated with the endpoint
 *   (obviously, it is a CPC module that is common between the local
 *   and remote BTL modules)
 * - endpoints may be created and destroyed during the MPI job
 * - endpoints are created lazily, during the first communication
 *   between two peers
 * - endpoints are destroyed when two MPI processes become
 *   disconnected (e.g., MPI-2 dynamics or MPI_FINALIZE)
 * - hence, BTL modules and CPC modules outlive endpoints.
 *   Specifically, BTL modules and CPC modules live from MPI_INIT to
 *   MPI_FINALIZE. Endpoints come and go as MPI semantics demand.
 * - therefore, CPC modules need to cache information on endpoints that
 *   is specific to that connection.
 *
 * Component interface:
 *
 * - component_register(): The openib BTL's component_open() function
 *   calls the connect_base_register() function, which scans all
 *   compiled-in CPCs. If they have component_register() functions,
 *   they are called (component_register() functions are only allowed to
 *   register MCA parameters).
 *
 *   NOTE: The connect_base_register() function will process the
 *   btl_openib_cpc_include and btl_openib_cpc_exclude MCA parameters
 *   and automatically include/exclude CPCs as relevant. If a CPC is
 *   excluded, none of its other interface functions will be invoked for
 *   the duration of the process.
 *
 * - component_init(): The openib BTL's component_init() function
 *   calls connect_base_init(), which will invoke this function on
 *   each CPC to see if it wants to run at all. CPCs can gracefully
 *   remove themselves from consideration in this process by returning
 *   OMPI_ERR_NOT_SUPPORTED.
 *
 * - component_query(): The openib BTL's init_one_port() calls the
 *   connect_base_select_for_local_port() function, which, for each LID
 *   on that port, calls the component_query() function on every
 *   available CPC on that LID. This function is intended to see if a
 *   CPC can run on a specific openib BTL module (i.e., LID). If it
 *   can, the CPC is supposed to create a CPC module that is specific to
 *   that BTL/LID and return it. If it cannot, it should return
 *   OMPI_ERR_NOT_SUPPORTED and be gracefully skipped for this
 *   OpenFabrics port.
 *
 * - component_finalize(): The openib BTL's component_close() function
 *   calls connect_base_finalize(), which, in turn, calls the
 *   component_finalize() function on all available CPCs. Note that all
 *   CPC modules will have been finalized by this point; the CPC
 *   component_finalize() function is a chance for the CPC to clean up
 *   any component-specific resources.
 *
 * Module interface:
 *
 * cbm_component member: A pointer to the single, global
 * instance of the CPC component. This member is used for creating a
 * unique index representing the module's component so that it can be
 * shared with remote peer processes.
 *
 * cbm_priority member: An integer between 0 and 100, inclusive,
 * representing the priority of this CPC.
 *
 * cbm_modex_message member: A pointer to a blob buffer that will be
 * included in the modex message for this port for this CPC (it is
 * assumed that this blob is a) only understandable by the
 * corresponding CPC in the peer process, and b) contains specific
 * addressing/contact information for *this* port's CPC module).
 *
 * cbm_modex_message_len member: The length of the cbm_modex_message
 * blob, in bytes.
 *
 * cbm_endpoint_init(): Called during endpoint creation, allowing a
 * CPC module to cache information on the endpoint. A pointer to the
 * endpoint's CPC module is already cached on the endpoint.
 *
 * cbm_start_connect(): Initiate a connection to a remote peer. The
 * CPC is responsible for setting itself up for asynchronous operation
 * for progressing the outgoing connection request.
 *
 * cbm_endpoint_finalize(): Called during endpoint destruction,
 * allowing the CPC module to destroy anything that it cached on the
 * endpoint.
 *
 * cbm_finalize(): Shut down all asynchronous handling and clean up
 * any state that was set up for this CPC module/BTL. Some CPCs set up
 * asynchronous support on a per-HCA/NIC basis (vs. per-port/LID). It
 * is the responsibility of the CPC to figure out such issues (e.g.,
 * via reference counting); there is no notification from the
 * upper-level BTL about when an entire HCA/NIC is no longer being
 * used. There is only this function, which tells when a specific
 * CPC/BTL module is no longer being used.
 *
 * cbm_uses_cts: A bool that indicates whether the CPC will use the
 * CTS protocol or not.
 * - if true: the CPC will post the fragment on
 *   endpoint->endpoint_cts_frag as a receive buffer and will *not*
 *   call ompi_btl_openib_post_recvs().
 * - if false: the CPC will call ompi_btl_openib_post_recvs() before
 *   calling ompi_btl_openib_cpc_complete().
 *
 * There are two functions in the main openib BTL that the CPC may
 * call:
 *
 * - ompi_btl_openib_post_recvs(endpoint): once a QP is locally
 *   connected to the remote side (but we don't know if the remote side
 *   is connected to us yet), this function is invoked to post buffers
 *   on the QP, set up credits for the endpoint, etc. This function is
 *   *only* invoked if the CPC's cbm_uses_cts is false.
 *
 * - ompi_btl_openib_cpc_complete(endpoint): once a CPC knows
 *   that a QP is connected on *both* sides, this function is invoked to
 *   tell the main openib BTL "ok, you can use this connection now."
 *   (e.g., the main openib BTL will either invoke the CTS protocol or
 *   start sending out fragments that were queued while the connection
 *   was establishing, etc.).
 */
#ifndef BTL_OPENIB_CONNECT_H
#define BTL_OPENIB_CONNECT_H

BEGIN_C_DECLS

#define BCF_MAX_NAME 64

/**
 * Must forward declare these structs to avoid include file loops.
 */
struct mca_btl_openib_hca_t;
struct mca_btl_openib_module_t;
struct mca_btl_base_endpoint_t;

/**
 * This struct is defined below
 */
struct ompi_btl_openib_connect_base_module_t;

/************************************************************************/

/**
 * Function to register MCA params in the connect functions. It
 * returns no value, so it cannot fail.
 */
typedef void (*ompi_btl_openib_connect_base_component_register_fn_t)(void);

/**
 * This function is invoked once by the openib BTL component during
 * startup. It is intended for CPC component-wide startup.
 *
 * Return value:
 *
 * - OMPI_SUCCESS: this CPC component will be used in selection during
 *   this process.
 *
 * - OMPI_ERR_NOT_SUPPORTED: this CPC component will be silently
 *   ignored in this process.
 *
 * - Other OMPI_ERR_* values: the error will be propagated upwards,
 *   likely causing a fatal error (and/or the openib BTL component
 *   being ignored).
 */
typedef int (*ompi_btl_openib_connect_base_component_init_fn_t)(void);

/**
 * Query the CPC to see if it wants to run on a specific port (i.e., a
 * specific BTL module). If the component init function previously
 * returned OMPI_SUCCESS, this function is invoked once per BTL module
 * creation (i.e., for each port found by an MPI process). If this
 * CPC wants to be used on this BTL module, it returns a CPC module
 * that is specific to this BTL module.
 *
 * The BTL module in question is passed to the function; all of its
 * attributes can be used to query to see if it's eligible for this
 * CPC.
 *
 * If it is eligible, the CPC is responsible for creating a
 * corresponding CPC module, filling in all the relevant fields on the
 * module, and for setting itself up to run (per above) and returning
 * a CPC module (this is effectively the "module_init" function).
 * Note that the module priority must be between 0 and 100
 * (inclusive). When multiple CPCs are eligible for a single module,
 * the CPC with the highest priority will be used.
 *
 * Return value:
 *
 * - OMPI_SUCCESS if this CPC is eligible for and was able to be set up
 *   for this BTL module. It is assumed that the CPC is now completely
 *   set up to run on this openib module (per description above).
 *
 * - OMPI_ERR_NOT_SUPPORTED if this CPC cannot support this BTL
 *   module. This is not an error; it's just the CPC saying "sorry, I
 *   cannot support this BTL module."
 *
 * - Other OMPI_ERR_* code: an error occurred.
 */
typedef int (*ompi_btl_openib_connect_base_func_component_query_t)
    (struct mca_btl_openib_module_t *btl,
     struct ompi_btl_openib_connect_base_module_t **cpc);

/**
 * This function is invoked once by the openib BTL component during
 * shutdown. It is intended for CPC component-wide shutdown.
 */
typedef int (*ompi_btl_openib_connect_base_component_finalize_fn_t)(void);

/**
 * CPC component struct
 */
struct ompi_btl_openib_connect_base_component_t {
    /** Name of this set of connection functions */
    char cbc_name[BCF_MAX_NAME];

    /** Register function. Can be NULL. */
    ompi_btl_openib_connect_base_component_register_fn_t cbc_register;

    /** CPC component init function. Can be NULL. */
    ompi_btl_openib_connect_base_component_init_fn_t cbc_init;

    /** Query the CPC component to get a CPC module corresponding to
        an openib BTL module. Cannot be NULL. */
    ompi_btl_openib_connect_base_func_component_query_t cbc_query;

    /** CPC component finalize function. Can be NULL. */
    ompi_btl_openib_connect_base_component_finalize_fn_t cbc_finalize;
};
/**
 * Convenience typedef
 */
typedef struct ompi_btl_openib_connect_base_component_t ompi_btl_openib_connect_base_component_t;

/************************************************************************/

/**
 * Function called when an endpoint has been created and has been
 * associated with a CPC.
 */
typedef int (*ompi_btl_openib_connect_base_module_endpoint_init_fn_t)
    (struct mca_btl_base_endpoint_t *endpoint);

/**
 * Function to initiate a connection to a remote process.
 */
typedef int (*ompi_btl_openib_connect_base_module_start_connect_fn_t)
    (struct ompi_btl_openib_connect_base_module_t *cpc,
     struct mca_btl_base_endpoint_t *endpoint);

/**
 * Function called when an endpoint is being destroyed.
 */
typedef int (*ompi_btl_openib_connect_base_module_endpoint_finalize_fn_t)
    (struct mca_btl_base_endpoint_t *endpoint);

/**
 * Function to finalize the CPC module. It is called once when the
 * CPC module's corresponding openib BTL module is being finalized.
 */
typedef int (*ompi_btl_openib_connect_base_module_finalize_fn_t)
    (struct mca_btl_openib_module_t *btl,
     struct ompi_btl_openib_connect_base_module_t *cpc);

/**
 * Meta data about a CPC module. This is in a standalone struct
 * because it is used in both the CPC module struct and the
 * openib_btl_proc_t struct to hold information received from the
 * modex.
 */
typedef struct ompi_btl_openib_connect_base_module_data_t {
    /** Pointer back to the component. Used by the base and openib
        btl to calculate this module's index for the modex. */
    ompi_btl_openib_connect_base_component_t *cbm_component;

    /** Priority of the CPC module (must be >=0 and <=100) */
    uint8_t cbm_priority;

    /** Blob that the CPC wants to include in the openib modex message
        for a specific port, or NULL if the CPC does not want to
        include a message in the modex. */
    void *cbm_modex_message;

    /** Length of the cbm_modex_message blob (0 if
        cbm_modex_message==NULL). The message is intended to be short
        because the size of the modex broadcast is a function of
        sum(cbm_modex_message_len[i]) for
        i=(0...total_num_ports_in_MPI_job); e.g., IBCM imposes its
        own [very short] limits (per IBTA volume 1, chapter 12). */
    uint8_t cbm_modex_message_len;
} ompi_btl_openib_connect_base_module_data_t;

/**
 * Struct for holding CPC module and associated meta data
 */
typedef struct ompi_btl_openib_connect_base_module_t {
    /** Meta data about the module */
    ompi_btl_openib_connect_base_module_data_t data;

    /** Endpoint initialization function */
    ompi_btl_openib_connect_base_module_endpoint_init_fn_t cbm_endpoint_init;

    /** Connect function */
    ompi_btl_openib_connect_base_module_start_connect_fn_t cbm_start_connect;

    /** Endpoint finalization function */
    ompi_btl_openib_connect_base_module_endpoint_finalize_fn_t cbm_endpoint_finalize;

    /** Finalize the cpc module */
    ompi_btl_openib_connect_base_module_finalize_fn_t cbm_finalize;

    /** Whether this module will use the CTS protocol or not. This
        directly states whether this module will call
        mca_btl_openib_endpoint_post_recvs() or not: true = this
        module will *not* call _post_recvs() and instead will post the
        receive buffer provided at endpoint->endpoint_cts_frag on qp
        0. */
    bool cbm_uses_cts;
} ompi_btl_openib_connect_base_module_t;

END_C_DECLS

#endif
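
The component/module split documented in this header can be illustrated with a small standalone sketch. It does not use the real Open MPI types: every `demo_`/`DEMO_` name below is hypothetical, the error codes are arbitrary stand-ins for OMPI_SUCCESS and OMPI_ERR_NOT_SUPPORTED, and the "port is active" test is an invented eligibility check. It only shows the shape of a query function that either creates a module for a BTL or declines gracefully.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the OMPI_* return codes; the values are arbitrary and
 * chosen only for this sketch. */
enum {
    DEMO_SUCCESS = 0,
    DEMO_ERR_NOT_SUPPORTED = -1,
    DEMO_ERR_OUT_OF_RESOURCE = -2
};

#define DEMO_MAX_NAME 64

/* Hypothetical, stripped-down BTL module: one port plus a usable flag. */
struct demo_btl {
    int lid;
    bool port_active;
};

struct demo_cpc_module;

/* Same shape as the CPC component struct: a name plus function pointers,
 * of which only the query function is mandatory. */
struct demo_cpc_component {
    char name[DEMO_MAX_NAME];
    void (*reg)(void);      /* may be NULL */
    int (*init)(void);      /* may be NULL */
    int (*query)(struct demo_btl *btl, struct demo_cpc_module **cpc);
    int (*finalize)(void);  /* may be NULL */
};

/* Reduced version of the CPC module struct (metadata fields only). */
struct demo_cpc_module {
    struct demo_cpc_component *component;
    uint8_t priority;            /* 0..100 inclusive */
    void *modex_message;         /* NULL: nothing to publish */
    uint8_t modex_message_len;   /* 0 when modex_message is NULL */
    bool uses_cts;
};

int demo_query(struct demo_btl *btl, struct demo_cpc_module **cpc);

/* The single, global component instance that every module points back to. */
struct demo_cpc_component demo_component = {
    .name = "demo",
    .query = demo_query
};

/* Create a module for this BTL, or decline without raising an error. */
int demo_query(struct demo_btl *btl, struct demo_cpc_module **cpc)
{
    struct demo_cpc_module *m;

    assert(btl != NULL && cpc != NULL);
    *cpc = NULL;
    if (!btl->port_active) {
        return DEMO_ERR_NOT_SUPPORTED;  /* skipped for this port only */
    }
    m = calloc(1, sizeof(*m));
    if (m == NULL) {
        return DEMO_ERR_OUT_OF_RESOURCE;
    }
    m->component = &demo_component;
    m->priority = 50;
    *cpc = m;
    return DEMO_SUCCESS;
}
```

A caller would invoke `demo_component.query()` once per BTL module, keep every module returned with `DEMO_SUCCESS`, and treat `DEMO_ERR_NOT_SUPPORTED` as "skip this CPC for this port" rather than as a failure. The caller owns the returned module and releases it with `free()` in this sketch.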
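
The header comment states that an endpoint is bound to a CPC module that is common to the local and remote BTL modules, and that the highest priority wins when several CPCs are eligible. The following is a minimal sketch of such a selection, not the routine Open MPI actually uses. The `demo_` names are hypothetical, the component index is assumed to be a small integer derived from the component pointer, and ranking by local priority with "first match wins" tie-breaking is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pared-down view of the per-module metadata exchanged in
 * the modex: which component a module belongs to, and its priority. */
struct demo_cpc_data {
    uint8_t component_index;  /* identifies the CPC component */
    uint8_t priority;         /* 0..100 inclusive */
};

/* Return the index into local[] of the best CPC that the remote side also
 * offers, or -1 if the two sides have no CPC in common.
 * Assumption: "best" means highest local priority; earlier entries win ties. */
int demo_select_cpc(const struct demo_cpc_data *local, size_t num_local,
                    const struct demo_cpc_data *remote, size_t num_remote)
{
    int best = -1;
    size_t i, j;

    for (i = 0; i < num_local; ++i) {
        assert(local[i].priority <= 100);
        for (j = 0; j < num_remote; ++j) {
            if (local[i].component_index != remote[j].component_index) {
                continue;  /* the peer does not offer this CPC */
            }
            if (best < 0 || local[i].priority > local[best].priority) {
                best = (int) i;
            }
            break;  /* one match per local module is enough */
        }
    }
    return best;
}
```

With local modules for components 0, 1 and 2 at priorities 30, 80 and 60, and a peer that offers only components 2 and 0, the function returns the index of the component-2 module: component 1 has the highest priority but is not common to both sides.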