diff options
Diffstat (limited to 'Documentation/infiniband/tag_matching.txt')
-rw-r--r-- | Documentation/infiniband/tag_matching.txt | 64 |
1 files changed, 0 insertions, 64 deletions
diff --git a/Documentation/infiniband/tag_matching.txt b/Documentation/infiniband/tag_matching.txt deleted file mode 100644 index d2a3bf819226..000000000000 --- a/Documentation/infiniband/tag_matching.txt +++ /dev/null @@ -1,64 +0,0 @@ -Tag matching logic - -The MPI standard defines a set of rules, known as tag-matching, for matching -source send operations to destination receives. The following parameters must -match the following source and destination parameters: -* Communicator -* User tag - wild card may be specified by the receiver -* Source rank – wild car may be specified by the receiver -* Destination rank – wild -The ordering rules require that when more than one pair of send and receive -message envelopes may match, the pair that includes the earliest posted-send -and the earliest posted-receive is the pair that must be used to satisfy the -matching operation. However, this doesn’t imply that tags are consumed in -the order they are created, e.g., a later generated tag may be consumed, if -earlier tags can’t be used to satisfy the matching rules. - -When a message is sent from the sender to the receiver, the communication -library may attempt to process the operation either after or before the -corresponding matching receive is posted. If a matching receive is posted, -this is an expected message, otherwise it is called an unexpected message. -Implementations frequently use different matching schemes for these two -different matching instances. - -To keep MPI library memory footprint down, MPI implementations typically use -two different protocols for this purpose: - -1. The Eager protocol- the complete message is sent when the send is -processed by the sender. A completion send is received in the send_cq -notifying that the buffer can be reused. - -2. The Rendezvous Protocol - the sender sends the tag-matching header, -and perhaps a portion of data when first notifying the receiver. When the -corresponding buffer is posted, the responder will use the information from -the header to initiate an RDMA READ operation directly to the matching buffer. -A fin message needs to be received in order for the buffer to be reused. - -Tag matching implementation - -There are two types of matching objects used, the posted receive list and the -unexpected message list. The application posts receive buffers through calls -to the MPI receive routines in the posted receive list and posts send messages -using the MPI send routines. The head of the posted receive list may be -maintained by the hardware, with the software expected to shadow this list. - -When send is initiated and arrives at the receive side, if there is no -pre-posted receive for this arriving message, it is passed to the software and -placed in the unexpected message list. Otherwise the match is processed, -including rendezvous processing, if appropriate, delivering the data to the -specified receive buffer. This allows overlapping receive-side MPI tag -matching with computation. - -When a receive-message is posted, the communication library will first check -the software unexpected message list for a matching receive. If a match is -found, data is delivered to the user buffer, using a software controlled -protocol. The UCX implementation uses either an eager or rendezvous protocol, -depending on data size. If no match is found, the entire pre-posted receive -list is maintained by the hardware, and there is space to add one more -pre-posted receive to this list, this receive is passed to the hardware. -Software is expected to shadow this list, to help with processing MPI cancel -operations. In addition, because hardware and software are not expected to be -tightly synchronized with respect to the tag-matching operation, this shadow -list is used to detect the case that a pre-posted receive is passed to the -hardware, as the matching unexpected message is being passed from the hardware -to the software. |