Revoke / Incarnation #s / Matching

Discussion of how to reclaim context IDs (resources that are a part of message matching) after an MPI_Comm_revoke. Basic problem: revoke is one-sided and can be called by multiple processes in the communicator.

ailsa

Presentation Transcript


  1. Revoke / Incarnation #s / Matching
     • Discussion around how to reclaim context IDs (resources that are a part of message matching) after an MPI_Comm_revoke
     • Basic problem: revoke is one-sided and can be called by multiple processes in the communicator
     • There is a race between a call to revoke and the point at which all correct processes have updated their local state to revoked
     • Need to ensure that all processes have revoked the communicator before its context ID can be reused
     • Scenario:
         • Communicator with correct processes A, B, and C is revoked
         • A and B free the revoked communicator and create a new communicator using the old context ID
         • C calls revoke on the old communicator -- what happens at A and B?
         • OR -- C sends a message to A or B, which has posted an ANY_SOURCE receive -- does it match?
     • Several solutions were discussed:
         • Incarnation number -- An additional number on each context ID that becomes a part of the matching
         • Group guards -- Check incoming messages to ensure that the sender is in the group of the communicator
         • Fault tolerant MPI_Comm_free/create -- Enhance the create/free algorithms to quiesce context IDs before they are reused
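     The scenario and the first two proposals can be illustrated with a toy matching model. This is only a sketch under stated assumptions, not how any MPI implementation does matching: the `Envelope` and `Comm` layouts, the function names, the process IDs, and the context ID value 7 are all hypothetical, and real matching also involves tags and ordering that are left out here.

```python
from dataclasses import dataclass

ANY_SOURCE = -1  # stand-in for MPI_ANY_SOURCE in this toy model


@dataclass(frozen=True)
class Envelope:
    """Header carried by a message (hypothetical layout)."""
    context_id: int
    incarnation: int
    source: int  # global id of the sending process


@dataclass(frozen=True)
class Comm:
    """Receiver-side view of a communicator (hypothetical layout)."""
    context_id: int
    incarnation: int
    group: frozenset  # global ids of the member processes


def matches_plain(comm, posted_source, env):
    """Baseline: match on context ID and source only."""
    return (env.context_id == comm.context_id
            and posted_source in (ANY_SOURCE, env.source))


def matches_incarnation(comm, posted_source, env):
    """Incarnation number: the incarnation is part of the match."""
    return (matches_plain(comm, posted_source, env)
            and env.incarnation == comm.incarnation)


def matches_group_guard(comm, posted_source, env):
    """Group guard: the sender must be in the communicator's group."""
    return (matches_plain(comm, posted_source, env)
            and env.source in comm.group)


A, B, C = 0, 1, 2

# A and B freed the revoked communicator and reused context ID 7.
new_comm = Comm(context_id=7, incarnation=1, group=frozenset({A, B}))

# C has not yet seen the revoke and still sends on the old communicator.
stale_from_c = Envelope(context_id=7, incarnation=0, source=C)
# B sends a legitimate message on the new communicator.
fresh_from_b = Envelope(context_id=7, incarnation=1, source=B)

for name, rule in [("plain", matches_plain),
                   ("incarnation", matches_incarnation),
                   ("group guard", matches_group_guard)]:
    print(name,
          "| stale from C:", rule(new_comm, ANY_SOURCE, stale_from_c),
          "| fresh from B:", rule(new_comm, ANY_SOURCE, fresh_from_b))
```

     In this model the baseline rule lets C's stale message match the ANY_SOURCE receive posted at A, while both the incarnation check and the group guard reject it and still accept B's message. The group guard works here only because C is outside the new group; a stale message from a process that belongs to both groups would need the incarnation check.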

  2. RMA Semantics
     • Pavan raised a concern about the definition of RMA window memory in the context of shared memory windows
     • It may be impossible to guarantee that only locations updated in the window are invalid
     • Suggested weakening the semantics so that the entire window is undefined
     • Requires further discussion

  3. Shared Memory
     • What happens if a process with shared memory goes down and another process has posted messages using its shared memory?
     • Yes, this is an implementation issue, but is it possible to do anything about it?
