fixed mem allocation in main#3

Open
vkolodie wants to merge 1 commit into pavanbalaji:wip/pmi from vkolodie:mem_alloc_in_proxy

Conversation

@vkolodie vkolodie commented May 5, 2017

Moved the mallocs to the point where proxy.num_children is known, and changed the number of elements to proxy_params.immediate.proxy.num_children + 1, since element "0" is used for the calling proxy itself.
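The change described above can be sketched as follows. This is a minimal illustration, not the actual hydra code: the struct layout and the `alloc_child_slots` helper are hypothetical, and only the "+1 for element 0" sizing matches the PR description.

```c
#include <stdlib.h>

/* Illustrative stand-in for the relevant part of proxy_params. */
struct proxy_params {
    int num_children;
};

/* Allocate one slot per child plus slot 0 for the calling proxy itself.
 * This must run only after num_children is known, which is the point of
 * the fix. */
int *alloc_child_slots(const struct proxy_params *p)
{
    /* num_children + 1: element "0" is used by the proxy itself. */
    return calloc((size_t)p->num_children + 1, sizeof(int));
}
```

Sizing the array as `num_children` alone would leave no slot for the proxy's own entry and overrun the buffer when index `num_children` is touched.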


raffenet commented Aug 8, 2017

Applied to the wip/pmi branch. This can be closed.

pavanbalaji pushed a commit that referenced this pull request Oct 22, 2017
 Calculate and allocate the number of iovecs that will be used
 by the iovec state machine to generate the put/get/accumulate lists.

 Currently, we are setting the FI_ASYNC_IOV mode bit, which tells libfabric
 that we (the netmod) will provide the storage for all iovec operations.
 However, we do not actually provide that storage, and some providers
 appear to provide storage for iovecs anyway, so we have a
 "double bug" that makes this problem rarely manifest.

 There are a couple ways to fix this:
 1) Disable FI_ASYNC_IOV.  Probably not an optimal option.  We want to avoid
    internal allocations if possible.  If allocations are required anywhere
    in the stack, probably the best place is in the netmod because
    we have the highest level of datatype metadata.  I think what we really
    want to ask libfabric is if a memory copy of the iovec is required.
    In this case, we can allocate it.  If the hardware is going to copy the
    iovec as part of the command, we can use a "per-op" allocation and bypass
    this calculation entirely.
 2) Use a "per-op" allocation of the iovec, and use fi_context to complete.
    This will probably be slow, as injection should be message rate bound
    on fast networks.  By using completions instead of counters, we avoid
    allocation/free of per element storage and the associated overhead.
    The current scheme only uses 1 allocation.
 3) Allocate the required iovecs up front.

 This implements #3, but with some optimizations.  To count the total iovecs
 that will be required, some estimates are used:
 * We scan two elements from each source datatype and
   extrapolate the total iovec count from that.
 * We sum the iovec lists as an upper bound, rather than calculate the
   exact use.  We'll calculate this in an exact manner when we replace
   the existing iovec expansion with direct processing of the datatype.
   We should only be using iovecs we actually touch, so in practice
   this shouldn't be a problem.

Fixes csr/mpich-opa#409

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>