OpenSHMEM
Version and Vendor Query
info_get_version | Return the major and minor version of the library implementation.
info_get_name | Return the name string of the library implementation.
- shmem4py.shmem.info_get_version()
Return the major and minor version of the library implementation.
Library Setup and Exit
init | Allocate and initialize the needed resources.
finalize | Release all the used resources.
global_exit | Force termination of an entire program.
init_thread | Initialize the library with support for the provided thread level.
query_thread | Return the level of thread support provided by the library.
THREAD | Threading support levels.
- shmem4py.shmem.init()
Allocate and initialize the needed resources. Collective.
All PEs must call this routine, or init_thread, before any other OpenSHMEM routine. It must be matched with a call to finalize at the end of the program.
- Return type:
- shmem4py.shmem.finalize()
Release all the used resources. Collective.
This only terminates the OpenSHMEM portion of a program, not the entire program. All processes that represent the PEs will still exist after the call to finalize returns, but they will no longer have access to resources that have been released.
- Return type:
- shmem4py.shmem.global_exit(status=0)
Force termination of an entire program. Can be called by any PE.
- shmem4py.shmem.init_thread(requested=THREAD_MULTIPLE)
Initialize the library with support for the provided thread level.
Either init or init_thread should be used to initialize the program.
- shmem4py.shmem.query_thread()
Return the level of thread support provided by the library.
- Return type:
- class shmem4py.shmem.THREAD(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Threading support levels.
Accessibility Queries
my_pe | Return the number of the calling PE.
n_pes | Return the number of PEs running in a program.
pe_accessible | Return whether a PE is accessible.
addr_accessible | Return whether a local array is accessible from the specified remote PE.
ptr | Return a local view to a symmetric array on the specified PE.
- shmem4py.shmem.pe_accessible(pe)
Return whether a PE is accessible.
- shmem4py.shmem.addr_accessible(addr, pe)
Return whether a local array is accessible from the specified remote PE.
Memory Management
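alloc | Return memory allocated from the symmetric heap.
free | Deallocate memory of mem.
fromalloc | Return a NumPy array interpreted from the buffer allocated in the symmetric memory.
new_array | Return a new NumPy array allocated in the symmetric memory.
del_array | Delete the array.
array | Return a new NumPy array allocated in the symmetric memory and initialize contents with obj.
empty | Return a new empty NumPy array allocated in the symmetric memory.
zeros | Return a new 0-initialized NumPy array allocated in the symmetric memory.
ones | Return a new 1-initialized NumPy array allocated in the symmetric memory.
full | Return a new fill_value-initialized NumPy array allocated in the symmetric memory.
MALLOC | Memory allocation hints.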
- shmem4py.shmem.alloc(count, size=1, align=None, hints=None, clear=True)
Return memory allocated from the symmetric heap.
- Parameters:
count (int) – Number of elements to allocate.
size (int) – Size of each element in bytes.
align (int | None) – Byte alignment of the block allocated from the symmetric heap.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator.
clear (bool) – If True, the allocated memory is cleared to zero.
- Return type:
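The way hint bits combine can be illustrated without an OpenSHMEM runtime. The sketch below uses a hypothetical stand-in enumeration, DemoHint, whose member names and values are invented for illustration and are not the members of MALLOC. It only shows how several hints are merged into one integer with the bitwise OR operator and tested afterwards.

```python
import enum


class DemoHint(enum.IntFlag):
    """Hypothetical stand-in for an allocation-hint enumeration (not MALLOC)."""
    ATOMICS = 1   # invented value for illustration
    SIGNALS = 2   # invented value for illustration


def combine_hints(*hints):
    """Merge any number of hint bits into one integer with bitwise OR."""
    combined = 0
    for hint in hints:
        combined |= int(hint)
    return combined


hints = combine_hints(DemoHint.ATOMICS, DemoHint.SIGNALS)
print(hints)                            # 3
print(bool(hints & DemoHint.SIGNALS))   # True
```

A value built this way is what would be passed as the hints argument; passing None requests no hints.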
- shmem4py.shmem.free(mem)
Deallocate memory of mem.
- Parameters:
mem (memoryview | NDArray[Any]) – The object to be deallocated.
- Return type:
- shmem4py.shmem.fromalloc(mem, shape=None, dtype=None, order='C')
Return a NumPy array interpreted from the buffer allocated in the symmetric memory.
- Parameters:
mem (memoryview) – The memory to be interpreted as a NumPy array.
shape (int | Sequence[int] | None) – The shape of the array. If None, the shape is inferred from the size of the memory.
dtype (DTypeLike) – The data type of the array. If None, the data type is inferred from the memory contents.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
- Return type:
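The difference between the two order values comes down to which axis varies fastest in the flat buffer. The following is a small pure-Python sketch, independent of shmem4py and NumPy; flat_index is a hypothetical helper written for this illustration that maps a multi-dimensional index to an element offset under either layout.

```python
def flat_index(index, shape, order="C"):
    """Element offset of `index` in a flat buffer holding an array of `shape`."""
    if order == "C":
        axes = reversed(range(len(shape)))   # last axis varies fastest (row major)
    elif order == "F":
        axes = range(len(shape))             # first axis varies fastest (column major)
    else:
        raise ValueError("order must be 'C' or 'F'")
    offset, stride = 0, 1
    for axis in axes:
        offset += index[axis] * stride
        stride *= shape[axis]
    return offset


# Element (1, 0) of a 2 x 3 array sits at different offsets in the two layouts.
print(flat_index((1, 0), (2, 3), order="C"))   # 3
print(flat_index((1, 0), (2, 3), order="F"))   # 1
```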
- shmem4py.shmem.new_array(shape, dtype=float, order='C', *, align=None, hints=None, clear=True)
Return a new NumPy array allocated in the symmetric memory.
- Parameters:
dtype (DTypeLike) – The data type of the array.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
clear (bool) – If True, the allocated memory is cleared to zero. Keyword argument only.
- Return type:
- shmem4py.shmem.del_array(a)
Delete the array.
- shmem4py.shmem.array(obj, dtype=None, *, order='K', align=None, hints=None)
Return a new NumPy array allocated in the symmetric memory and initialize contents with obj.
- Parameters:
obj (Any) – The object from which a NumPy array is to be initialized.
dtype (DTypeLike) – The data type of the array. If None, the data type is inferred from the memory contents.
order (Literal['K', 'A', 'C', 'F']) – The memory layout of the array. See numpy.array for the explanation of the options. Keyword argument only.
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
- Return type:
- shmem4py.shmem.empty(shape, dtype=float, order='C', *, align=None, hints=None)
Return a new empty NumPy array allocated in the symmetric memory.
- Parameters:
dtype (DTypeLike) – The data type of the array.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
- Return type:
- shmem4py.shmem.zeros(shape, dtype=float, order='C', *, align=None, hints=None)
Return a new 0-initialized NumPy array allocated in the symmetric memory.
- Parameters:
dtype (DTypeLike) – The data type of the array.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
- Return type:
- shmem4py.shmem.ones(shape, dtype=float, order='C', *, align=None, hints=None)
Return a new 1-initialized NumPy array allocated in the symmetric memory.
- Parameters:
dtype (DTypeLike) – The data type of the array.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
- Return type:
- shmem4py.shmem.full(shape, fill_value, dtype=None, order='C', *, align=None, hints=None)
Return a new fill_value-initialized NumPy array allocated in the symmetric memory.
- Parameters:
fill_value (int | float | complex | number) – The value to fill the array with.
dtype (DTypeLike) – The data type of the array.
order (Literal['C', 'F']) – The memory layout of the array. If 'C', the array is contiguous in memory (row major). If 'F', the array is Fortran contiguous (column major).
align (int | None) – Byte alignment of the block allocated in the symmetric memory. Keyword argument only.
hints (int | None) – A bit array of hints provided by the user to the implementation. Valid hints are defined as enumerations in MALLOC and can be combined using the bitwise OR operator. Keyword argument only.
- Return type:
- class shmem4py.shmem.MALLOC(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Memory allocation hints.
Team Management
Team | Team management.
Team.destroy | Destroy the team.
Team.split_strided | Return a new team from a subset of the existing parent team PEs.
Team.get_config | Return the configuration parameters of the team.
Team.my_pe | Return the number of the calling PE within the team.
Team.n_pes | Return the number of PEs in the team.
Team.translate_pe | Translate a given PE number from one team to the corresponding PE number in another team.
Team.create_ctx | Create a communication context from the team.
Team.sync | Register the arrival of a PE at a synchronization point.
- class shmem4py.shmem.Team(team=None)
Team management.
- Parameters:
team (Optional[Union[Team, TeamHandle]])
- Return type:
- split_strided(start=0, stride=1, size=None, config=None, **kwargs)
Return a new team from a subset of the existing parent team PEs.
This routine must be called by all PEs in the parent team.
- Parameters:
start (int) – The lowest PE number of the subset of PEs from the parent team that will form the new team.
stride (int) – The stride between team PE numbers in the parent team that comprise the subset of PEs that will form the new team.
size (int | None) – The number of PEs from the parent team in the subset of PEs that will form the new team. If None, the size is automatically determined.
config (Mapping[str, int] | None) – Configuration parameters for the new team. Currently, only the SHMEM_TEAM_NUM_CONTEXTS key is supported.
**kwargs (int) – Additional configuration parameters for the new team.
- Return type:
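The effect of start, stride and size can be worked out on paper before any team is created. The sketch below is a pure-Python model and does not call shmem4py: strided_members and translate are hypothetical helpers, and the model assumes that the new team numbers its PEs consecutively from 0 in order of increasing parent PE number.

```python
def strided_members(start, stride, size):
    """Parent-team PE numbers selected by (start, stride, size)."""
    return [start + i * stride for i in range(size)]


def translate(parent_pe, start, stride, size):
    """PE number inside the new team, or None if parent_pe is not a member."""
    members = strided_members(start, stride, size)
    return members.index(parent_pe) if parent_pe in members else None


print(strided_members(start=1, stride=2, size=3))   # [1, 3, 5]
print(translate(3, start=1, stride=2, size=3))      # 1
print(translate(4, start=1, stride=2, size=3))      # None
```

Returning None for a non-member is a choice made for this model only; the value the library reports for a PE outside the team is not stated here.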
- translate_pe(pe=None, team=None)
Translate a given PE number from one team to the corresponding PE number in another team.
- create_ctx(options=0)
Create a communication context from the team.
Communication Management
Ctx | Communication context.
Ctx.create | Return a new communication context.
Ctx.destroy | Destroy the communication context.
Ctx.get_team | Retrieve the team associated with the communication context.
fence | Ensure ordering of delivery of operations on symmetric data objects.
quiet | Wait for completion of outstanding operations on symmetric data objects issued by a PE.
CTX | Context creation options.
- class shmem4py.shmem.Ctx(ctx=None)
Communication context.
- static create(options=0, team=None)
Return a new communication context.
- Parameters:
options (int) – The set of options requested for the given context. Valid options are the enumerations listed in the CTX class. Multiple options may be requested by combining them with a bitwise OR operation. 0 can be used if no options are requested.
team (Team | None) – If the team is specified, the communication context is created from this team.
- Return type:
- fence()
Ensure ordering of delivery of operations on symmetric data objects.
All operations on symmetric data objects issued to a particular PE on the given context prior to the call to fence are guaranteed to be delivered before any subsequent operations on symmetric data objects to the same PE.
- Return type:
Remote Memory Access
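put | Copy data from local source to target on PE pe.
get | Copy data from source on PE pe to local target.
iput | Copy strided data from local source to target on PE pe.
iget | Copy strided data from source on PE pe to local target.
put_nbi | Nonblocking copy data from local source to target on PE pe.
get_nbi | Nonblocking copy data from source on PE pe to local target.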
- shmem4py.shmem.put(target, source, pe, size=None, ctx=None)
Copy data from local source to target on PE pe.
- Parameters:
- Return type:
- shmem4py.shmem.get(target, source, pe, size=None, ctx=None)
Copy data from source on PE pe to local target.
- Parameters:
- Return type:
- shmem4py.shmem.iput(target, source, pe, tst=1, sst=1, size=None, ctx=None)
Copy strided data from local source to target on PE pe.
- Parameters:
target (NDArray[T]) – Symmetric destination array.
source (NDArray[T]) – Local array containing the data to be copied.
pe (int) – PE number of the remote PE.
tst (int) – The stride between consecutive elements of the target array. The stride is scaled by the element size of the target array. A value of 1 indicates contiguous data.
sst (int) – The stride between consecutive elements of the source array. The stride is scaled by the element size of the source array. A value of 1 indicates contiguous data.
ctx (Ctx | None) – A context handle specifying the context on which to perform the operation.
- Return type:
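The role of tst and sst is easiest to see on plain lists. The sketch below is a local, pure-Python model of the strided copy and involves no remote PE: strided_copy is a hypothetical helper, and its handling of size=None (copy as many elements as both strides allow) is an assumption of this sketch, not documented library behaviour.

```python
def strided_copy(target, source, tst=1, sst=1, size=None):
    """Copy `size` elements so that target[i * tst] = source[i * sst]."""
    if size is None:
        # Assumption of this sketch: use the largest count both arrays can hold.
        size = min((len(target) + tst - 1) // tst,
                   (len(source) + sst - 1) // sst)
    for i in range(size):
        target[i * tst] = source[i * sst]
    return target


source = [10, 11, 12, 13, 14, 15]
target = [0] * 6
strided_copy(target, source, tst=2, sst=1, size=3)
print(target)   # [10, 0, 11, 0, 12, 0]
```

With tst=2 every other slot of the destination is written, while sst=1 reads the source contiguously; swapping the two strides gathers instead of scatters.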
- shmem4py.shmem.iget(target, source, pe, tst=1, sst=1, size=None, ctx=None)
Copy strided data from source on PE pe to local target.
- Parameters:
target (NDArray[T]) – Local array to be updated.
source (NDArray[T]) – Symmetric source array.
pe (int) – PE number of the remote PE.
tst (int) – The stride between consecutive elements of the target array. The stride is scaled by the element size of the target array. A value of 1 indicates contiguous data.
sst (int) – The stride between consecutive elements of the source array. The stride is scaled by the element size of the source array. A value of 1 indicates contiguous data.
ctx (Ctx | None) – A context handle specifying the context on which to perform the operation.
- Return type:
- shmem4py.shmem.put_nbi(target, source, pe, size=None, ctx=None)
Nonblocking copy data from local source to target on PE pe.
- Parameters:
- Return type:
- shmem4py.shmem.get_nbi(target, source, pe, size=None, ctx=None)
Nonblocking copy data from source on PE pe to local target.
- Parameters:
- Return type:
Atomic Memory Operations
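atomic_op | Perform operation op on value and target on PE pe.
atomic_fetch_op | Perform operation op on value and target on PE pe and return target’s prior value.
atomic_fetch_op_nbi | Perform operation op on value and target on PE pe and fetch target’s prior value to fetch.
AMO | Atomic Memory Operations.
atomic_set | Write value into target on PE pe.
atomic_inc | Increment target array element on PE pe.
atomic_add | Add value to target on PE pe and atomically update target.
atomic_and | Perform bitwise AND on value and target on PE pe.
atomic_or | Perform bitwise OR on value and target on PE pe.
atomic_xor | Perform bitwise XOR on value and target on PE pe.
atomic_fetch | Return the value of source on PE pe.
atomic_swap | Write value into target on PE pe and return the prior value.
atomic_compare_swap | Conditionally update target on PE pe and return its prior value.
atomic_fetch_inc | Increment target on PE pe and return its prior value.
atomic_fetch_add | Add value to target on PE pe and return its prior value.
atomic_fetch_and | Perform a bitwise AND on value and target at PE pe and return target’s prior value.
atomic_fetch_or | Perform a bitwise OR on value and target at PE pe and return target’s prior value.
atomic_fetch_xor | Perform a bitwise XOR on value and target at PE pe and return target’s prior value.
atomic_fetch_nbi | Fetch the value of source on PE pe to local fetch.
atomic_swap_nbi | Write value into target on PE pe and fetch its prior value to fetch.
atomic_compare_swap_nbi | Conditionally update target and fetch its prior value to fetch.
atomic_fetch_inc_nbi | Increment target on PE pe and fetch its prior value to fetch.
atomic_fetch_add_nbi | Add value to target on PE pe and fetch its prior value to fetch.
atomic_fetch_and_nbi | Perform bitwise AND on target on PE pe and fetch its prior value to fetch.
atomic_fetch_or_nbi | Perform bitwise OR on target on PE pe and fetch its prior value to fetch.
atomic_fetch_xor_nbi | Perform bitwise XOR on target on PE pe and fetch its prior value to fetch.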
- shmem4py.shmem.atomic_op(target, value, op, pe, ctx=None)
Perform operation op on value and target on PE pe.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the operation.
op (AMO) – The operation to be performed.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_op(target, value, op, pe, ctx=None)
Perform operation op on value and target on PE pe and return target’s prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the operation.
op (AMO) – The operation to be performed.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_op_nbi(fetch, target, value, op, pe, ctx=None)
Perform operation op on value and target on PE pe and fetch target’s prior value to fetch.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the operation.
op (AMO) – The operation to be performed.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- class shmem4py.shmem.AMO(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Atomic Memory Operations.
- shmem4py.shmem.atomic_set(target, value, pe, ctx=None)
Write value into target on PE pe.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 where data will be written.
value (int | float | complex | number) – The operand to the atomic set operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_inc(target, pe, ctx=None)
Increment target array element on PE pe.
- shmem4py.shmem.atomic_add(target, value, pe, ctx=None)
Add value to target on PE pe and atomically update target.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the element that will be modified.
value (int | float | complex | number) – The operand to the atomic add operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_and(target, value, pe, ctx=None)
Perform bitwise AND on value and target on PE pe.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the element that will be modified.
value (int | float | complex | number) – The operand to the bitwise AND operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_or(target, value, pe, ctx=None)
Perform bitwise OR on value and target on PE pe.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the element that will be modified.
value (int | float | complex | number) – The operand to the bitwise OR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_xor(target, value, pe, ctx=None)
Perform bitwise XOR on value and target on PE pe.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the element that will be modified.
value (int | float | complex | number) – The operand to the bitwise XOR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch(source, pe, ctx=None)
Return the value of source on PE pe.
- Parameters:
- Return type:
- shmem4py.shmem.atomic_swap(target, value, pe, ctx=None)
Write value into target on PE pe and return the prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The value to be atomically written to the remote PE.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_compare_swap(target, cond, value, pe, ctx=None)
Conditionally update target on PE pe and return its prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
cond (int | float | complex | number) – cond is compared to the remote target value. If cond and the remote target are equal, then value is swapped into the target; otherwise, the target is unchanged.
value (int | float | complex | number) – The value to be atomically written to the remote PE.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
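The conditional-update rule can be traced with a small single-process model. The sketch below does not use shmem4py and is not atomic in any real sense: compare_swap is a hypothetical helper that applies the same rule (write value only if the current content equals cond, always return the prior content) to a one-element list standing in for the symmetric target.

```python
def compare_swap(cell, cond, value):
    """Model of compare-and-swap on a one-element list; returns the prior value."""
    prior = cell[0]
    if prior == cond:
        cell[0] = value
    return prior


lock = [0]                        # 0 means free, any other value is an owner id
print(compare_swap(lock, 0, 7))   # 0: the swap happened, lock is now [7]
print(compare_swap(lock, 0, 9))   # 7: cond did not match, lock is unchanged
print(lock)                       # [7]
```

Comparing the returned prior value with cond is how a caller tells whether its update took effect, which is the usual basis for building a simple lock on top of this operation.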
- shmem4py.shmem.atomic_fetch_inc(target, pe, ctx=None)
Increment target on PE pe and return its prior value.
- Parameters:
- Return type:
- shmem4py.shmem.atomic_fetch_add(target, value, pe, ctx=None)
Add value to target on PE pe and return its prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the atomic fetch-and-add operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_and(target, value, pe, ctx=None)
Perform a bitwise AND on value and target at PE pe and return target’s prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise AND operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_or(target, value, pe, ctx=None)
Perform a bitwise OR on value and target at PE pe and return target’s prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise OR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_xor(target, value, pe, ctx=None)
Perform a bitwise XOR on value and target at PE pe and return target’s prior value.
- Parameters:
target (NDArray[Any]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise XOR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_nbi(fetch, source, pe, ctx=None)
Fetch the value of source on PE pe to local fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
source (NDArray[T]) – Symmetric array of size 1 containing the element that will be fetched.
pe (int) – The PE number from which source is to be fetched.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_swap_nbi(fetch, target, value, pe, ctx=None)
Write value into target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The value to be atomically written to the remote PE.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_compare_swap_nbi(fetch, target, cond, value, pe, ctx=None)
Conditionally update target and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
cond (int | float | complex | number) – cond is compared to the remote target value. If cond and the remote target are equal, then value is swapped into the target; otherwise, the target is unchanged.
value (int | float | complex | number) – The value to be atomically written to the remote PE.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_inc_nbi(fetch, target, pe, ctx=None)
Increment target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_add_nbi(fetch, target, value, pe, ctx=None)
Add value to target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the atomic fetch-and-add operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_and_nbi(fetch, target, value, pe, ctx=None)
Perform bitwise AND on target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise AND operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_or_nbi(fetch, target, value, pe, ctx=None)
Perform bitwise OR on target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise OR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- shmem4py.shmem.atomic_fetch_xor_nbi(fetch, target, value, pe, ctx=None)
Perform bitwise XOR on target on PE pe and fetch its prior value to fetch.
Nonblocking. The operation is considered complete after a subsequent call to quiet.
- Parameters:
fetch (NDArray[T]) – Local array of size 1 to be updated.
target (NDArray[T]) – Symmetric array of size 1 containing the destination value.
value (int | float | complex | number) – The operand to the bitwise XOR operation.
pe (int) – The PE number on which target is to be updated.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
Signaling Operations
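new_signal | Create a signal data object.
del_signal | Delete a signal data object.
signal_fetch | Fetch the signal update on a local data object.
put_signal | Copy local source to target on PE pe and update a remote flag to signal completion.
put_signal_nbi | Copy local source to target on PE pe and update a remote flag to signal completion. Nonblocking.
SIGNAL | Signal operations.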
- shmem4py.shmem.del_signal(signal)
Delete a signal data object.
- shmem4py.shmem.signal_fetch(signal)
Fetch the signal update on a local data object.
- shmem4py.shmem.put_signal(target, source, pe, signal, value, sigop, size=None, ctx=None)
Copy local source to target on PE pe and update a remote flag to signal completion.
- Parameters:
target (NDArray[T]) – The symmetric destination array to be updated on the remote PE.
source (NDArray[T]) – Local array containing the data to be copied.
pe (int) – PE number of the remote PE.
signal (SigAddr) – Symmetric signal object to be updated on the remote PE as a signal.
value (int) – The value that is used for updating the remote signal data object.
sigop (SIGNAL) – Signal operator that represents the type of update to be performed on the remote signal data object.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
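The pairing of a data transfer with a flag update can be modelled locally. The sketch below is a pure-Python illustration that does not call shmem4py: put_with_signal is a hypothetical helper, the strings "set" and "add" are stand-ins for signal operators, and the "add" branch is an assumption based on the OpenSHMEM signal-add operator rather than on anything listed in this section.

```python
def put_with_signal(target, source, signal, value, sigop):
    """Copy source into target, then update the one-element signal list."""
    target[:len(source)] = source          # the data transfer
    if sigop == "set":
        signal[0] = value                  # overwrite the flag
    elif sigop == "add":
        signal[0] += value                 # accumulate into the flag
    else:
        raise ValueError(f"unknown signal operation: {sigop!r}")
    return signal[0]


remote_data = [0, 0, 0]
remote_flag = [0]
put_with_signal(remote_data, [1, 2, 3], remote_flag, 1, "set")
print(remote_data, remote_flag)   # [1, 2, 3] [1]
put_with_signal(remote_data, [4, 5, 6], remote_flag, 1, "add")
print(remote_data, remote_flag)   # [4, 5, 6] [2]
```

In this model the flag changes only after the data has been written, so a receiver that sees the new flag value can rely on the data being in place.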
- shmem4py.shmem.put_signal_nbi(target, source, pe, signal, value, sigop, size=None, ctx=None)
Copy local source to target on PE pe and update a remote flag to signal completion. Nonblocking.
This routine returns after initiating the operation. The operation is considered complete after a subsequent call to quiet.
- Parameters:
target (NDArray[T]) – The symmetric destination array to be updated on the remote PE.
source (NDArray[T]) – Local array containing the data to be copied.
pe (int) – PE number of the remote PE.
signal (SigAddr) – Symmetric signal object to be updated on the remote PE as a signal.
value (int) – The value that is used for updating the remote signal data object.
sigop (SIGNAL) – Signal operator that represents the type of update to be performed on the remote signal data object.
ctx (Ctx | None) – The context on which to perform the operation. If None, the default context is used.
- Return type:
- class shmem4py.shmem.SIGNAL(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Signal operations.
- SET
An update to signal data object is an atomic set operation. It writes an unsigned 64-bit value as a signal into the signal data object on a remote PE as an atomic operation.
- Type:
Collective Operations
barrier_all | Register the arrival of a PE at a barrier, complete updates, wait for others.
sync_all | Register the arrival of a PE at a synchronization point, wait for all others.
sync | Register the arrival of a PE at a synchronization point, wait for others.
broadcast | Copy the source from root to target on participating PEs.
collect | Concatenate blocks of data from multiple PEs to an array in every PE participating in the collective routine.
fcollect | Concatenate blocks of data from multiple PEs to an array in every PE participating in the collective routine.
alltoall | Exchange data elements with all other participating PEs.
alltoalls | Exchange strided data elements with all other participating PEs.
reduce | Perform a specified reduction across a set of PEs.
OP | Reduction operation.
and_reduce | Perform a bitwise AND reduction across a set of PEs.
or_reduce | Perform a bitwise OR reduction across a set of PEs.
xor_reduce | Perform a bitwise exclusive OR (XOR) reduction across a set of PEs.
max_reduce | Perform a maximum-value reduction across a set of PEs.
min_reduce | Perform a minimum-value reduction across a set of PEs.
sum_reduce | Perform a sum reduction across a set of PEs.
prod_reduce | Perform a product reduction across a set of PEs.
- shmem4py.shmem.barrier_all()
Register the arrival of a PE at a barrier, complete updates, wait for others.
This routine blocks the calling PE until all PEs have called barrier_all. Prior to synchronizing with other PEs, barrier_all ensures completion of all previously issued memory stores and remote memory updates issued on the default context.
- Return type:
- shmem4py.shmem.sync_all()
Register the arrival of a PE at a synchronization point, wait for all others.
This routine blocks the calling PE until all PEs in the world team have called sync_all.
- Return type:
- shmem4py.shmem.sync(team=None)
Register the arrival of a PE at a synchronization point, wait for others.
This routine does not return until all other PEs in a given team or active set arrive at this synchronization point.
- shmem4py.shmem.broadcast(target, source, root, size=None, team=None)
Copy the source from root to target on participating PEs.
- shmem4py.shmem.collect(target, source, size=None, team=None)
Concatenate blocks of data from multiple PEs to an array in every PE participating in the collective routine.
size can vary from PE to PE; MPI_Allgatherv equivalent.
Performs a collective operation to concatenate size data items from the source array into the target array.
- Parameters:
target (NDArray[T]) – Symmetric destination array large enough to accept the concatenation of the source arrays on all participating PEs.
source (NDArray[T]) – Symmetric source array.
size (int | None) – The number of elements to be communicated.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.fcollect(target, source, size=None, team=None)
Concatenate blocks of data from multiple PEs to an array in every PE participating in the collective routine.
size must be the same value in all participating PEs; MPI_Allgather equivalent.
- Parameters:
target (NDArray[T]) – Symmetric destination array large enough to accept the concatenation of the source arrays on all participating PEs.
source (NDArray[T]) – Symmetric source array.
size (int | None) – The number of elements to be communicated.
team (Team | None) – The team over which to perform the operation.
- Return type:
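The difference between collect and fcollect is easiest to see on the data layout. The following is a minimal pure-Python model of that layout, not a call into shmem4py: `collect_model` and `fcollect_model` are hypothetical helper names, and each inner list stands in for the source array of one PE.

```python
def collect_model(sources):
    """Model of collect: sources[pe] is the block contributed by that PE.

    Returns (targets, offsets). targets[pe] is the array every PE ends up
    with; offsets[pe] is where PE pe's block starts inside it.
    """
    sizes = [len(block) for block in sources]
    # Blocks are laid out in PE order, so each offset is the sum of the
    # sizes contributed by lower-numbered PEs.
    offsets = [sum(sizes[:pe]) for pe in range(len(sizes))]
    combined = [item for block in sources for item in block]
    # Every participating PE receives the same concatenation.
    targets = [list(combined) for _ in sources]
    return targets, offsets


def fcollect_model(sources):
    """Model of fcollect: like collect, but every block has the same size."""
    if len({len(block) for block in sources}) > 1:
        raise ValueError("fcollect requires the same size on every PE")
    return collect_model(sources)
```

With fcollect the offsets are simply `pe * size`, which is why the target can be sized as `size` times the number of PEs without any prior exchange of sizes.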
- shmem4py.shmem.alltoall(target, source, size=None, team=None)
Exchange data elements with all other participating PEs.
The total size of each PE’s source object and target object is size times the size of an element times N, where N equals the number of PEs participating in the operation. The source object contains N blocks of data (where the size of each block is defined by size) and each block of data is sent to a different PE.
- Parameters:
target – Symmetric destination array large enough to receive the combined total of size elements from each PE in the active set.
source – Symmetric source array that contains size elements of data for each PE in the active set, ordered according to destination PE.
size – The number of elements to exchange for each PE.
team – The team over which to perform the operation.
- Return type:
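The block exchange described above can be sketched as a small pure-Python model. It does not call shmem4py; `alltoall_model` is a hypothetical helper, and the placement of received blocks in sending-PE order follows the OpenSHMEM definition of this routine.

```python
def alltoall_model(sources, size):
    """sources[i] is PE i's source array: N blocks of `size` elements each."""
    n = len(sources)
    for src in sources:
        if len(src) != n * size:
            raise ValueError("each source must hold N * size elements")
    targets = []
    for receiver in range(n):
        target = []
        for sender in range(n):
            # Block `receiver` of the sender's source becomes block
            # `sender` of the receiver's target.
            start = receiver * size
            target.extend(sources[sender][start:start + size])
        targets.append(target)
    return targets
```

For example, with three PEs and `size=1`, the sources `[0, 1, 2]`, `[10, 11, 12]`, `[20, 21, 22]` give PE 0 the target `[0, 10, 20]`, which is a transpose of the block matrix.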
- shmem4py.shmem.alltoalls(target, source, tst=1, sst=1, size=None, team=None)
Exchange strided data elements with all other participating PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array large enough to receive the combined total of size elements from each PE in the active set.
source (NDArray[T]) – Symmetric source array that contains size elements of data for each PE in the active set, ordered according to destination PE.
tst (int) – The stride between consecutive elements of the target data object. The stride is scaled by the element size.
sst (int) – The stride between consecutive elements of the source data object. The stride is scaled by the element size.
size (int | None) – The number of elements to exchange for each PE.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.reduce(target, source, op=OP_SUM, size=None, team=None)
Perform a specified reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
op (OP) – The reduction operation to perform.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
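A reduction combines the arrays element by element across PEs, so position k of the result depends only on position k of each source. The sketch below models that with plain lists. `reduce_model` and the string labels in `OPS` are hypothetical and are not members of the OP class.

```python
import operator
from functools import reduce as fold

# Hypothetical labels for this model only; they are not OP members.
OPS = {
    "sum": operator.add,
    "prod": operator.mul,
    "max": max,
    "min": min,
}


def reduce_model(sources, op="sum", size=None):
    """sources[pe] is PE pe's source array; returns one target per PE."""
    if size is None:
        size = len(sources[0])
    combine = OPS[op]
    # Element k of the result folds element k of every PE's source.
    result = [fold(combine, (src[k] for src in sources)) for k in range(size)]
    # Every participating PE receives the same result.
    return [list(result) for _ in sources]
```

Only the first `size` elements take part, which matches the requirement that both arrays hold at least `size` elements.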
- class shmem4py.shmem.OP(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Reduction operation.
- shmem4py.shmem.and_reduce(target, source, size=None, team=None)
Perform a bitwise AND reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.or_reduce(target, source, size=None, team=None)
Perform a bitwise OR reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.xor_reduce(target, source, size=None, team=None)
Perform a bitwise exclusive OR (XOR) reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
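The three bitwise reductions are often used on per-PE flag words. As a rough illustration (a pure-Python model with one integer per PE, using the hypothetical helper `bitwise_reductions`), AND keeps the bits set on every PE, OR keeps the bits set on at least one PE, and XOR keeps the bits set on an odd number of PEs.

```python
import operator
from functools import reduce as fold


def bitwise_reductions(masks):
    """masks[pe] is the integer flag word held by PE pe."""
    return {
        "and": fold(operator.and_, masks),  # bits set on every PE
        "or": fold(operator.or_, masks),    # bits set on at least one PE
        "xor": fold(operator.xor, masks),   # bits set on an odd number of PEs
    }
```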
- shmem4py.shmem.max_reduce(target, source, size=None, team=None)
Perform a maximum-value reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.min_reduce(target, source, size=None, team=None)
Perform a minimum-value reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.sum_reduce(target, source, size=None, team=None)
Perform a sum reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
- shmem4py.shmem.prod_reduce(target, source, size=None, team=None)
Perform a product reduction across a set of PEs.
- Parameters:
target (NDArray[T]) – Symmetric destination array of length at least size elements, where the result of the reduction routine will be stored.
source (NDArray[T]) – Symmetric source array of length at least size elements, that contains one element for each separate reduction routine.
size (int | None) – The number of elements to perform the reduction on.
team (Team | None) – The team over which to perform the operation.
- Return type:
Point-To-Point Synchronization
|
Wait until a variable satisfies a condition. |
|
Wait until all variables satisfy a condition. |
|
Wait until any one variable satisfies a condition. |
|
Wait until at least one variable satisfies a condition. |
|
Wait until all variables satisfy the specified conditions. |
|
Wait until any one variable satisfies the specified conditions. |
|
Wait until at least one variable satisfies the specified conditions. |
|
Indicate whether a variable on the local PE meets a condition. |
|
Indicate whether all variables on the local PE meet a condition. |
|
Indicate whether any one variable on the local PE meets a condition. |
|
Indicate whether at least one variable on the local PE meets a condition. |
|
Indicate whether all variables on the local PE meet the specified conditions. |
|
Indicate whether any one variable on the local PE meets its specified condition. |
|
Indicate whether at least one variable on the local PE meets its specified condition. |
|
Wait for a variable on the local PE to change from a signaling operation. |
|
Comparison operator. |
- shmem4py.shmem.wait_until(ivar, cmp, value)
Wait until a variable satisfies a condition.
Blocks until the value ivar satisfies the condition ivar cmp value at the calling PE, where cmp is the comparison operator.
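The condition `ivar cmp value` can be read as an ordinary comparison that is re-evaluated until it holds. The sketch below is a pure-Python model of that behaviour, not the library routine. The string labels mirror the OpenSHMEM comparison names and are assumptions about how CMP members are spelled; `read_ivar` and `max_polls` exist only so that the model can terminate, whereas the real routine blocks without a poll limit.

```python
import operator

# Labels follow the OpenSHMEM comparison constants (assumed spelling).
COMPARE = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
}


def condition_holds(ivar, cmp, value):
    """Evaluate `ivar cmp value` once."""
    return COMPARE[cmp](ivar, value)


def wait_until_model(read_ivar, cmp, value, max_polls=1000):
    """Re-read the variable until the condition holds; return the poll count."""
    for polls in range(1, max_polls + 1):
        if condition_holds(read_ivar(), cmp, value):
            return polls
    raise TimeoutError("condition not satisfied within max_polls")
```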
- shmem4py.shmem.wait_until_all(ivars, cmp, value, status=None)
Wait until all variables satisfy a condition.
Blocks until all values specified in ivars not excluded by status satisfy the condition ivars[i] cmp value at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Return type:
- shmem4py.shmem.wait_until_any(ivars, cmp, value, status=None)
Wait until any one variable satisfies a condition.
Blocks until any one entry in the wait set specified by ivars not excluded by status satisfies the condition ivars[i] cmp value at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Returns:
The index of entry i of ivars that satisfies the condition.
- Return type:
- shmem4py.shmem.wait_until_some(ivars, cmp, value, status=None)
Wait until at least one variable satisfies a condition.
Blocks until at least one entry in the wait set specified by ivars not excluded by status satisfies the condition ivars[i] cmp value at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Returns:
Indices of entries of ivars that satisfy the condition.
- Return type:
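The all, any, and some variants differ only in how they summarize the same wait set. The following pure-Python model shows how the status mask selects that set and what each variant reports once the variables have settled. The helper names are hypothetical, the condition is passed as a plain predicate, and picking the first hit for "any" is a simplification: the text above only promises the index of one satisfying entry.

```python
def wait_set(ivars, status=None):
    """Indices that take part: those whose status entry is zero."""
    if status is None:
        return list(range(len(ivars)))
    if len(status) != len(ivars):
        raise ValueError("status must have len(ivars) entries")
    return [i for i, excluded in enumerate(status) if not excluded]


def summarize(ivars, predicate, status=None):
    """Report what the all/any/some variants would observe right now."""
    members = wait_set(ivars, status)
    hits = [i for i in members if predicate(ivars[i])]
    return {
        "all": len(hits) == len(members),  # every member satisfies it
        "any": hits[0] if hits else None,  # one satisfying index (model: first)
        "some": hits,                      # every satisfying index
    }
```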
- shmem4py.shmem.wait_until_all_vector(ivars, cmp, values, status=None)
Wait until all variables satisfy the specified conditions.
Blocks until all values specified in ivars not excluded by status satisfy the condition ivars[i] cmp values[i] at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Return type:
- shmem4py.shmem.wait_until_any_vector(ivars, cmp, values, status=None)
Wait until any one variable satisfies the specified conditions.
Blocks until any one value specified in ivars not excluded by status satisfies the condition ivars[i] cmp values[i] at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Returns:
The index of entry i of ivars that satisfies the condition.
- Return type:
- shmem4py.shmem.wait_until_some_vector(ivars, cmp, values, status=None)
Wait until at least one variable satisfies the specified conditions.
Blocks until at least one value specified in ivars not excluded by status satisfies the condition ivars[i] cmp values[i] at the calling PE, where cmp is the comparison operator.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be compared.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the wait set. Nonzero values exclude the corresponding element from the wait set.
- Returns:
Indices of entries of ivars that satisfy the condition.
- Return type:
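The vector variants compare each variable against its own entry of values instead of a single scalar. A minimal pure-Python model of that pairing follows. `satisfied_indices_vector` is a hypothetical helper, and the comparison is passed as a Python callable rather than a CMP member.

```python
import operator


def satisfied_indices_vector(ivars, compare, values, status=None):
    """Indices i in the wait set where compare(ivars[i], values[i]) holds."""
    if len(values) != len(ivars):
        raise ValueError("values must have len(ivars) entries")
    if status is None:
        status = [0] * len(ivars)
    return [
        i
        for i, (ivar, value, excluded) in enumerate(zip(ivars, values, status))
        # Nonzero status entries are excluded from the wait set.
        if not excluded and compare(ivar, value)
    ]
```

For instance, `satisfied_indices_vector([1, 5, 3], operator.ge, [1, 6, 2])` checks `1 >= 1`, `5 >= 6`, and `3 >= 2` independently.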
- shmem4py.shmem.test(ivar, cmp, value)
Indicate whether a variable on the local PE meets a condition.
- shmem4py.shmem.test_all(ivars, cmp, value, status=None)
Indicate whether all variables on the local PE meet a condition.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Return type:
- shmem4py.shmem.test_any(ivars, cmp, value, status=None)
Indicate whether any one variable on the local PE meets a condition.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Returns:
The index of entry i of ivars that satisfies the condition.
- Return type:
- shmem4py.shmem.test_some(ivars, cmp, value, status=None)
Indicate whether at least one variable on the local PE meets a condition.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with value.
value (int | float | complex | number) – The value to be compared with elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Returns:
Indices of entries of ivars that satisfy the condition.
- Return type:
- shmem4py.shmem.test_all_vector(ivars, cmp, values, status=None)
Indicate whether all variables on the local PE meet the specified conditions.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Return type:
- shmem4py.shmem.test_any_vector(ivars, cmp, values, status=None)
Indicate whether any one variable on the local PE meets its specified condition.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Returns:
The index of entry i of ivars that satisfies the condition.
- Return type:
- shmem4py.shmem.test_some_vector(ivars, cmp, values, status=None)
Indicate whether at least one variable on the local PE meets its specified condition.
- Parameters:
ivars (NDArray[Any]) – Symmetric array of objects to be tested.
cmp (CMP) – The comparison operator that compares elements of ivars with the elements of values.
values (Sequence[int | float | complex | number]) – Local array containing values to be compared with the respective elements of ivars.
status (Sequence[int] | None) – An optional mask array of length len(ivars) indicating which elements of ivars are excluded from the test set. Nonzero values exclude the corresponding element from the test set.
- Returns:
Indices of entries of ivars that satisfy the condition.
- Return type:
- shmem4py.shmem.signal_wait_until(signal, cmp, value)
Wait for a variable on the local PE to change from a signaling operation.
- Parameters:
- Returns:
The contents of the signal data object, signal, at the calling PE that satisfies the wait condition.
- Return type:
- class shmem4py.shmem.CMP(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Comparison operator.
Memory Ordering
|
Ensure ordering of delivery of operations on symmetric data objects. |
|
Wait for completion of outstanding operations on symmetric data objects issued by a PE. |
- shmem4py.shmem.fence(ctx=None)
Ensure ordering of delivery of operations on symmetric data objects.
All operations on symmetric data objects issued to a particular PE on the given context prior to the call to fence are guaranteed to be delivered before any subsequent operations on symmetric data objects to the same PE.
- shmem4py.shmem.quiet(ctx=None)
Wait for completion of outstanding operations on symmetric data objects issued by a PE.
Ensures completion of all operations on symmetric data objects issued by the calling PE on the given context.
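A common use of these two routines is the "payload, then flag" pattern: fence orders the flag after the payload for the same destination PE, while quiet additionally waits until both have completed. The sketch below is hedged in two ways. The `shmem` argument is expected to be the module obtained with `from shmem4py import shmem`, passed in explicitly so the helpers can be exercised without an OpenSHMEM runtime, and the call shape `put(target, source, pe)` is an assumption that is not documented in this section. `publish` and `publish_and_complete` are hypothetical helper names.

```python
def publish(shmem, data_target, data_source, flag_target, flag_source, pe):
    """Send a payload to `pe`, then a flag that is ordered after it."""
    shmem.put(data_target, data_source, pe)
    # Without the fence, the flag could be delivered before the payload.
    shmem.fence()
    shmem.put(flag_target, flag_source, pe)


def publish_and_complete(shmem, data_target, data_source,
                         flag_target, flag_source, pe):
    """Same as publish, then wait until both transfers have completed."""
    publish(shmem, data_target, data_source, flag_target, flag_source, pe)
    # quiet returns only once the operations issued above are complete.
    shmem.quiet()
```

On the receiving PE, a matching wait_until on the flag variable would then be the natural way to detect that the payload has arrived.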
Distributed Locking
|
Create a lock object. |
|
Delete a lock object. |
|
Acquire a mutual exclusion lock after waiting for the lock to be freed. |
|
Acquire a mutual exclusion lock only if it is currently cleared. |
|
- shmem4py.shmem.new_lock()
Create a lock object.
- Return type:
- shmem4py.shmem.del_lock(lock)
Delete a lock object.
- Parameters:
lock (LockHandle) – A lock object to be deleted.
- Return type:
- shmem4py.shmem.set_lock(lock)
Acquire a mutual exclusion lock after waiting for the lock to be freed.
- Parameters:
lock (LockHandle) – Symmetric scalar variable or an array of length 1.
- Return type:
- shmem4py.shmem.test_lock(lock)
Acquire a mutual exclusion lock only if it is currently cleared.
By using this routine, a PE can avoid blocking on a set lock.
- Parameters:
lock (LockHandle) – Symmetric scalar variable or an array of length 1.
- Returns:
Returns False if the lock was originally cleared and this call was able to acquire the lock. True is returned if the lock had been set and the call returned without waiting to set the lock.
- Return type:
- shmem4py.shmem.clear_lock(lock)
Release a lock previously set by set_lock or test_lock.
Releases a lock after performing a quiet operation on the default context to ensure that all symmetric memory accesses that occurred during the critical region are complete.
- Parameters:
lock (LockHandle) – Symmetric scalar variable or an array of length 1.
- Return type:
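Because a lock that is set but never cleared stalls every other PE waiting on it, it can help to pair set_lock and clear_lock in a context manager. The helper below is a sketch, not part of shmem4py: `critical_region` is a hypothetical name, and the `shmem` argument is expected to be the module from `from shmem4py import shmem`, passed in so the helper can be tested without an OpenSHMEM runtime. It uses only the set_lock and clear_lock calls documented above.

```python
from contextlib import contextmanager


@contextmanager
def critical_region(shmem, lock):
    """Hold `lock` for the duration of a `with` block."""
    shmem.set_lock(lock)  # blocks until the lock has been acquired
    try:
        yield lock
    finally:
        # Runs even if the body raises, so the lock is always released.
        shmem.clear_lock(lock)
```

A caller would typically create the lock once with new_lock, wrap each update of the shared symmetric data in `with critical_region(shmem, lock):`, and call del_lock when the lock is no longer needed.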
Distributed Locking (Object-Oriented)
|
Lock object. |
Destroy the lock object. |
|
|
Acquire the lock. |
Release the lock. |
Profiling Control
|
Set the profiling level. |
Typing Support
- shmem4py.shmem.SigAddr = shmem4py.shmem.SigAddr
Signal address.
- shmem4py.shmem.CtxHandle = shmem4py.shmem.CtxHandle
Context handle.
- shmem4py.shmem.TeamHandle = shmem4py.shmem.TeamHandle
Team handle.
- shmem4py.shmem.LockHandle = shmem4py.shmem.LockHandle
Lock handle.