cubie.memory.cupy_emm

CuPy async/sync memory pool External Memory Manager plugin for Numba.

This module provides Numba External Memory Manager (EMM) plugins that integrate CuPy’s memory pools for GPU memory allocation. It supports both synchronous and asynchronous memory pools and provides stream-ordered allocations when using the async allocator.
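As a rough illustration of how an EMM plugin from this module could be registered, the sketch below uses Numba's `cuda.set_memory_manager` call. The helper name `install_cupy_emm` is hypothetical, and cubie may already perform this registration internally, so treat this as a picture of the underlying mechanism rather than a required setup step.

```python
def install_cupy_emm(use_async=True):
    """Register a CuPy-backed EMM plugin with Numba.

    Hypothetical helper: returns True if a manager class was registered,
    or False if numba, cupy, or cubie could not be imported.
    """
    try:
        from numba import cuda
        from cubie.memory.cupy_emm import (
            CuPyAsyncNumbaManager,
            CuPySyncNumbaManager,
        )
    except ImportError:
        return False

    # Pick the stream-ordered async pool or the standard sync pool.
    manager_cls = CuPyAsyncNumbaManager if use_async else CuPySyncNumbaManager

    # Numba expects the plugin *class*, not an instance. This is best done
    # once, before any other CUDA functionality is used.
    cuda.set_memory_manager(manager_cls)
    return True
```

Calling `install_cupy_emm(use_async=False)` would select the synchronous pool instead.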

Classes

CuPyAsyncNumbaManager(context)

Numba EMM plugin using CuPy MemoryAsyncPool for allocation and freeing.

CuPyNumbaManager(context)

Base Numba EMM plugin that allocates device memory from CuPy memory pools.

CuPySyncNumbaManager(context)

Numba EMM plugin using CuPy MemoryPool for allocation and freeing.

current_cupy_stream(nb_stream)

Context manager to override CuPy's current stream with a Numba stream.

class cubie.memory.cupy_emm.CuPyAsyncNumbaManager(context)[source]

Bases: CuPyNumbaManager

Numba EMM plugin using CuPy MemoryAsyncPool for allocation and freeing.

Parameters:

context (numba.cuda.cudadrv.driver.Context) – CUDA context for memory management.

Notes

Uses CuPy’s asynchronous memory pool, which provides stream-ordered memory operations.

initialize()[source]

Initialize the async memory pool.

memalloc(nbytes)[source]

Allocate memory from the async CuPy pool.

Parameters:

nbytes (int) – Number of bytes to allocate.

Returns:

Numba memory pointer wrapping the CuPy allocation.

Return type:

MemoryPointer

class cubie.memory.cupy_emm.CuPyNumbaManager(context)[source]

Bases: FakeGetIpcHandleMixin, FakeHostOnlyCUDAManager

Base Numba EMM plugin that allocates device memory from CuPy memory pools.

Parameters:

context (numba.cuda.cudadrv.driver.Context) – CUDA context for memory management.

is_cupy

Flag indicating this is a CuPy-based memory manager.

Type:

bool

Notes

Drawn from the tutorial example at: https://github.com/numba/nvidia-cuda-tutorial/blob/main/session-5/examples/cupy_emm_plugin.py

Extended to pass Numba-generated streams to CuPy as external streams, so that allocations are stream-ordered when using the async allocator.

_make_finalizer(cp_mp, nbytes)[source]

Create a finalizer function for memory cleanup.

Parameters:
  • cp_mp (cupy memory pool allocation) – CuPy memory pool allocation to be cleaned up.

  • nbytes (int) – Number of bytes in the allocation.

Returns:

Finalizer function that removes the allocation reference.

Return type:

callable
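The finalizer pattern can be sketched in pure Python. The version below is loosely modelled on the tutorial plugin linked in the class notes and is not taken from cubie's source. `FakeAllocation` and `FinalizerDemo` are hypothetical stand-ins, and the assumption is that the manager keeps each allocation alive in a dictionary keyed by its device address.

```python
class FakeAllocation:
    """Stand-in for a CuPy pool allocation; only exposes a ``ptr`` address."""

    def __init__(self, ptr):
        self.ptr = ptr


class FinalizerDemo:
    """Keeps allocations referenced until their finalizer runs."""

    def __init__(self):
        # Maps device address -> allocation object. While an entry exists,
        # the allocation stays referenced and is not handed back to the pool.
        self._allocations = {}

    def track(self, cp_mp):
        """Record an allocation and return its finalizer."""
        self._allocations[cp_mp.ptr] = cp_mp
        return self._make_finalizer(cp_mp, nbytes=0)

    def _make_finalizer(self, cp_mp, nbytes):
        # ``nbytes`` mirrors the documented signature; this sketch ignores it.
        allocations = self._allocations
        ptr = cp_mp.ptr

        def finalizer():
            # Dropping the stored reference lets the pool reclaim the block.
            allocations.pop(ptr, None)

        return finalizer


demo = FinalizerDemo()
finalize = demo.track(FakeAllocation(ptr=0x1000))
tracked_before = len(demo._allocations)
finalize()
tracked_after = len(demo._allocations)
```

Capturing `ptr` and the dictionary in the closure, rather than the manager itself, keeps the finalizer from holding a reference to the manager.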

defer_cleanup()[source]

Context manager for deferring memory cleanup operations.

Yields:

None

Notes

This does not actually defer returning memory to the pool. Unlike a real cudaFree / cuMemFree call, however, returning memory to the pool does not interrupt asynchronous operations.

get_memory_info()[source]

Get memory information from the CuPy memory pool.

Returns:

Object containing free and total memory in bytes from the pool.

Return type:

MemoryInfo

Notes

Returns information from the CuPy memory pool, not the whole device.
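To make the pool-versus-device distinction concrete, here is a minimal sketch of how such a result might be assembled. It assumes the implementation reads pool statistics through methods like `free_bytes()` and `total_bytes()`, which exist on `cupy.cuda.MemoryPool`. `FakePool` and `pool_memory_info` are hypothetical names used only so the example runs without a GPU.

```python
from collections import namedtuple

# Same field layout as Numba's MemoryInfo named tuple: (free, total).
MemoryInfo = namedtuple("MemoryInfo", ["free", "total"])


class FakePool:
    """Stand-in exposing two statistics methods found on cupy.cuda.MemoryPool."""

    def __init__(self, total, used):
        self._total = total
        self._used = used

    def free_bytes(self):
        # Bytes the pool has acquired from the device but is not using.
        return self._total - self._used

    def total_bytes(self):
        # Bytes the pool has acquired from the device in total.
        return self._total


def pool_memory_info(pool):
    """Build a MemoryInfo from pool statistics (assumed approach)."""
    return MemoryInfo(free=pool.free_bytes(), total=pool.total_bytes())


info = pool_memory_info(FakePool(total=4096, used=1024))
```

Under this assumption, `free` counts bytes the pool holds but has not handed out, so it can be far smaller than the free memory on the device.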

initialize()[source]

Initialize the memory manager.

property interface_version

Get the EMM interface version.

Returns:

Interface version number.

Return type:

int

memalloc(nbytes)[source]

Allocate memory from the CuPy pool.

Parameters:

nbytes (int) – Number of bytes to allocate.

Returns:

Numba memory pointer wrapping the CuPy allocation.

Return type:

MemoryPointer

reset(stream=None)[source]

Free all blocks in the pool, optionally on a given stream for asynchronous operation.

Parameters:

stream (cupy.cuda.Stream or None, optional) – Stream for async operations. If None, operates synchronously.

Notes

This method is called without a stream argument when the context is reset. To run the operation on a specific stream, call it directly with cuda.current_context().memory_manager.reset(stream).

class cubie.memory.cupy_emm.CuPySyncNumbaManager(context)[source]

Bases: CuPyNumbaManager

Numba EMM plugin using CuPy MemoryPool for allocation and freeing.

Parameters:

context (numba.cuda.cudadrv.driver.Context) – CUDA context for memory management.

Notes

Uses CuPy’s synchronous memory pool, which provides standard memory operations.

initialize()[source]

Initialize the sync memory pool.

memalloc(nbytes)[source]

Allocate memory from the sync CuPy pool.

Parameters:

nbytes (int) – Number of bytes to allocate.

Returns:

Numba memory pointer wrapping the CuPy allocation.

Return type:

MemoryPointer

cubie.memory.cupy_emm._numba_stream_ptr(nb_stream)[source]

Extract CUstream pointer from a numba.cuda.cudadrv.driver.Stream.

Parameters:

nb_stream (numba.cuda.cudadrv.driver.Stream or None) – Numba stream object to extract pointer from.

Returns:

CUstream pointer as integer, or None if extraction fails.

Return type:

int or None

Notes

Tries common layouts across Numba versions to maintain compatibility.
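The "try several layouts" idea can be sketched as below. The attribute names probed here (`handle`, then `handle.value`, then `int(handle)`) are assumptions about plausible stream layouts, not a description of what cubie actually checks. `stream_ptr` and the stand-in stream classes are hypothetical.

```python
import ctypes


def stream_ptr(nb_stream):
    """Return the raw stream address as an int, or None if it cannot be found."""
    if nb_stream is None:
        return None
    handle = getattr(nb_stream, "handle", None)
    if handle is None:
        return None
    # Assumed layout 1: a ctypes-style handle exposing the address as ``.value``.
    value = getattr(handle, "value", None)
    if value is not None:
        return int(value)
    # Assumed layout 2: the handle is an int or supports int() conversion.
    try:
        return int(handle)
    except (TypeError, ValueError):
        return None


class CtypesStream:
    """Stand-in stream whose handle is a ctypes pointer."""
    handle = ctypes.c_void_p(0xBEEF)


class IntStream:
    """Stand-in stream whose handle is a plain integer."""
    handle = 0xCAFE


class OpaqueStream:
    """Stand-in stream whose handle cannot be converted."""
    handle = object()


ptr_ctypes = stream_ptr(CtypesStream())
ptr_int = stream_ptr(IntStream())
ptr_opaque = stream_ptr(OpaqueStream())
```

Returning None instead of raising lets a caller fall back to CuPy's current stream when the pointer cannot be recovered.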

class cubie.memory.cupy_emm.current_cupy_stream(nb_stream)[source]

Bases: object

Context manager to override CuPy’s current stream with a Numba stream.

Parameters:

nb_stream (numba.cuda.cudadrv.driver.Stream) – Numba stream to use as CuPy’s current stream.

nb_stream

The Numba stream being used.

Type:

numba.cuda.cudadrv.driver.Stream

cupy_ext_stream

CuPy external stream wrapper around the Numba stream.

Type:

cupy.cuda.ExternalStream or None

Notes

This context manager takes effect only when the current CUDA memory manager is a CuPy-based manager. Otherwise, it acts as a no-op.
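A possible usage pattern is sketched below, assuming the class is used in a `with` statement around Numba allocations. The helper `allocate_on_stream` is hypothetical, and it returns None when numba, cubie, or a CUDA device is unavailable, so the snippet can be imported on machines without a GPU.

```python
def allocate_on_stream(n=1024):
    """Allocate ``n`` float32 values on a new Numba stream; return the size.

    Hypothetical helper: returns None if the required packages or a CUDA
    device are not available.
    """
    try:
        import numpy as np
        from numba import cuda
        from cubie.memory.cupy_emm import current_cupy_stream
    except ImportError:
        return None
    if not cuda.is_available():
        return None

    stream = cuda.stream()
    # With a CuPy-based manager installed, allocations made inside this
    # block should be ordered on ``stream``; with any other manager the
    # documented behaviour is a no-op.
    with current_cupy_stream(stream):
        d_arr = cuda.device_array(n, dtype=np.float32, stream=stream)
    stream.synchronize()
    return d_arr.size
```

Because the context manager is documented as a no-op for other memory managers, the same calling code should work whether or not a CuPy-based manager is active.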