Distributed Backends

Distributed backends for running tasks with multiple processes.

class ionworkspipeline.distributed.DistributedBackend

Abstract base class for distributed execution backends.

This class defines the interface that all distributed backends must implement. Each backend provides a different strategy for executing tasks, from single-process sequential execution to multi-process parallel execution.

Extends: abc.ABC

abstractmethod static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]

Execute tasks using the backend’s execution strategy.

Parameters

task_func : Callable[[T], R]

Function to execute for each task

tasks : Iterator[T]

Iterator yielding tasks to be executed

num_workers : int, optional

Number of worker processes/threads to use (default: max available)

**kwargs

Additional backend-specific configuration options

Returns

Generator[R, None, None]

Generator yielding results as they complete

abstractmethod static shutdown(*args, **kwargs) → None

Shut down the backend.

class ionworkspipeline.distributed.SingleProcess

Single-process sequential execution backend.

Executes tasks one after another in the same process. This is the simplest backend and is useful for debugging or when parallel execution is not needed.

Extends: ionworkspipeline.distributed.DistributedBackend

static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]

Execute tasks sequentially in a single process.

Parameters

task_func : Callable[[T], R]

Function to execute for each task

tasks : Iterator[T]

Iterator yielding tasks to be executed

num_workers : None, optional

Ignored for single-process execution.

**kwargs

Additional configuration options (ignored)

Returns

Generator[R, None, None]

Generator yielding results in the order tasks were provided

static shutdown(*args, **kwargs) → None

Shut down the single-process backend. No specific shutdown is needed.

class ionworkspipeline.distributed.Joblib

Multi-process parallel execution backend using joblib.

Uses joblib’s Parallel and delayed to distribute tasks across multiple worker processes. This backend is well suited to CPU-intensive tasks that benefit from true parallelism.

Extends: ionworkspipeline.distributed.DistributedBackend

static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]

Execute tasks in parallel using joblib.

Parameters

task_func : Callable[[T], R]

Function to execute for each task

tasks : Iterator[T]

Iterator yielding tasks to be executed

num_workers : int, optional

Number of worker processes to use. If not given, defaults to -1, which uses all available cores.

**kwargs

Additional joblib configuration options (e.g., backend, timeout)

Returns

Generator[R, None, None]

Generator yielding results as they complete

static shutdown(*args, **kwargs) → None

Shut down the joblib backend. No specific shutdown is needed for joblib.
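Driving joblib directly looks like the following. This shows the Parallel/delayed pattern the backend is documented to use; the `threading` backend is chosen here only to keep the example lightweight, and the library's own defaults may differ.

```python
from joblib import Parallel, delayed


def square(x: int) -> int:
    return x * x


# Parallel(...) consumes a generator of delayed calls and returns the
# results in input order; n_jobs=-1 would instead use all available cores.
results = Parallel(n_jobs=2, backend="threading")(
    delayed(square)(x) for x in range(5)
)
print(results)  # [0, 1, 4, 9, 16]
```

Extra keyword arguments such as `backend` or `timeout` map onto joblib's own `Parallel` options, which is what the `**kwargs` pass-through above refers to.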

ionworkspipeline.distributed.get_backend(backend_name: str) → DistributedBackend

Factory function to get a distributed backend by name.

Parameters

backend_name : str

Name of the backend to create. Supported values:

- "single_process": SingleProcess backend
- "joblib": Joblib backend

Returns

DistributedBackend

Configured backend instance

Raises

ValueError

If backend_name is not recognized
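The factory's dispatch logic can be sketched as below. The class bodies are placeholders standing in for the real backend classes, and the lookup-table structure is an assumption; only the supported names and the ValueError behavior come from the documentation above.

```python
class SingleProcess:  # placeholder for the real backend class
    pass


class Joblib:  # placeholder for the real backend class
    pass


# Assumed dispatch table mapping documented names to backend classes
_BACKENDS = {
    "single_process": SingleProcess,
    "joblib": Joblib,
}


def get_backend(backend_name: str):
    """Illustrative sketch of the documented factory behavior."""
    try:
        return _BACKENDS[backend_name]()
    except KeyError:
        # Unrecognized names raise ValueError, per the docs
        raise ValueError(
            f"Unknown backend {backend_name!r}; "
            f"expected one of {sorted(_BACKENDS)}"
        ) from None


backend = get_backend("single_process")
print(type(backend).__name__)  # SingleProcess
```

Using the factory rather than instantiating a backend class directly keeps the backend choice configurable by a plain string, e.g. from a config file.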