Distributed Backends¶
Distributed backends for running tasks with multiple processes.
- class ionworkspipeline.distributed.DistributedBackend¶
Abstract base class for distributed execution backends.
This class defines the interface that all distributed backends must implement. Each backend provides a different strategy for executing tasks, from single-process sequential execution to multi-process parallel execution.
Extends:
abc.ABC

- abstractmethod static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]¶
Execute tasks using the backend’s execution strategy.
Parameters¶
- task_func : Callable[[T], R]
Function to execute for each task
- tasks : Iterator[T]
Iterator yielding tasks to be executed
- num_workers : int, optional
Number of worker processes/threads to use (default: max available)
- **kwargs
Additional backend-specific configuration options
Returns¶
- Generator[R, None, None]
Generator yielding results as they complete
- abstractmethod static shutdown(*args, **kwargs) → None¶
Shut down the backend.
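The contract above can be illustrated with a minimal sketch. This is a hypothetical re-creation of the documented interface, not the library source: a subclass provides `setup` as a generator over task results and a (possibly no-op) `shutdown`.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Callable, Generator, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class DistributedBackend(ABC):
    """Sketch of the documented interface, not the library source."""

    @staticmethod
    @abstractmethod
    def setup(
        task_func: Callable[[T], R],
        tasks: Iterator[T],
        num_workers: int | None = None,
        **kwargs,
    ) -> Generator[R, None, None]:
        """Execute tasks and yield results."""

    @staticmethod
    @abstractmethod
    def shutdown(*args, **kwargs) -> None:
        """Shut down the backend."""


class EchoBackend(DistributedBackend):
    """Trivial concrete backend: runs each task in the calling process."""

    @staticmethod
    def setup(task_func, tasks, num_workers=None, **kwargs):
        for task in tasks:
            yield task_func(task)

    @staticmethod
    def shutdown(*args, **kwargs) -> None:
        pass


results = list(EchoBackend.setup(lambda x: x * 2, iter([1, 2, 3])))
# results == [2, 4, 6]
```

Because `setup` is a generator, callers can consume results incrementally rather than waiting for every task to finish.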
- class ionworkspipeline.distributed.SingleProcess¶
Single-process sequential execution backend.
Executes tasks one after another in the same process. This is the simplest backend and is useful for debugging or when parallel execution is not needed.
Extends:
ionworkspipeline.distributed.DistributedBackend

- static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]¶
Execute tasks sequentially in a single process.
Parameters¶
- task_func : Callable[[T], R]
Function to execute for each task
- tasks : Iterator[T]
Iterator yielding tasks to be executed
- num_workers : None, optional
Ignored for single-process execution
- **kwargs
Additional configuration options (ignored)
Returns¶
- Generator[R, None, None]
Generator yielding results in the order tasks were provided
- static shutdown(*args, **kwargs) → None¶
Shut down the single-process backend. No specific shutdown is needed.
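The key property of sequential execution is that results come back lazily and in task order. A sketch of those documented semantics (not the library implementation) makes this concrete:

```python
def sequential_setup(task_func, tasks, num_workers=None, **kwargs):
    """Sketch of the documented single-process semantics:
    run each task in turn and yield its result lazily."""
    for task in tasks:
        yield task_func(task)


calls = []


def record(x):
    # Track which tasks have actually executed.
    calls.append(x)
    return x ** 2


gen = sequential_setup(record, iter([1, 2, 3]))
first = next(gen)   # only the first task has run so far
assert calls == [1] and first == 1
rest = list(gen)    # remaining tasks run on demand
assert rest == [4, 9] and calls == [1, 2, 3]
```

This laziness is what makes the single-process backend convenient for debugging: each task runs in the caller's process, so breakpoints and tracebacks behave normally.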
- class ionworkspipeline.distributed.Joblib¶
Multi-process parallel execution backend using joblib.
Uses joblib’s Parallel and delayed to distribute tasks across multiple worker processes. This backend is good for CPU-intensive tasks that can benefit from true parallelism.
Extends:
ionworkspipeline.distributed.DistributedBackend

- static setup(task_func: Callable[[T], R], tasks: Iterator[T], num_workers: int | None = None, **kwargs) → Generator[R, None, None]¶
Execute tasks in parallel using joblib.
Parameters¶
- task_func : Callable[[T], R]
Function to execute for each task
- tasks : Iterator[T]
Iterator yielding tasks to be executed
- num_workers : int, optional
Number of worker processes to use. Default is -1, which uses all available cores.
- **kwargs
Additional joblib configuration options (e.g., backend, timeout)
Returns¶
- Generator[R, None, None]
Generator yielding results as they complete
- static shutdown(*args, **kwargs) → None¶
Shut down the joblib backend. No specific shutdown is needed for joblib.
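A sketch of how such a backend might wrap joblib's `Parallel` and `delayed` (assuming joblib is installed; this is illustrative, not the library's implementation). The `backend="threading"` option passed through `**kwargs` is used here only to keep the example lightweight; for CPU-bound work the process-based default is what provides true parallelism:

```python
from joblib import Parallel, delayed


def joblib_setup(task_func, tasks, num_workers=-1, **kwargs):
    """Sketch of a joblib-backed setup: distribute tasks with
    Parallel and delayed, then yield the collected results."""
    results = Parallel(n_jobs=num_workers, **kwargs)(
        delayed(task_func)(task) for task in tasks
    )
    yield from results


squares = list(
    joblib_setup(lambda x: x * x, iter(range(5)),
                 num_workers=2, backend="threading")
)
assert squares == [0, 1, 4, 9, 16]
```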
- ionworkspipeline.distributed.get_backend(backend_name: str) → DistributedBackend¶
Factory function to get a distributed backend by name.
Parameters¶
- backend_name : str
Name of the backend to create. Supported values:
- "single_process": SingleProcess backend
- "joblib": Joblib backend
Returns¶
- DistributedBackend
Configured backend instance
Raises¶
- ValueError
If backend_name is not recognized
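The factory's documented contract (name in, backend instance out, ValueError otherwise) can be sketched as follows; the stub classes and registry dict are illustrative stand-ins, not the library's actual implementation:

```python
class SingleProcess:
    """Stand-in for the real SingleProcess backend."""


class Joblib:
    """Stand-in for the real Joblib backend."""


_BACKENDS = {"single_process": SingleProcess, "joblib": Joblib}


def get_backend(backend_name: str):
    """Return a backend instance for the given name, or raise ValueError."""
    try:
        return _BACKENDS[backend_name]()
    except KeyError:
        raise ValueError(
            f"Unknown backend {backend_name!r}; "
            f"expected one of {sorted(_BACKENDS)}"
        ) from None


backend = get_backend("single_process")
assert isinstance(backend, SingleProcess)
```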