wholecell.utils.parallelization
Parallelization utilities.
- class wholecell.utils.parallelization.ApplyResult(result)[source]
Bases:
object
A substitute for multiprocessing.ApplyResult() to return from apply_async(). It is created only after a successful function call, so ready() and successful() are always True.
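Such a stand-in can be sketched as follows (a minimal sketch of the described behavior, not the package's actual code; the get() signature mirrors multiprocessing's AsyncResult):

```python
class ApplyResult:
    """A pre-completed stand-in for an async result (sketch, not the real class)."""

    def __init__(self, result):
        self._result = result

    def ready(self):
        return True  # the function call already finished

    def successful(self):
        return True  # only constructed after a successful call

    def get(self, timeout=None):
        return self._result  # timeout is irrelevant; the result is already here
```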
- class wholecell.utils.parallelization.InlinePool[source]
Bases:
object
A substitute for multiprocessing.Pool() that runs the work inline in the current process. This is important because (1) a regular Pool worker cannot construct a nested Pool (even with processes=1) since daemon processes are not allowed to have children, and (2) it’s easier to debug code running in the main process.
- apply_async(func, args=(), kwds=None, callback=None)[source]
Apply the function to the args serially (not asynchronously, since only one process is available).
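The inline behavior can be sketched like this (a sketch based on the description above; the real method wraps the return value in an ApplyResult, shown here as a plain return for brevity):

```python
class InlinePool:
    """Runs submitted work serially in the current process (sketch)."""

    def apply_async(self, func, args=(), kwds=None, callback=None):
        # Run immediately instead of dispatching to a worker process.
        result = func(*args, **(kwds or {}))
        if callback is not None:
            callback(result)
        return result  # the real method returns an ApplyResult wrapper

    def close(self):
        pass  # nothing to shut down; present for Pool API compatibility

    def join(self):
        pass
```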
- class wholecell.utils.parallelization.NoDaemonContext[source]
Bases:
ForkContext
- Process
alias of
NoDaemonProcess
- class wholecell.utils.parallelization.NoDaemonPool(*args, **kwargs)[source]
Bases:
Pool
A substitute for multiprocessing.Pool() that creates a pool that is not a daemonic process. This allows for nesting pool calls that would otherwise be prevented with an assertion error (AssertionError: daemonic processes are not allowed to have children).
- class wholecell.utils.parallelization.NoDaemonProcess(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)[source]
Bases:
Process
- property daemon
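The daemon override that makes nesting possible can be sketched as follows (the standard non-daemon-process pattern, assuming Python's multiprocessing API; not necessarily the module's exact code):

```python
import multiprocessing


class NoDaemonProcess(multiprocessing.Process):
    """A Process that always reports daemon=False so it may have children (sketch)."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass  # ignore the pool's attempt to mark its workers as daemonic
```

A NoDaemonContext then aliases its Process attribute to this class, so a Pool built from that context creates workers that are allowed to spawn their own subprocesses.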
- wholecell.utils.parallelization.cpus(requested_num_processes=None)[source]
Return the usable number of worker processes for a multiprocessing Pool, up to requested_num_processes (default = max available), considering SLURM and any other environment-specific limitations. 1 means do all work in-process rather than forking subprocesses.
On SLURM: this reads the environment variable 'SLURM_CPUS_PER_TASK', which holds the number of CPUs requested per task. Since that is only set when the --cpus-per-task option was specified, it falls back to 'SLURM_JOB_CPUS_PER_NODE', the number of processors available to the job on this node.
- By default, srun sets:
  SLURM_CPUS_ON_NODE=1
  SLURM_JOB_CPUS_PER_NODE=1
- srun -p mcovert --cpus-per-task=2 sets:
  SLURM_CPUS_PER_TASK=2
  SLURM_CPUS_ON_NODE=2
  SLURM_JOB_CPUS_PER_NODE=2
- srun --ntasks-per-node=3 --cpus-per-task=4 sets:
  SLURM_CPUS_PER_TASK=4
  SLURM_CPUS_ON_NODE=12
  SLURM_JOB_CPUS_PER_NODE=12
- srun --ntasks-per-node=3 sets:
  SLURM_CPUS_ON_NODE=3
  SLURM_JOB_CPUS_PER_NODE=3
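The fallback described above can be sketched as follows (slurm_cpu_limit is a hypothetical helper name, not part of the module; it assumes a plain single-node value such as '12'):

```python
import os


def slurm_cpu_limit(environ=os.environ):
    """Return the SLURM CPU count, preferring SLURM_CPUS_PER_TASK (sketch)."""
    value = environ.get('SLURM_CPUS_PER_TASK') or environ.get('SLURM_JOB_CPUS_PER_NODE')
    return int(value) if value else None  # None when not running under SLURM
```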
- Parameters:
requested_num_processes (int | None) – the requested number of worker processes; None or 0 means return the max usable number
- Returns:
the usable number of worker processes for a Pool, as limited by the hardware, OS, SLURM, and requested_num_processes. ==> 1 means DO NOT CREATE WORKER SUBPROCESSES.
- Return type:
int
See also pool().
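Putting the limits together, the clamping can be sketched as follows (a sketch of the documented behavior, not the module's actual code):

```python
import os


def cpus(requested_num_processes=None):
    """Sketch: clamp to the hardware, any SLURM limit, and the requested count."""
    available = os.cpu_count() or 1
    slurm = os.environ.get('SLURM_CPUS_PER_TASK') or os.environ.get('SLURM_JOB_CPUS_PER_NODE')
    if slurm:
        available = min(available, int(slurm))
    if requested_num_processes:  # None or 0 means "use the max usable number"
        available = min(available, requested_num_processes)
    return max(available, 1)  # 1 means: do the work in-process
```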
- wholecell.utils.parallelization.is_macos()[source]
Return True if this is running on macOS.
- Return type:
bool
- wholecell.utils.parallelization.pool(num_processes=None, nestable=False)[source]
Return an InlinePool if cpus(num_processes) == 1, else a multiprocessing Pool(cpus(num_processes)), as suitable for the current runtime environment.
This uses the ‘spawn’ process start method to create a fresh python interpreter process, avoiding threading problems and cross-platform inconsistencies.
Setting nestable=True creates a pool of non-daemon worker processes that can spawn nested processes and have nested pools.
See cpus() on figuring the number of usable processes. See InlinePool about why running in-process is important.
- Parameters:
  - num_processes (int | None) – the requested number of worker processes; None or 0 means use the max usable number
  - nestable (bool) – if True, create a pool of non-daemon workers that supports nested pools
- Return type:
InlinePool | Pool
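The dispatch described above can be sketched as follows (make_pool is a hypothetical name; the 'inline' branch stands in for InlinePool, and the nestable variant is elided for brevity):

```python
import multiprocessing


def make_pool(num_cpus, nestable=False):
    """Sketch of pool()'s dispatch: inline for one CPU, else a spawn-context Pool."""
    if num_cpus == 1:
        return 'inline'  # stand-in for InlinePool(); no subprocesses created
    ctx = multiprocessing.get_context('spawn')  # fresh interpreters, fewer fork hazards
    # nestable would select a non-daemon pool here; a plain Pool is shown instead
    return ctx.Pool(num_cpus)
```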