wholecell.utils.parallelization
Parallelization utilities.
- class wholecell.utils.parallelization.ApplyResult(result)[source]
Bases:
object
A substitute for multiprocessing.ApplyResult() to return from apply_async(). It is created only after a successful function call, so ready() and successful() are always True.
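Such a stand-in can be sketched as follows (a minimal sketch of the described behavior, not the package's actual code; the get() signature mirrors multiprocessing's AsyncResult):

```python
class ApplyResult:
    """A pre-completed stand-in for an async result (sketch, not the real class)."""

    def __init__(self, result):
        self._result = result

    def ready(self):
        return True  # the function call already finished

    def successful(self):
        return True  # only constructed after a successful call

    def get(self, timeout=None):
        return self._result  # timeout is irrelevant; the result is already here
```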
- class wholecell.utils.parallelization.InlinePool[source]
Bases:
object
A substitute for multiprocessing.Pool() that runs the work inline in the current process. This is important because (1) a regular Pool worker cannot construct a nested Pool (even with processes=1) since daemon processes are not allowed to have children, and (2) it’s easier to debug code running in the main process.
- apply_async(func, args=(), kwds=None, callback=None)[source]
Apply the function to the args serially (not asynchronously, since only one process is available).
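The inline behavior can be sketched like this (a sketch based on the description above; the real method wraps the return value in an ApplyResult, shown here as a plain return for brevity):

```python
class InlinePool:
    """Runs submitted work serially in the current process (sketch)."""

    def apply_async(self, func, args=(), kwds=None, callback=None):
        # Run immediately instead of dispatching to a worker process.
        result = func(*args, **(kwds or {}))
        if callback is not None:
            callback(result)
        return result  # the real method returns an ApplyResult wrapper

    def close(self):
        pass  # nothing to shut down; present for Pool API compatibility

    def join(self):
        pass
```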
- class wholecell.utils.parallelization.NoDaemonContext[source]
Bases:
ForkContext
- Process
alias of
NoDaemonProcess
- class wholecell.utils.parallelization.NoDaemonPool(*args, **kwargs)[source]
Bases:
Pool
A substitute for multiprocessing.Pool() that creates a pool that is not a daemonic process. This allows for nesting pool calls that would otherwise be prevented with an assertion error (AssertionError: daemonic processes are not allowed to have children).
- class wholecell.utils.parallelization.NoDaemonProcess(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)[source]
Bases:
Process
- property daemon
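The daemon override that makes nesting possible can be sketched as follows (the standard non-daemon-process pattern, assuming Python's multiprocessing API; not necessarily the module's exact code):

```python
import multiprocessing


class NoDaemonProcess(multiprocessing.Process):
    """A Process that always reports daemon=False so it may have children (sketch)."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass  # ignore the pool's attempt to mark its workers as daemonic
```

A NoDaemonContext then aliases its Process attribute to this class, so a Pool built from that context creates workers that are allowed to spawn their own subprocesses.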
- wholecell.utils.parallelization.cpus(requested_num_processes=None)[source]
Return the usable number of worker processes for a multiprocessing Pool, up to requested_num_processes (default = max available), considering SLURM and any other environment-specific limitations. 1 means do all work in-process rather than forking subprocesses.
On SLURM: this reads the environment variable 'SLURM_CPUS_PER_TASK', which holds the number of CPUs requested per task. Since that is only set when the --cpus-per-task option was specified, it falls back to 'SLURM_JOB_CPUS_PER_NODE', the number of processors available to the job on this node.
- By default, srun sets:
  SLURM_CPUS_ON_NODE=1
  SLURM_JOB_CPUS_PER_NODE=1
- srun -p mcovert --cpus-per-task=2 sets:
  SLURM_CPUS_PER_TASK=2
  SLURM_CPUS_ON_NODE=2
  SLURM_JOB_CPUS_PER_NODE=2
- srun --ntasks-per-node=3 --cpus-per-task=4 sets:
  SLURM_CPUS_PER_TASK=4
  SLURM_CPUS_ON_NODE=12
  SLURM_JOB_CPUS_PER_NODE=12
- srun --ntasks-per-node=3 sets:
  SLURM_CPUS_ON_NODE=3
  SLURM_JOB_CPUS_PER_NODE=3
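The fallback described above can be sketched as follows (slurm_cpu_limit is a hypothetical helper name, not part of the module; it assumes a plain single-node value such as '12'):

```python
import os


def slurm_cpu_limit(environ=os.environ):
    """Return the SLURM CPU count, preferring SLURM_CPUS_PER_TASK (sketch)."""
    value = environ.get('SLURM_CPUS_PER_TASK') or environ.get('SLURM_JOB_CPUS_PER_NODE')
    return int(value) if value else None  # None when not running under SLURM
```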
- Parameters:
requested_num_processes (int | None) – the requested number of worker processes; None or 0 means return the max usable number
- Returns:
the usable number of worker processes for a Pool, as limited by the hardware, OS, SLURM, and requested_num_processes. ==> 1 means DO NOT CREATE WORKER SUBPROCESSES.
- Return type:
int
See also pool().
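Putting the limits together, the clamping can be sketched as follows (a sketch of the documented behavior, not the module's actual code):

```python
import os


def cpus(requested_num_processes=None):
    """Sketch: clamp to the hardware, any SLURM limit, and the requested count."""
    available = os.cpu_count() or 1
    slurm = os.environ.get('SLURM_CPUS_PER_TASK') or os.environ.get('SLURM_JOB_CPUS_PER_NODE')
    if slurm:
        available = min(available, int(slurm))
    if requested_num_processes:  # None or 0 means "use the max usable number"
        available = min(available, requested_num_processes)
    return max(available, 1)  # 1 means: do the work in-process
```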
- wholecell.utils.parallelization.is_macos()[source]
Return True if this is running on macOS.
- Return type:
bool
- wholecell.utils.parallelization.pool(num_processes=None, nestable=False)[source]
Return an InlinePool if cpus(num_processes) == 1, else a multiprocessing Pool(cpus(num_processes)), as suitable for the current runtime environment.
This uses the ‘spawn’ process start method to create a fresh python interpreter process, avoiding threading problems and cross-platform inconsistencies.
Setting nestable=True creates a pool of non-daemon worker processes that can spawn nested processes and have nested pools.
See cpus() on figuring the number of usable processes. See InlinePool about why running in-process is important.
- Parameters:
  - num_processes (int | None) – the requested number of worker processes; None or 0 means use the max usable number
  - nestable (bool) – if True, create a pool of non-daemon workers that supports nested pools
- Return type:
InlinePool | Pool
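The dispatch described above can be sketched as follows (make_pool is a hypothetical name; the 'inline' branch stands in for InlinePool, and the nestable variant is elided for brevity):

```python
import multiprocessing


def make_pool(num_cpus, nestable=False):
    """Sketch of pool()'s dispatch: inline for one CPU, else a spawn-context Pool."""
    if num_cpus == 1:
        return 'inline'  # stand-in for InlinePool(); no subprocesses created
    ctx = multiprocessing.get_context('spawn')  # fresh interpreters, fewer fork hazards
    # nestable would select a non-daemon pool here; a plain Pool is shown instead
    return ctx.Pool(num_cpus)
```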