`wholecell.utils.polymerize`

polymerize.py

Polymerizes sequences based on monomer and energy limitations.

To get a line profile, run kernprof -lv wholecell/tests/utils/profile_polymerize.py. That script @profile-decorates polymerize().

TODO: - document algorithm/corner cases (should already exist somewhere…)

wholecell.utils.polymerize.buildSequences(base_sequences, indexes, positions, elongation_rates)

wholecell.utils.polymerize.computeMassIncrease(sequences, elongations, monomerMasses)

class wholecell.utils.polymerize.polymerize(sequences, monomerLimits, reactionLimit, randomState, elongation_rates, variable_elongation=False)

Bases: object

Polymerize the given DNA/RNA/protein sequences as far as possible within the given limits.

Parameters:

sequences – ndarray of integer, shape (num_sequences, num_steps), the sequences of needed monomer types, containing PAD_VALUE for all steps after sequence completion.
monomerLimits – ndarray of integer, shape (num_monomers,), the available number of each monomer type.
reactionLimit – max number of reactions (monomers to use); the energy limit.
randomState – random number generator to pick winners in shortages.

Returns:

sequenceElongation – ndarray of integer, shape (num_sequences,), indicating how far the sequences proceeded.
monomerUsages – ndarray of integer, shape (num_monomers,), counting how many monomers of each type got used.
nReactions – total number of reactions (monomers used).
sequences_limited_elongation – ndarray of bool, shape (num_sequences,), mask indicating whether the sequences were actually elongated to the max lengths expected from the current step.
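To make the inputs and outputs concrete, here is a minimal sketch of limit-bounded elongation. It is not the algorithm used by this class: toy_polymerize is a hypothetical name, it walks the sequences one step at a time, it ignores elongation_rates and variable_elongation, and it does not compute sequences_limited_elongation. It only illustrates how monomer limits, the reaction limit, and random tie-breaking interact.

```python
import numpy as np

PAD_VALUE = -1  # same padding convention as the documented class attribute


def toy_polymerize(sequences, monomer_limits, reaction_limit, rng):
    """Hypothetical, simplified stand-in for polymerize(); illustration only."""
    sequences = np.asarray(sequences)
    remaining = np.array(monomer_limits, dtype=np.int64)  # copy to avoid side effects
    n_sequences, n_steps = sequences.shape
    elongation = np.zeros(n_sequences, dtype=np.int64)
    usages = np.zeros_like(remaining)
    n_reactions = 0
    active = np.ones(n_sequences, dtype=bool)

    for step in range(n_steps):
        needed = sequences[:, step]
        active &= needed != PAD_VALUE  # completed sequences drop out
        # Visit active sequences in random order so shortages pick random winners.
        for i in rng.permutation(np.flatnonzero(active)):
            monomer = needed[i]
            if n_reactions >= reaction_limit or remaining[monomer] == 0:
                active[i] = False  # stalled: out of energy or out of this monomer
                continue
            remaining[monomer] -= 1
            usages[monomer] += 1
            n_reactions += 1
            elongation[i] += 1
    return elongation, usages, n_reactions


example_sequences = np.array([[0, 1, 0], [1, 1, PAD_VALUE]])
elongation, usages, n_reactions = toy_polymerize(
    example_sequences, monomer_limits=[5, 5], reaction_limit=10,
    rng=np.random.RandomState(0))
```

With ample resources both sequences run to completion, so elongation equals the unpadded lengths. Lowering reaction_limit or monomer_limits makes some sequences stall early, and which ones stall depends on the random state.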

PAD_VALUE = -1
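As an illustration of the padding convention, the sketch below right-pads ragged lists of monomer indexes into the rectangular (num_sequences, num_steps) array described under Parameters. The helper name pad_sequences is hypothetical and is not part of this module.

```python
import numpy as np

PAD_VALUE = -1  # matches the documented class attribute


def pad_sequences(raw_sequences):
    """Hypothetical helper: right-pad lists of monomer indexes with PAD_VALUE."""
    n_steps = max(len(seq) for seq in raw_sequences)
    padded = np.full((len(raw_sequences), n_steps), PAD_VALUE, dtype=np.int64)
    for i, seq in enumerate(raw_sequences):
        padded[i, :len(seq)] = seq  # real monomer indexes occupy the leading steps
    return padded


padded_sequences = pad_sequences([[0, 1, 2, 3], [2, 2], [1]])
# Unpadded length of each sequence: count the entries that are not padding.
sequence_lengths = (padded_sequences != PAD_VALUE).sum(axis=1)
```

Because valid monomer indexes are non-negative, -1 can never be mistaken for a real monomer type.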

_clamp_elongation_to_sequence_length()

A post-iteration clean-up operation. Restricts the elongation of a sequence to at most its total (unpadded) length.

TODO: explain why we do this here instead of during each iteration
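The clamping step can be sketched as an element-wise minimum between the elongation reached during iteration and each sequence's unpadded length. This is an assumed reading of the description above, not the method's actual body; clamp_elongation and its argument names are hypothetical.

```python
import numpy as np

PAD_VALUE = -1  # matches the documented class attribute


def clamp_elongation(sequences, elongation):
    """Hypothetical sketch: cap each elongation at the sequence's unpadded length."""
    sequence_lengths = (np.asarray(sequences) != PAD_VALUE).sum(axis=1)
    return np.minimum(elongation, sequence_lengths)


demo_sequences = np.array([[0, 1, 2], [1, PAD_VALUE, PAD_VALUE]])
# Suppose iteration advanced every sequence by 3 steps, padding included.
clamped = clamp_elongation(demo_sequences, np.array([3, 3]))
```

Here the second sequence has only one real monomer, so its elongation is reduced from 3 to 1 while the first is left unchanged.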

_elongate(): Iteratively elongates sequences up to resource limits.

_elongate_to_limit(): Elongate as far as possible without hitting any resource limitations.

_finalize(): Clean up iteration results.

_finalize_resource_limited_elongations()

_gather_input_dimensions(): Collect information about the size of the inputs.

_gather_sequence_data(): Collect static data about the input sequences.

_prepare_outputs(): Running values that ultimately compose the output of the ‘polymerize’ operation.

_prepare_running_values(): Sets up the variables that will change throughout iteration, including both intermediate calculations and outputs.

_sanitize_inputs(): Enforce array typing, and copy input arrays to prevent side-effects.

_setup(): Extended initialization procedures.

_update_elongation_resource_demands(): After updating the active sequences (initialization and culling), recalculate resource demands for the remaining steps given what sequences remain.

wholecell.utils.polymerize.sum_monomers_reference_implementation(sequenceMonomers, activeSequencesIndexes)

Sum up the total number of monomers of each type needed to continue building the active sequences through currentStep. (This is the Python reference implementation, compiled by Cython.)

Arguments:

sequenceMonomers – bool[monomer #, sequence #] indicating whether a given monomer gets used in a step of a sequence.
activeSequencesIndexes – an array of sequences that are still active, i.e., have not yet run out of source monomers.

Result: count[monomer #] indicating how many of each monomer will be needed by the combined active sequences in the currentStep.
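For a single step, the count described above can be sketched with NumPy indexing. This assumes a 2-D boolean array laid out as [monomer #, sequence #], as the argument description states; sum_monomers_sketch is a hypothetical name, and the real function may handle steps differently.

```python
import numpy as np


def sum_monomers_sketch(sequence_monomers, active_sequence_indexes):
    """Hypothetical sketch: per-monomer demand summed over the active sequences."""
    # Select the columns of the active sequences, then count True values per monomer.
    return sequence_monomers[:, active_sequence_indexes].sum(axis=1)


# Four sequences whose current step needs monomer types 0, 2, 2 and 1.
needed_monomers = np.array([0, 2, 2, 1])
usage_mask = np.zeros((3, 4), dtype=bool)
usage_mask[needed_monomers, np.arange(4)] = True

# Sequence 0 is no longer active, so only sequences 1, 2 and 3 contribute.
demand = sum_monomers_sketch(usage_mask, np.array([1, 2, 3]))
```

The resulting demand vector is what would be compared against monomerLimits to detect a shortage.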
