wholecell.utils.polymerize
polymerize.py
Polymerizes sequences based on monomer and energy limitations.
To get a line profile, run kernprof -lv wholecell/tests/utils/profile_polymerize.py. That script @profile-decorates polymerize().
TODO: document algorithm/corner cases (should already exist somewhere…)
- wholecell.utils.polymerize.buildSequences(base_sequences, indexes, positions, elongation_rates)
- wholecell.utils.polymerize.computeMassIncrease(sequences, elongations, monomerMasses)
- class wholecell.utils.polymerize.polymerize(sequences, monomerLimits, reactionLimit, randomState, elongation_rates, variable_elongation=False)
Bases: object
Polymerize the given DNA/RNA/protein sequences as far as possible within the given limits.
- Parameters:
sequences – ndarray of integer, shape (num_sequences, num_steps), the sequences of needed monomer types, containing PAD_VALUE for all steps after sequence completion.
monomerLimits – ndarray of integer, shape (num_monomers,), the available number of each monomer type.
reactionLimit – max number of reactions (monomers to use); the energy limit.
randomState – random number generator to pick winners in shortages.
- Returns:
sequenceElongation – ndarray of integer, shape (num_sequences,), indicating how far the sequences proceeded.
monomerUsages – ndarray of integer, shape (num_monomers,), counting how many monomers of each type got used.
nReactions – total number of reactions (monomers used).
sequences_limited_elongation – ndarray of bool, shape (num_sequences,), mask indicating whether the sequences were actually elongated to the max lengths expected from the current step.
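The input layout described above can be illustrated with a minimal sketch. The helper `pad_sequences` is hypothetical and not part of this module; it only shows one way to build a `(num_sequences, num_steps)` array padded with PAD_VALUE, alongside inputs of the documented shapes. Because `elongation_rates` and `variable_elongation` are not documented on this page, the sketch stops short of constructing `polymerize` itself.

```python
import numpy as np

PAD_VALUE = -1  # same value as the class attribute polymerize.PAD_VALUE

def pad_sequences(raw_sequences, num_steps):
    """Hypothetical helper (not in wholecell): pack variable-length lists of
    monomer indexes into a (num_sequences, num_steps) integer array, filling
    every step after a sequence's end with PAD_VALUE."""
    padded = np.full((len(raw_sequences), num_steps), PAD_VALUE, dtype=np.int64)
    for i, seq in enumerate(raw_sequences):
        n = min(len(seq), num_steps)
        padded[i, :n] = seq[:n]
    return padded

# Three sequences over four monomer types (indexes 0..3), at most 4 steps.
sequences = pad_sequences([[0, 1, 2, 3], [2, 2], [1, 0, 3]], num_steps=4)

# Illustrative values for the other documented inputs.
monomerLimits = np.array([10, 10, 10, 10])   # shape (num_monomers,)
reactionLimit = 8                            # energy limit
randomState = np.random.RandomState(0)       # picks winners in shortages
```

Rows shorter than `num_steps` end in PAD_VALUE, which is how the documented `sequences` parameter marks steps after sequence completion.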
- PAD_VALUE = -1
- _clamp_elongation_to_sequence_length()
A post-iteration clean-up operation. Restricts the elongation of a sequence to at most its total (unpadded) length.
TODO: explain why we do this here instead of during each iteration
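As a rough illustration of what such a clamp might look like, the sketch below caps each sequence's elongation at its unpadded length. It is a standalone function written for this page, assuming the unpadded length is the count of non-PAD_VALUE entries in a row; the actual method operates on instance state and may differ.

```python
import numpy as np

PAD_VALUE = -1

def clamp_elongation(sequences, elongation):
    """Illustrative stand-in for the clean-up step: an elongation can never
    exceed the number of real (non-padding) monomers in its sequence."""
    # Unpadded length of each row = number of entries that are not PAD_VALUE.
    sequence_lengths = (sequences != PAD_VALUE).sum(axis=1)
    return np.minimum(elongation, sequence_lengths)

sequences = np.array([
    [0, 1, 2, 3],
    [2, 2, PAD_VALUE, PAD_VALUE],
    [1, PAD_VALUE, PAD_VALUE, PAD_VALUE],
])
elongation = np.array([3, 4, 4])  # tentative progress after iterating
clamped = clamp_elongation(sequences, elongation)
```

Here the first sequence keeps its elongation of 3, while the second and third are cut back to their true lengths of 2 and 1.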
- _prepare_outputs()
Sets up the running values that ultimately compose the output of the ‘polymerize’ operation.
- wholecell.utils.polymerize.sum_monomers_reference_implementation(sequenceMonomers, activeSequencesIndexes)
Sum up the total number of monomers of each type needed to continue building the active sequences through currentStep. (This is the Python reference implementation, compiled by Cython.)
Arguments:
sequenceMonomers – bool[monomer #, sequence #] indicating whether a given monomer gets used in a step of a sequence
activeSequencesIndexes – an array of indexes of the sequences that are still active, i.e. have not yet run out of source monomers
Result: count[monomer #] indicating how many of each monomer will be needed by the combined active sequences in the currentStep
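The counting described above can be sketched in NumPy as follows. This is an assumption about the behavior based only on the argument and result descriptions, not a copy of the module's code; the function name `sum_monomers_sketch` is made up for the example.

```python
import numpy as np

def sum_monomers_sketch(sequenceMonomers, activeSequencesIndexes):
    """Count, per monomer type, how many active sequences need that monomer.

    sequenceMonomers: bool[monomer #, sequence #]
    activeSequencesIndexes: integer indexes of the still-active sequences
    """
    # Keep only the active sequences' columns, then sum across sequences.
    return sequenceMonomers[:, activeSequencesIndexes].sum(axis=1)

# Two monomer types, three sequences, for a single step.
sequenceMonomers = np.array([
    [True,  False, True],   # monomer 0 is needed by sequences 0 and 2
    [False, True,  False],  # monomer 1 is needed by sequence 1
])
active = np.array([0, 2])   # sequence 1 has already dropped out
counts = sum_monomers_sketch(sequenceMonomers, active)
```

With sequence 1 inactive, monomer 0 is requested twice and monomer 1 not at all.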