`ecoli.library.schema`

Simulation Helper Functions

This is a collection of helper functions used throughout our code base.

ecoli.library.schema.UNIQUE_DIVIDERS

A mapping of unique molecules to the names of their divider functions as they are registered in the divider_registry in ecoli/__init__.py.

class ecoli.library.schema.UniqueNumpyUpdater[source]

Bases: object

Updates that set attributes of currently active unique molecules must be applied before any updates that add or delete molecules. If this is not enforced, in a single timestep, an update might delete a molecule and allow a subsequent update to add a new molecule in the same row. Then, an update that intends to modify an attribute of the original molecule in that row will actually corrupt the data for the new molecule.

To fix this, the unique molecule updater is a bound method with access to instance attributes that allow it to accumulate updates until it is given the signal to apply them in the proper order. That signal comes from a special process (ecoli.processes.unique_update.UniqueUpdate) that is automatically added to the simulation by ecoli.composites.ecoli_master.Ecoli.generate_processes_and_steps().

Sets up instance attributes to accumulate updates.

add_updates: List of updates that add unique molecules

set_updates: List of updates that modify existing unique molecules

delete_updates: List of updates that delete unique molecules

updater(current, update)[source]

Accumulates updates in instance attributes until given signal to apply all updates in the following order: set, add, delete

Parameters:

current (ndarray) – Structured Numpy array for a given unique molecule
update (Dict[str, Any]) –
Dictionary of updates to apply that can contain any combination of the following keys:
- set: List of dictionaries
  Each key is an attribute of the given unique molecule and each value is an array. Each array contains the new attribute values for all active unique molecules in a given timestep.
- add: List of dictionaries
  Each key is an attribute of the given unique molecule and each value is an array. The nth element of each array is the value for the corresponding attribute for the nth unique molecule to be added.
- delete: List-like
  List of active molecule indices to delete. Note that current may have rows that are marked as inactive, so deleting the 10th active molecule may not equate to deleting the value in the 10th row of current.
- update: Boolean
  Special key that should only be included in the update of ecoli.processes.unique_update.UniqueUpdate. Tells updater to apply all cached updates (e.g. at the end of an “execution layer”; see Partitioning).

Returns:

Updated unique molecule structured Numpy array.

Return type:

ndarray
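
The accumulate-then-apply behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: ToyUniqueUpdater is a hypothetical class, the structured array has only an _entryState flag and a made-up length attribute, each update carries a single dictionary per key rather than a list of dictionaries, and the add step assumes enough inactive rows already exist.

```python
import numpy as np


class ToyUniqueUpdater:
    """Hypothetical, simplified stand-in for UniqueNumpyUpdater."""

    def __init__(self):
        self.set_updates = []
        self.add_updates = []
        self.delete_updates = []

    def updater(self, current, update):
        # Cache each kind of update instead of applying it immediately.
        if "set" in update:
            self.set_updates.append(update["set"])
        if "add" in update:
            self.add_updates.append(update["add"])
        if "delete" in update:
            self.delete_updates.append(update["delete"])
        if not update.get("update", False):
            return current

        result = current.copy()
        # Rows that were active before any cached update is applied.
        active = np.nonzero(result["_entryState"])[0]
        # 1. set: overwrite attributes of the active molecules.
        for upd in self.set_updates:
            for attr, values in upd.items():
                result[attr][active] = values
        # 2. add: fill inactive rows (assumes enough of them exist).
        for upd in self.add_updates:
            n_new = len(next(iter(upd.values())))
            free = np.nonzero(result["_entryState"] == 0)[0][:n_new]
            for attr, values in upd.items():
                result[attr][free] = values
            result["_entryState"][free] = 1
        # 3. delete: indices count active molecules, not raw rows.
        for idx in self.delete_updates:
            result["_entryState"][active[np.asarray(idx, dtype=int)]] = 0
        self.set_updates, self.add_updates, self.delete_updates = [], [], []
        return result


current = np.zeros(4, dtype=[("_entryState", "i1"), ("length", "i8")])
current["_entryState"][:2] = 1
current["length"][:2] = [10, 20]

toy = ToyUniqueUpdater()
# Updates arrive in the "wrong" order but are only cached here.
state = toy.updater(current, {"delete": [0]})
state = toy.updater(state, {"add": {"length": np.array([99])}})
state = toy.updater(state, {"set": {"length": np.array([11, 21])}})
# The special "update" key triggers application as set, add, delete.
state = toy.updater(state, {"update": True})
```

Because the delete is applied last, the new molecule lands in a row that was already free, and the set update only touches the two molecules that were active at the start of the timestep.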

ecoli.library.schema.array_from(d)[source]

Makes a Numpy array from dictionary values.

Parameters: d (dict) – Dictionary whose values are to be converted
Returns: Array of all values in d.
Return type: ndarray
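
As a rough illustration of the conversion, the following sketch builds an array from dictionary values. The name array_from_sketch is hypothetical, and the actual function may differ in details such as dtype handling.

```python
import numpy as np


def array_from_sketch(d):
    # Dictionaries preserve insertion order, so the array follows it too.
    return np.array(list(d.values()))


values = array_from_sketch({"ATP": 3, "GTP": 5, "CTP": 8})
```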

ecoli.library.schema.attrs(states, attributes)[source]

Helper function to pull out arrays for unique molecule attributes

Parameters:

states (ndarray) – Structured Numpy array for all unique molecules of a given type (e.g. RNA, active RNAP, etc.)
attributes (List[str]) – List of field names (attributes) whose data should be retrieved for all active unique molecules in states

Returns:

List of arrays, one for each attribute. nth entry in each array corresponds to the value of that attribute for the nth active unique molecule in states

Return type:

List[ndarray]
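
A minimal sketch of this kind of lookup is shown below. It assumes that a row is active when its _entryState field is nonzero (consistent with the description of inactive rows under get_free_indices); attrs_sketch and the length and is_full fields are hypothetical names chosen for illustration.

```python
import numpy as np


def attrs_sketch(states, attributes):
    # Keep only rows whose _entryState marks them as active.
    active = states["_entryState"] != 0
    return [states[attribute][active] for attribute in attributes]


rna = np.array(
    [(1, 5, True), (0, 0, False), (1, 7, False)],
    dtype=[("_entryState", "i1"), ("length", "i8"), ("is_full", "?")],
)
lengths, is_full = attrs_sketch(rna, ["length", "is_full"])
```

Unpacking the returned list directly into one variable per attribute keeps call sites short when several attributes are needed at once.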

ecoli.library.schema.bulk_name_to_idx(names, bulk_names)[source]

Primarily used to retrieve indices for groups of bulk molecules (e.g. NTPs) in the first run of a process and cache for future runs

Parameters:

names (str | List | ndarray) – List or array of things to find. Can also be single string.
bulk_names (List | ndarray) – List or array of things to search

Returns:

Index or indices such that bulk_names[indices] == names

Return type:

int | ndarray
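
One way to perform such a lookup is a sorted search, sketched below. This is an illustration rather than the library's code: name_to_idx_sketch is a hypothetical name, the molecule names are made up, and the sketch assumes every requested name is present in bulk_names.

```python
import numpy as np


def name_to_idx_sketch(names, bulk_names):
    bulk_names = np.asarray(bulk_names)
    # Sort once, then binary-search each requested name.
    sorter = np.argsort(bulk_names)
    positions = np.searchsorted(bulk_names, names, sorter=sorter)
    # Map positions in the sorted order back to original indices.
    return sorter[positions]


bulk_names = ["WATER", "ATP", "GTP", "CTP", "UTP"]
ntp_idx = name_to_idx_sketch(["ATP", "GTP", "CTP", "UTP"], bulk_names)
water_idx = name_to_idx_sketch("WATER", bulk_names)
```

A process can compute ntp_idx once on its first run and reuse it afterwards, which avoids repeating string comparisons every timestep.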

ecoli.library.schema.bulk_numpy_updater(current, update)[source]

Updater function for bulk molecule structured array.

Parameters:

current (ndarray) – Bulk molecule structured array
update (List[Tuple[int | ndarray, int | ndarray]]) – List of tuples (mol_idx, add_val), where mol_idx is the index (or array of indices) for the molecule(s) to be updated and add_val is the count (or array of counts) to be added to the current count(s) for the specified molecule(s).

Returns:

Updated bulk molecule structured array

Return type:

ndarray
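
The update format can be illustrated with a small sketch. It assumes the bulk array stores counts in a field named count (as described for counts() and get_bulk_counts); bulk_updater_sketch and the id field are hypothetical, and this version always copies its input, which the real updater may not do.

```python
import numpy as np


def bulk_updater_sketch(current, update):
    result = current.copy()
    for mol_idx, add_val in update:
        # mol_idx and add_val may be scalars or matching arrays.
        result["count"][mol_idx] += add_val
    return result


bulk = np.array(
    [("ATP", 100), ("GTP", 50), ("WATER", 1000)],
    dtype=[("id", "U10"), ("count", "i8")],
)
update = [(0, -10), (np.array([1, 2]), np.array([5, -1]))]
updated = bulk_updater_sketch(bulk, update)
```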

ecoli.library.schema.counts(states, idx)[source]

Helper function to pull out counts at given indices.

Parameters:

states (ndarray) – Either a Numpy structured array with a ‘count’ field or a 1D Numpy array of counts.
idx (int | ndarray) – Indices for the counts of interest.

Returns:

Counts of molecules at specified indices (copy so can be safely mutated)

Return type:

ndarray
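
The two accepted input forms can be sketched as follows. counts_sketch is a hypothetical name and the id field is made up; the point is only that the result is a copy, so changing it leaves the original state untouched.

```python
import numpy as np


def counts_sketch(states, idx):
    # Structured arrays expose counts through their 'count' field.
    if states.dtype.names is not None and "count" in states.dtype.names:
        return states["count"][idx].copy()
    # Otherwise treat the input as a plain 1D array of counts.
    return states[idx].copy()


bulk = np.array(
    [("ATP", 100), ("GTP", 50), ("WATER", 1000)],
    dtype=[("id", "U10"), ("count", "i8")],
)
picked = counts_sketch(bulk, np.array([0, 2]))
picked[0] = 0  # mutating the copy does not affect bulk
plain = counts_sketch(np.array([4, 5, 6]), np.array([1, 2]))
```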

ecoli.library.schema.create_unique_indexes(n_indexes, random_state)[source]

Creates a list of unique indexes by making them random.

Parameters:

n_indexes (int) – Number of indexes to generate.
random_state (RandomState) – PRNG.

Returns:

List of indexes. Each index is an integer in the range \([0, 2^{63})\).

Return type:

List[int]
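
A minimal sketch of random index generation is given below, assuming a NumPy RandomState as the PRNG. unique_indexes_sketch is a hypothetical name. Uniqueness here is probabilistic: with a range of 2^63 values, collisions are extremely unlikely but not impossible.

```python
import numpy as np


def unique_indexes_sketch(n_indexes, random_state):
    # Draw from [0, 2**63); the upper bound is exclusive.
    draws = random_state.randint(0, 2**63, n_indexes, dtype=np.int64)
    return [int(i) for i in draws]


indexes = unique_indexes_sketch(5, np.random.RandomState(0))
repeat = unique_indexes_sketch(5, np.random.RandomState(0))
```

Seeding the PRNG makes the generated indexes reproducible, which matters when simulations need to be rerun exactly.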

ecoli.library.schema.divide_RNAs_by_domain(values, state)[source]

Divider function for RNA unique molecules. Ensures that incomplete transcripts are divided in accordance with how active RNAPs are divided (which themselves depend on how chromosome domains are divided).

Parameters:

values (ndarray) – Structured Numpy array of RNA unique molecule state
state (Dict[str, Any]) – View of relevant unique molecule states according to the topology under the RNAs key in ecoli.library.schema.UNIQUE_DIVIDERS.

Returns:

List of two structured Numpy arrays, each containing the RNA unique molecule state of a daughter cell.

Return type:

tuple[ndarray, ndarray]

ecoli.library.schema.divide_binomial(state)[source]

Binomial Divider

Parameters:

state (float) – The value to divide.
config – Must contain a seed key with an integer seed. This seed will be added to int(state) to seed a random number generator used to calculate the binomial.

Returns:

The divided values.

Return type:

List[float]
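
The seeding scheme described for config can be sketched as below. Several details are assumptions: divide_binomial_sketch is a hypothetical name, config is passed as a second argument, and the split probability of 0.5 is borrowed from the description of divide_bulk rather than stated for this function.

```python
import numpy as np


def divide_binomial_sketch(state, config):
    # Seed depends on both the configured seed and the value itself.
    rng = np.random.RandomState(config["seed"] + int(state))
    first = int(rng.binomial(int(state), 0.5))
    # The second daughter receives whatever the first did not.
    return [first, int(state) - first]


daughters = divide_binomial_sketch(100, {"seed": 1})
again = divide_binomial_sketch(100, {"seed": 1})
```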

ecoli.library.schema.divide_bulk(state)[source]

Divider function for bulk molecules. Automatically added to bulk molecule ports schemas by ecoli.library.schema.numpy_schema() when name == 'bulk'. Uses binomial distribution with p=0.5 to randomly partition counts.

Parameters: state (ndarray) – Structured Numpy array of bulk molecule data.
Returns: List of two structured Numpy arrays, each representing the bulk molecule state of a daughter cell.
Return type: tuple[ndarray, ndarray]
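
The binomial partitioning with p=0.5 can be illustrated as follows. divide_bulk_sketch, its seed argument, and the id field are hypothetical; how the real divider obtains its random state is not described here.

```python
import numpy as np


def divide_bulk_sketch(state, seed=0):
    rng = np.random.RandomState(seed)
    daughter_1 = state.copy()
    daughter_2 = state.copy()
    # Each molecule independently goes to daughter 1 with p=0.5.
    daughter_1["count"] = rng.binomial(state["count"], 0.5)
    daughter_2["count"] = state["count"] - daughter_1["count"]
    return daughter_1, daughter_2


bulk = np.array(
    [("ATP", 100), ("GTP", 50), ("WATER", 1000)],
    dtype=[("id", "U10"), ("count", "i8")],
)
d1, d2 = divide_bulk_sketch(bulk)
```

Computing the second daughter by subtraction guarantees that molecule counts are conserved across division.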

ecoli.library.schema.divide_by_domain(values, state)[source]

Divider function for unique molecules that are attached to the chromosome. Ensures that these molecules are divided in accordance with the way that chromosome domains are divided.

Parameters:

values (ndarray) – Structured Numpy array of unique molecule state
state (Dict[str, Any]) – View of full_chromosome and chromosome_domain state as configured under any of the unique molecules with a divider of by_domain in ecoli.library.schema.UNIQUE_DIVIDERS.

Returns:

List of two structured Numpy arrays, each containing the unique molecule state of a daughter cell.

Return type:

tuple[ndarray, ndarray]

ecoli.library.schema.divide_domains(state)[source]

Divider function for chromosome domains. Ensures that all chromosome domains associated with a full chromosome go to the same daughter cell that the full chromosome does.

Parameters: state (ndarray) – Structured Numpy array of chromosome domain unique molecule state.
Returns: List of two structured Numpy arrays, each containing the chromosome domain unique molecule state for a daughter cell.
Return type: tuple[ndarray, ndarray]

ecoli.library.schema.divide_ribosomes_by_RNA(values, state)[source]

Divider function for active ribosome unique molecules. Automatically added to ports schema by ecoli.library.schema.numpy_schema() when name == 'active_ribosome'. Ensures that ribosomes are divided the same way that their associated mRNAs are.

Parameters:

values (ndarray) – Structured Numpy array of active ribosome unique molecule state
state (Dict[str, Any]) – View into relevant unique molecule states according to the topology defined under the active_ribosome key in ecoli.library.schema.UNIQUE_DIVIDERS.

Returns:

List of two structured Numpy arrays, each containing the active ribosome unique molecule state of a daughter cell

Return type:

tuple[ndarray, ndarray]

ecoli.library.schema.divide_set_none(values)[source]: Divider function that sets both daughter cell states to None.

ecoli.library.schema.empty_dict_divider(values)[source]: Divider function that sets both daughter cell states to empty dicts.

ecoli.library.schema.flatten(l)[source]

Flattens a nested list into a single list.

Parameters: l (List[List[Any]]) – Nested list to flatten.
Return type: List[Any]
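
Given the List[List[Any]] parameter type, a one-level flatten can be sketched as below. flatten_sketch is a hypothetical name; deeper nesting is left untouched in this version.

```python
def flatten_sketch(l):
    # Concatenate the inner lists, preserving their order.
    return [item for sublist in l for item in sublist]


flat = flatten_sketch([[1, 2], [3], [], [4, 5]])
```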

ecoli.library.schema.follow_domain_tree(domain, domain_index, child_domains, place_holder)[source]

Recursive function that returns all the descendants of a single node in the domain tree, including itself.

Parameters:

domain (int) – Domain index to find all descendants for
domain_index (ndarray) – Array of all domain indices
child_domains (ndarray) – Array of child domains for each index in domain_index
place_holder (int) – Placeholder domain index (e.g. used in child_domains for domain indices that do not have child domains)

Return type:

List[int]
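
The recursion can be sketched on a small tree. The layout used here is an assumption: child_domains is taken to be a two-column array with one row per entry of domain_index, and -1 serves as the placeholder. follow_domain_tree_sketch is a hypothetical name.

```python
import numpy as np


def follow_domain_tree_sketch(domain, domain_index, child_domains, place_holder):
    # Locate the row that describes this domain.
    row = np.where(domain_index == domain)[0][0]
    branches = []
    for child in child_domains[row]:
        if child != place_holder:
            # Collect the descendants of each real child first.
            branches.extend(
                follow_domain_tree_sketch(
                    int(child), domain_index, child_domains, place_holder
                )
            )
    # The node itself is included in the result.
    branches.append(domain)
    return branches


# Domain 0 has children 1 and 2; domain 1 has children 3 and 4.
domain_index = np.array([0, 1, 2, 3, 4])
child_domains = np.array([[1, 2], [3, 4], [-1, -1], [-1, -1], [-1, -1]])
whole_tree = follow_domain_tree_sketch(0, domain_index, child_domains, -1)
subtree = follow_domain_tree_sketch(1, domain_index, child_domains, -1)
leaf = follow_domain_tree_sketch(2, domain_index, child_domains, -1)
```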

class ecoli.library.schema.get_bulk_counts[source]

Bases: Serializer

Serializer for bulk molecules that saves counts without IDs or masses.

serialize()[source]

Parameters: bulk (ndarray) – Numpy structured array with a count field
Returns: Contiguous (required by orjson) array of bulk molecule counts
Return type: ndarray
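
The contiguity requirement arises because selecting one field of a structured array yields a strided view. The sketch below shows this with plain NumPy; the id field is made up, and whether the serializer uses np.ascontiguousarray specifically is an assumption.

```python
import numpy as np

bulk = np.array(
    [("ATP", 100), ("GTP", 50), ("WATER", 1000)],
    dtype=[("id", "U10"), ("count", "i8")],
)
# The field view skips over the 'id' bytes of every row.
field_view = bulk["count"]
# Copy the counts into their own contiguous buffer.
contiguous = np.ascontiguousarray(field_view)
```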

ecoli.library.schema.get_descendent_domains(root_domains, domain_index, child_domains, place_holder)[source]

Returns an array of domain indexes that are descendants of the indexes listed in root_domains, including the indexes in root_domains themselves.

Parameters:

root_domains – List of domains to get descendants of
domain_index – Array of all domain indices for chromosome domains
child_domains – Array of child domains for each index in domain_index
place_holder – Placeholder domain index (e.g. used in child_domains for domain indices that do not have any child domains)

ecoli.library.schema.get_free_indices(result, n_objects)[source]

Find inactive rows for new molecules and expand array if needed

Parameters:

result (ndarray) – Structured Numpy array for all unique molecules of a given type (e.g. RNA, active RNAP, etc.)
n_objects (int) – Number of new unique molecules to be added

Returns:

A tuple (result, free_idx). result is the same as the input argument unless n_objects is greater than the number of inactive rows in result. In this case, result is grown by at least 10% by concatenating new rows (all zeros). free_idx is an array of size n_objects that contains the indices of rows in result that are inactive (_entryState field is 0).

Return type:

Tuple[ndarray, ndarray]
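
The search-and-grow behavior described above can be sketched as follows. get_free_indices_sketch and the length field are hypothetical names, and the exact growth rule (10% of the current size or the shortfall, whichever is larger) is one reading of "grown by at least 10%".

```python
import numpy as np


def get_free_indices_sketch(result, n_objects):
    free_idx = np.where(result["_entryState"] == 0)[0]
    n_free = free_idx.size
    if n_free < n_objects:
        old_size = result.size
        # Grow by 10% or by the shortfall, whichever is larger.
        n_new = max(int(old_size * 0.1), n_objects - n_free)
        result = np.concatenate([result, np.zeros(n_new, dtype=result.dtype)])
        free_idx = np.concatenate([free_idx, old_size + np.arange(n_new)])
    return result, free_idx[:n_objects]


molecules = np.zeros(10, dtype=[("_entryState", "i1"), ("length", "i8")])
molecules["_entryState"][:9] = 1  # only row 9 is inactive

same, one_free = get_free_indices_sketch(molecules, 1)
grown, three_free = get_free_indices_sketch(molecules, 3)
```

Growing in chunks rather than one row at a time keeps the number of array reallocations low when many molecules are added over a simulation.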

class ecoli.library.schema.get_unique_fields[source]

Bases: Serializer

Serializer for unique molecules.

serialize()[source]

Parameters: unique (ndarray) – Numpy structured array of attributes for one unique molecule
Returns: List of contiguous (required by orjson) arrays, one for each attribute
Return type: ndarray

ecoli.library.schema.listener_schema(elements)[source]

Helper function that can be used in ports_schema to create generic schema for a collection of listeners.

Parameters: elements (Dict[str, Any]) – Dictionary where keys are listener names and values are the defaults for each listener. Alternatively, if the value is a tuple, assume that the first element is the default and the second is metadata that will be emitted at the beginning of a simulation when emitter is set to database and emit_config is set to True (see ecoli.experiments.ecoli_master_sim). This metadata can then be retrieved later to aid in interpreting listener values (see vivarium.core.emitter.data_from_database() for sample code to query experiment configuration collection). As an example, this metadata might be an array of molecule names for a listener whose emits are arrays of counts, where the nth molecule name in the metadata corresponds to the nth value in the counts that are emitted.
Returns: Ports schemas for all listeners in elements.
Return type: Dict[str, Dict[str, Any]]
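
The handling of plain defaults versus (default, metadata) tuples can be sketched as below. The specific schema keys used here (_default, _updater, _emit, and a metadata entry under _properties) are assumptions based on common Vivarium schema conventions, and listener_schema_sketch and the listener names are hypothetical.

```python
def listener_schema_sketch(elements):
    # Assumed base settings shared by every listener.
    basic = {"_updater": "set", "_emit": True}
    schema = {}
    for name, default in elements.items():
        if isinstance(default, tuple):
            # First element is the default, second is the metadata.
            schema[name] = {
                **basic,
                "_default": default[0],
                "_properties": {"metadata": default[1]},
            }
        else:
            schema[name] = {**basic, "_default": default}
    return schema


ports = listener_schema_sketch(
    {"cell_mass": 0.0, "ntp_counts": ([0, 0], ["ATP", "GTP"])}
)
```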

ecoli.library.schema.not_a_process(value)[source]: Returns True if not a vivarium.core.process.Process instance.

ecoli.library.schema.numpy_schema(name, emit=True)[source]

Helper function used in ports schemas for bulk and unique molecules

Parameters:

name (str) – bulk for bulk molecules or one of the keys in UNIQUE_DIVIDERS for unique molecules
emit (bool) – True if should be emitted (default)

Returns:

Fully configured ports schema for molecules of type name

Return type:

Dict[str, Any]

ecoli.library.schema.remove_properties(schema, properties)[source]

Helper function to recursively remove certain properties from a ports schema.

Parameters:

schema (Dict[str, Any]) – Ports schema to remove properties from
properties (List[str]) – List of properties to remove

Returns:

Ports schema with all properties in properties recursively removed.

Return type:

Dict[str, Any]
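
Recursive removal can be sketched with a small dictionary walk. remove_properties_sketch is a hypothetical name, and this version returns a new dictionary instead of modifying its input, which may differ from the actual function.

```python
def remove_properties_sketch(schema, properties):
    if isinstance(schema, dict):
        # Drop matching keys at this level, then recurse into the rest.
        return {
            key: remove_properties_sketch(value, properties)
            for key, value in schema.items()
            if key not in properties
        }
    return schema


original = {
    "bulk": {"_default": [], "_emit": True, "_divider": "binomial"},
    "listeners": {"mass": {"_default": 0.0, "_emit": True}},
}
stripped = remove_properties_sketch(original, ["_emit", "_divider"])
```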
