ecoli.experiments.ecoli_master_sim
Interface for configuring and running single-cell E. coli simulations.
Note
Simulations can be configured to divide through this interface, but full colony-scale simulations are best run using the ecoli_engine_process module for efficient multiprocessing.
- class ecoli.experiments.ecoli_master_sim.EcoliSim(config)[source]
Bases:
object
Main interface for running single-cell E. coli simulations. Typically instantiated using one of two methods: from_file() or from_cli().
Config options can be modified after the creation of an EcoliSim object in one of two ways:
sim.total_time = 100
sim.config['total_time'] = 100
- Parameters:
config (dict[str, Any]) – Automatically generated from SimConfig when EcoliSim is instantiated using from_file() or from_cli().
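Attribute-style and dictionary-style updates of the same option can be kept in sync by routing attribute reads and writes to the underlying dictionary. The following is a minimal sketch of that idea, assuming a hypothetical ConfigBackedSim class. It is an illustration only and does not show how EcoliSim is implemented.

```python
from typing import Any


class ConfigBackedSim:
    """Hypothetical stand-in for illustration only; not the real EcoliSim."""

    def __init__(self, config: dict[str, Any]) -> None:
        # Store the dict directly so it does not pass through __setattr__ below.
        object.__setattr__(self, "config", dict(config))

    def __getattr__(self, name: str) -> Any:
        # Only invoked when normal attribute lookup fails.
        config = self.__dict__.get("config", {})
        if name in config:
            return config[name]
        raise AttributeError(name)

    def __setattr__(self, name: str, value: Any) -> None:
        if name == "config":
            object.__setattr__(self, name, value)
        else:
            # Mirror attribute writes into the config dictionary.
            self.config[name] = value


sim = ConfigBackedSim({"total_time": 10})
sim.total_time = 100      # attribute-style update
sim.config["seed"] = 1    # dictionary-style update
print(sim.config["total_time"], sim.seed)
```

With this wiring, either form of update is visible through the other, which matches the two usage lines shown for EcoliSim.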
- _retrieve_process_configs(process_configs, processes)[source]
Sets up process configs to be interpreted by generate_processes_and_steps().
- _retrieve_processes(processes, add_processes, exclude_processes, swap_processes)[source]
Retrieve process classes from process_registry (processes are registered in ecoli/processes/__init__.py).
- Parameters:
processes (dict[str, str]) – Base list of process names to retrieve classes for
add_processes (list[str]) – Additional process names to retrieve classes for
exclude_processes (list[str]) – Process names to not retrieve classes for
swap_processes (dict[str, str]) – Mapping of process names to the names of the processes they should be swapped for. It is assumed that the swapped processes share the same topologies.
- Returns:
Mapping of process names to process classes.
- Return type:
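The interplay of the four arguments to _retrieve_processes() can be sketched with plain Python containers. This is a hedged illustration: the order in which additions, exclusions, and swaps are applied is an assumption, a plain dict stands in for process_registry, and all function, process, and class names below are hypothetical.

```python
def select_process_names(processes, add_processes, exclude_processes, swap_processes):
    """Return final process names; the add -> exclude -> swap order is assumed."""
    names = list(processes)
    names += [name for name in add_processes if name not in names]
    names = [name for name in names if name not in exclude_processes]
    return [swap_processes.get(name, name) for name in names]


def lookup_classes(names, registry):
    """Map each name to its class in a plain-dict stand-in for the registry."""
    return {name: registry[name] for name in names}


# Hypothetical process classes used only for this example.
class ProcA: ...
class ProcAAlt: ...
class ProcC: ...


demo_registry = {"proc-a": ProcA, "proc-a-alt": ProcAAlt, "proc-c": ProcC}

final_names = select_process_names(
    processes=["proc-a", "proc-b"],
    add_processes=["proc-c"],
    exclude_processes=["proc-b"],
    swap_processes={"proc-a": "proc-a-alt"},
)
process_classes = lookup_classes(final_names, demo_registry)
print(process_classes)
```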
- _retrieve_topology(topology, processes, swap_processes, log_updates)[source]
Retrieves topologies for processes from topology_registry.
- Parameters:
topology (dict[str, dict[str, tuple[str]]]) – Mapping of process names to user-specified topologies. Will be merged with the topology from topology_registry, if one exists.
processes (list[str]) – List of process names for which to retrieve topologies.
swap_processes (dict[str, str]) – Mapping of process names to the names of processes to swap them for. By default, the new processes are assumed to have the same topology as the processes they replaced. When this is not the case, users can add/modify the original process topology with custom values in topology under either the new or the old process name.
log_updates (bool) – Whether to emit process updates. Adds topology for the log_update port.
- Returns:
Mapping of process names to process topologies.
- Return type:
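The merge described for _retrieve_topology() can be sketched as follows. This is an illustration under stated assumptions: a plain dict stands in for topology_registry, processes is taken to hold the pre-swap names, the merge is done per port rather than recursively, and the function and process names are hypothetical.

```python
import copy


def merge_topologies(user_topology, processes, swap_processes, registry):
    """Combine registry topologies with user overrides, honoring swaps."""
    merged = {}
    for original in processes:
        final = swap_processes.get(original, original)
        # A swapped-in process starts from the topology of the one it replaces.
        topology = copy.deepcopy(registry.get(original, {}))
        # User overrides may be filed under the old or the new process name.
        for key in (original, final):
            topology.update(user_topology.get(key, {}))
        merged[final] = topology
    return merged


demo_registry = {"old-proc": {"bulk": ("bulk",), "listeners": ("listeners",)}}
demo_user_topology = {"new-proc": {"bulk": ("custom", "bulk")}}

result = merge_topologies(
    demo_user_topology, ["old-proc"], {"old-proc": "new-proc"}, demo_registry
)
print(result)
```

Copying the registry entry before updating it keeps the registered topology unchanged across repeated calls.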
- build_ecoli()[source]
Creates the E. coli composite. MUST be called before calling run().
For all processes in config['processes']:
1. Retrieves process class from process_registry, which is populated in ecoli/processes/__init__.py.
2. Retrieves process topology from topology_registry and merges it with the user-specified topology from config['topology'], if applicable.
3. Retrieves process configs from config['process_configs'] if present, else indicates that the process config should be loaded from pickled simulation data using get_config_by_name().
Adds spatial environment if config['spatial_environment'] is True. Spatial environment config options are loaded from config['spatial_environment_config']. See ecoli/composites/ecoli_configs/spatial.json for an example.
Note
When loading from a saved state with a file name of the format vivecoli_t{save time}, the simulation seed is automatically set to config['seed'] + {save_time} to prevent create_unique_indexes() from generating clashing indices.
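The seed adjustment for saved states can be sketched as below. The sketch assumes the save time is a non-negative integer and that the file may carry a directory and extension. The helper name offset_seed and the regular expression are hypothetical and are not taken from the module.

```python
import re
from pathlib import Path

# Matches file stems such as "vivecoli_t500" and captures the save time.
_SAVE_TIME_PATTERN = re.compile(r"^vivecoli_t(\d+)$")


def offset_seed(seed: int, initial_state_file: str) -> int:
    """Return seed + save time for vivecoli_t{time} files, else the seed unchanged."""
    match = _SAVE_TIME_PATTERN.match(Path(initial_state_file).stem)
    if match is None:
        return seed
    return seed + int(match.group(1))


print(offset_seed(7, "data/vivecoli_t500.json"))
print(offset_seed(7, "some_other_state.json"))
```

Shifting the seed by the save time means a resumed simulation does not replay the random draws of the original run, which is the stated reason for avoiding clashing unique indices.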
- ecoli
Contains the fully instantiated processes, steps, topologies, and flow necessary to run the simulation. Generated by build_ecoli() and cleared when run() is called to potentially free up memory after division.
- export_json(filename='/home/runner/work/vEcoli/vEcoli/ecoli/composites/ecoli_configs/export.json')[source]
Saves current simulation settings along with git hash and final list of process names as a JSON that can be reloaded using from_file().
- Parameters:
filename (str) – Filepath and name for saved JSON (include .json).
- static from_cli()[source]
Used to instantiate EcoliSim with a config loaded from the command-line arguments parsed by SimConfig.
- Return type:
- static from_file(filepath='/home/runner/work/vEcoli/vEcoli/ecoli/composites/ecoli_configs/default.json')[source]
Used to instantiate EcoliSim with a config loaded from the JSON at filepath by SimConfig.
- Parameters:
filepath – String filepath of JSON file with config options to apply on top of the options laid out in the default JSON located at the default value for filepath.
- Return type:
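Because from_file() applies a user JSON on top of the default JSON, a config file only needs to list the options that differ. The sketch below writes such a file using option names that appear elsewhere on this page. The values, the file name my_sim.json, and the helper write_config are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Option names come from this page; the values are illustrative assumptions.
overrides = {
    "total_time": 100,
    "seed": 1,
    "emitter": "timeseries",
    "save": True,
    "save_times": [50],
}


def write_config(options: dict, directory: str) -> Path:
    """Write options to a JSON file (hypothetical name) and return its path."""
    path = Path(directory) / "my_sim.json"
    path.write_text(json.dumps(options, indent=4))
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = write_config(overrides, tmp_dir)
    reloaded = json.loads(config_path.read_text())
    print(reloaded)
```

A file written this way could then be supplied as the filepath argument of from_file().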
- generated_initial_state
Fully populated initial state for simulation. Generated by build_ecoli() and cleared when run() is called to potentially free up memory after division.
- Type:
- get_metadata()[source]
Compiles all simulation settings, git hash, and process list into a single dictionary.
- get_output_metadata()[source]
Filters all ports schemas to include only output metadata located at the path ('_properties', 'metadata') for each schema by invoking extract_metadata(). See listener_schema() for usage details.
This dictionary of output metadata is flattened (see flatten_dict()) into columns with prefix METADATA_PREFIX and emitted as part of the simulation config by the Parquet emitter. It can be retrieved later using get_field_metadata().
- merge(other)[source]
Combine settings from this EcoliSim with another, overriding current settings with those from the other EcoliSim.
- Parameters:
other (EcoliSim) – Simulation with settings to override current simulation.
- query(query=None)[source]
Query data that was emitted to RAMEmitter (config['emitter'] == 'timeseries'). For the Parquet emitter, query sim output with an analysis script run using runscripts.analysis or with ad-hoc DuckDB SQL queries built using get_dataset_sql() as a base.
- Parameters:
query (list[tuple[str]] | None) – List of tuple-style paths in the simulation state to retrieve emitted values for. Returns all emitted data if None.
- Returns:
Dictionary of emitted data in one of two forms.
Raw data (if self.raw_output): Data is keyed by time (e.g. {0: {'data': ...}, 1: {'data': ...}, ...}).
Timeseries: Data is reorganized to match the structure of the simulation state. Leaf values in the returned dictionary are lists of the simulation state value over time (e.g. {'data': [..., ..., ...]}).
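The relationship between the two return forms can be illustrated with a small conversion from time-keyed data to per-leaf lists. This is a sketch, not the conversion used by query(). The function names and sample values are hypothetical, and it assumes every time point contains the same set of leaves.

```python
def raw_to_timeseries(raw: dict) -> dict:
    """Turn {time: nested_state} into a nested dict whose leaves are lists."""
    timeseries: dict = {}
    for time in sorted(raw):
        _append_leaves(timeseries, raw[time])
    return timeseries


def _append_leaves(target: dict, state: dict) -> None:
    """Append each leaf of state to the matching list in target."""
    for key, value in state.items():
        if isinstance(value, dict):
            _append_leaves(target.setdefault(key, {}), value)
        else:
            target.setdefault(key, []).append(value)


# Hypothetical emitted data keyed by time.
raw_output = {
    0: {"data": {"mass": 1.0, "volume": 0.9}},
    1: {"data": {"mass": 1.5, "volume": 1.1}},
}
print(raw_to_timeseries(raw_output))
```

If a leaf is missing at some time point, the lists produced here fall out of alignment, so a fuller version would need to pad the gaps.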
- run()[source]
Create and run an EcoliSim experiment.
Warning
Run build_ecoli() before calling run()!
- save_states(daughter_outdir='')[source]
Runs the simulation while saving the states of specific timesteps to files named data/vivecoli_t{time}.json. Invoked by run() if config['save'] == True. State is saved as a JSON that can be reloaded into a simulation as described in initial_state().
- Parameters:
daughter_outdir (str) – Location to write JSON files for daughter cell(s). Only used if config contains a generations key specifying the number of generations to simulate. Nextflow chains simulations together by passing saved daughter states to new processes.
- ecoli.experiments.ecoli_master_sim.LIST_KEYS_TO_MERGE = ('save_times', 'add_processes', 'exclude_processes', 'processes', 'engine_process_reports', 'initial_state_overrides')
Special configuration keys whose values are lists. When one of these keys is found in multiple sources (e.g. default JSON and user-specified JSON), the lists are concatenated instead of being directly overridden.
- class ecoli.experiments.ecoli_master_sim.SimConfig(config=None, parser=None)[source]
Bases:
object
Stores configuration options for a simulation. Has dictionary-like interface (e.g. bracket indexing, get, keys).
- config
Current configuration.
- parser
Argument parser for the command-line interface.
- Parameters:
config (Dict[str, Any] | None) – Configuration options. If not provided, the default configuration is loaded from the file path default_config_path.
parser (ArgumentParser | None) – Useful for scripts that leverage the inheritance features of the JSON config files but want to have their own CLI args for clarity.
- default_config_path = '/home/runner/work/vEcoli/vEcoli/ecoli/composites/ecoli_configs/default.json'
Path to default JSON configuration file.
- static merge_config_dicts(d1, d2)[source]
Helper function to safely merge two config dictionaries. Config options whose values are lists (e.g. save_times, add_processes, etc.) are handled separately so that the lists from each config are concatenated in the merged output.
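The list-concatenating merge can be sketched as follows. The key tuple is copied from LIST_KEYS_TO_MERGE on this page. The recursive handling of nested dictionaries, the choice to return a new dictionary rather than mutate an input, and the name merge_configs are assumptions made for this illustration.

```python
LIST_KEYS = (
    "save_times",
    "add_processes",
    "exclude_processes",
    "processes",
    "engine_process_reports",
    "initial_state_overrides",
)


def merge_configs(base: dict, override: dict) -> dict:
    """Return a new dict in which override wins, except that list keys concatenate."""
    merged = dict(base)
    for key, value in override.items():
        if key in LIST_KEYS:
            # Lists from both sources are kept, base entries first.
            merged[key] = list(merged.get(key, [])) + list(value)
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumed behavior: nested dictionaries are merged key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


default_config = {"total_time": 10, "save_times": [5], "process_configs": {"a": {"x": 1}}}
user_config = {"total_time": 100, "save_times": [50], "process_configs": {"a": {"y": 2}}}
merged_config = merge_configs(default_config, user_config)
print(merged_config)
```

Concatenation lets a user JSON add save times or processes to those in the default JSON without restating them.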
- exception ecoli.experiments.ecoli_master_sim.TimeLimitError[source]
Bases:
RuntimeError
Error raised when fail_at_total_time is True and simulation reaches total_time.
- ecoli.experiments.ecoli_master_sim.extract_metadata(ports_schema, properties=False)[source]
Filters ports schema to contain only a mapping of ports to user-supplied metadata (pulled from path ('_properties', 'metadata') for each schema). See listener_schema() for usage details.
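The filtering step can be pictured as a recursive walk that keeps only the value stored under ('_properties', 'metadata') for each leaf schema. The sketch below assumes a nested-dictionary ports schema in which schema settings use keys that start with an underscore. It ignores the properties argument, and the function name and sample schema are hypothetical.

```python
def pull_metadata(schema):
    """Return nested metadata found under '_properties' -> 'metadata', or None."""
    if not isinstance(schema, dict):
        return None
    if "_properties" in schema:
        # Leaf schema: keep only the user-supplied metadata, if any.
        return schema["_properties"].get("metadata")
    found = {}
    for key, sub_schema in schema.items():
        if isinstance(key, str) and key.startswith("_"):
            continue  # Skip schema settings such as '_default'.
        metadata = pull_metadata(sub_schema)
        if metadata is not None:
            found[key] = metadata
    return found or None


# Hypothetical ports schema with metadata on a single listener.
demo_schema = {
    "listeners": {
        "mass": {
            "cell_mass": {"_default": 0.0, "_properties": {"metadata": "fg"}},
            "dry_mass": {"_default": 0.0},
        }
    },
    "bulk": {"_default": []},
}
print(pull_metadata(demo_schema))
```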
- ecoli.experiments.ecoli_master_sim.get_git_revision_hash()[source]
Returns current Git hash for model repository to include in metadata that is emitted when starting a simulation.
- Return type:
- ecoli.experiments.ecoli_master_sim.get_git_status()[source]
Returns Git status of model repository to include in metadata that is emitted when starting a simulation.
- Return type:
- ecoli.experiments.ecoli_master_sim.parse_key_value_args(args_list)[source]
Parses key-value pairs specified as strings of the form key=value via CLI. See emitter_arg option in SimConfig.
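Parsing key=value strings can be sketched as below. The sketch keeps every value as a string and splits on the first equals sign only. Whether parse_key_value_args() converts value types or validates input this way is not stated here, and the function name and example keys are hypothetical.

```python
def parse_pairs(args_list: list[str]) -> dict[str, str]:
    """Convert ['key=value', ...] into a dictionary of strings."""
    parsed = {}
    for arg in args_list:
        # partition splits on the first '=' so values may contain '='.
        key, separator, value = arg.partition("=")
        if not separator or not key:
            raise ValueError(f"Expected key=value, got {arg!r}")
        parsed[key] = value
    return parsed


print(parse_pairs(["out_dir=out", "filter=a=b"]))
```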
- ecoli.experiments.ecoli_master_sim.prepare_save_state(state)[source]
Prepares simulation state to be saved to a JSON file by pruning unsaveable values and adding necessary metadata. Mutates in-place.
- ecoli.experiments.ecoli_master_sim.report_profiling(stats)[source]
Prints out a summary of profiling statistics when profile option is True in the config given to EcoliSim.
- Parameters:
stats (Stats) – Profiling statistics.
- Return type:
None
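A summary of a Stats object can be produced with the standard-library pstats module. The sketch below profiles a hypothetical workload and prints the most expensive calls by cumulative time. The sort order, the row limit, and the function names are assumptions and may differ from what report_profiling() prints.

```python
import cProfile
import io
import pstats


def print_profile_summary(stats: pstats.Stats, limit: int = 10) -> None:
    """Print up to limit entries, sorted by cumulative time."""
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)


def busy_work() -> int:
    # Hypothetical workload standing in for a simulation run.
    return sum(i * i for i in range(50_000))


profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

# Collect the report in a buffer so it can be printed or logged as one string.
buffer = io.StringIO()
print_profile_summary(pstats.Stats(profiler, stream=buffer), limit=5)
print(buffer.getvalue())
```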