Processes

Note

This document assumes that you are familiar with the basic concepts underlying Vivarium processes as outlined in the vivarium-core documentation. Please read the linked page in full before continuing.

The vEcoli model consists of many sub-models that each represent a biological mechanism. We refer to these sub-models as processes. At their core, processes are Python classes that inherit from either Process or Step. For now, we will ignore the differences between these two base classes and refer to both as processes.

Registration

In order for a process to be recognized by our main simulation runscript (ecoli_master_sim) by name, it must be registered in a few key places. First, a process must be registered in the process_registry in the ecoli/processes/__init__.py file.

Second, its topology must be registered in topology_registry. This is usually accomplished by placing the following lines at the top of the process file:

from ecoli.processes.registries import topology_registry
topology_registry.register({process name}, {process topology})

This second step is not strictly necessary because topologies for processes can be supplied manually at runtime using the topology configuration key (see Configuration). Topologies specified using this configuration option always take precedence over any topologies registered for the same process in the topology registry. The topology registry is mainly a convenience designed to keep all code for a process inside the process file (as opposed to having the topologies for processes in the configuration JSON files).
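The precedence rule can be illustrated with a minimal, self-contained sketch. The TopologyRegistry class, the resolve_topology() helper, and the process name "example-process" below are hypothetical stand-ins and not the vEcoli implementation. In particular, the sketch assumes that a configured topology overrides the registered one port by port; this document does not say whether the real code merges or replaces.

```python
# Hypothetical stand-in for the registry in ecoli.processes.registries.
class TopologyRegistry:
    def __init__(self):
        self._registry = {}

    def register(self, name, topology):
        self._registry[name] = topology

    def access(self, name):
        return self._registry.get(name)


topology_registry = TopologyRegistry()
topology_registry.register(
    "example-process",  # hypothetical process name
    {"bulk": ("bulk",), "listeners": ("listeners",)},
)


def resolve_topology(name, config):
    """Combine the registered topology with any configured override.

    Assumption: ports given under the "topology" configuration key win
    over ports of the same name in the registered topology.
    """
    registered = topology_registry.access(name) or {}
    configured = config.get("topology", {}).get(name, {})
    resolved = {**registered, **configured}
    if not resolved:
        raise KeyError(f"No topology found for process {name!r}")
    return resolved


# No override: the registered topology is used as-is.
default_topology = resolve_topology("example-process", {})

# Override supplied at runtime: the configured port path takes precedence.
override_config = {
    "topology": {"example-process": {"bulk": ("alternate", "bulk")}}
}
override_topology = resolve_topology("example-process", override_config)
```

In this sketch, a process with no registered topology can still be used as long as the configuration supplies one, which mirrors why registration in the topology registry is optional.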

Parameters

Nearly all processes in our model require parameters that the parameter calculator (ParCa) computes from raw experimental data. Refer to build_ecoli() and generate_processes_and_steps() for details on how process parameters are loaded from ParCa output (and elsewhere) to configure the processes.

Partitioned Processes

While Vivarium processes are required to implement the next_update() method, you may notice that many processes in ecoli.processes do not have this method and inherit from the PartitionedProcess base class instead of Process or Step. During sim initialization, generate_processes_and_steps() uses each PartitionedProcess to create two processes that inherit from Step and have the required next_update() method: a Requester and an Evolver. These processes share an initialized PartitionedProcess instance, meaning both have access to all the same parameters that the partitioned process was instantiated with and any changes made to instance variables are seen by both. Refer to Partitioning for more details.
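The sharing arrangement can be sketched in plain Python without vivarium-core. All class names below are hypothetical, the calculate_request() method name is an assumption (only evolve_state() is named in this document), and the real Requester and Evolver classes do considerably more. The point is only that both wrappers hold a reference to the same instance, so state cached by one is visible to the other.

```python
class PartitionedProcessSketch:
    """Hypothetical stand-in for a PartitionedProcess subclass."""

    def __init__(self, parameters):
        self.parameters = parameters
        self.last_request = None  # instance variable seen by both wrappers

    def calculate_request(self, timestep, states):
        # Assumed method name: decide how much of a resource to ask for.
        amount = min(states["bulk"], self.parameters["max_request"])
        self.last_request = {"bulk": amount}
        return self.last_request

    def evolve_state(self, timestep, states):
        # Consume exactly what was requested earlier in the time step.
        return {"bulk": -self.last_request["bulk"]}


class RequesterSketch:
    """Wrapper exposing next_update() by delegating to calculate_request()."""

    def __init__(self, process):
        self.process = process

    def next_update(self, timestep, states):
        return {"request": self.process.calculate_request(timestep, states)}


class EvolverSketch:
    """Wrapper exposing next_update() by delegating to evolve_state()."""

    def __init__(self, process):
        self.process = process

    def next_update(self, timestep, states):
        return self.process.evolve_state(timestep, states)


# One shared instance backs both wrappers.
shared = PartitionedProcessSketch({"max_request": 5})
requester = RequesterSketch(shared)
evolver = EvolverSketch(shared)

request_update = requester.next_update(1.0, {"bulk": 8})
evolve_update = evolver.next_update(1.0, {"bulk": 8})
```

Because the Evolver reads a value that the Requester wrote to the shared instance, the order in which the two run matters in this sketch.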

Connecting to Stores

To be integrated into the broader model, all processes are required to implement the ports_schema() method and define a topology dictionary. The vEcoli model includes two convenience features to help with this.

Nearly all processes import the ecoli.processes.registries.topology_registry and register their topologies under their unique string name, allowing _retrieve_topology() to automatically retrieve topologies for each process at runtime (called by build_ecoli()).
The three main types of stores in vEcoli (bulk molecules, unique molecules, and listeners) all have helper functions to concisely generate schemas for use in the ports_schema() methods of processes (see Stores).

Time Steps

Processes that inherit from Process are automatically able to run with a time step that the user can supply using the time_step key in the parameter dictionary. However, most processes in vEcoli inherit from Step and not Process. Instead of running with a certain time step, Steps, by default, are run at the end of every time step where at least one Process ran.

To change this to allow our Steps to run with a time step like a Process, we:

Added a top-level store to hold the global simulation time step at ("timestep",).
Added a top-level store to hold the global time at ("global_time",) with a default value of 0.
Added a store for each process located at ("next_update_time", "process_name") whose default value is the value held in ("timestep",).
Added logic to the next_update() methods (evolve_state() for partitioned processes) to increment ("next_update_time", "process_name") by ("timestep",) every time the Step is run.
Added a GlobalClock process that calculates the smallest difference between the current ("global_time",) and each Step’s ("next_update_time", "process_name"). This process has a custom calculate_timestep() method to tell vivarium-core to only run this process after its internal simulation clock reaches the soonest update time for another process. At that time, this process advances ("global_time",) to match the internal clock. Taken together, these actions guarantee that we never accidentally skip over a Step’s scheduled update time and also that our manual time stepping scheme stays perfectly in sync with vivarium-core’s built-in time stepping.
Added a custom update_condition() method to most Steps which tells vivarium-core to only run a given Step when ("next_update_time", "process_name") is less than or equal to ("global_time",).
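The interplay of these pieces can be sketched as a small, self-contained loop. This is a simplified illustration using a plain dictionary in place of vivarium-core stores. The function names clock_timestep(), run_step(), and simulate() and the Step names "step_a" and "step_b" are hypothetical, and the real GlobalClock and engine behave differently in their details.

```python
# Plain-dictionary stand-in for the vivarium-core store hierarchy.
state = {
    "timestep": 2.0,     # ("timestep",)
    "global_time": 0.0,  # ("global_time",)
    # ("next_update_time", "process_name"), defaulting to the time step
    "next_update_time": {"step_a": 2.0, "step_b": 2.0},
}


def clock_timestep(state):
    """Time until the soonest scheduled Step update (the clock's job)."""
    soonest = min(state["next_update_time"].values())
    return soonest - state["global_time"]


def update_condition(state, name):
    """A Step runs only once the global time reaches its scheduled time."""
    return state["next_update_time"][name] <= state["global_time"]


def run_step(state, name):
    """Placeholder for next_update(): schedule the Step's next run."""
    state["next_update_time"][name] += state["timestep"]


def simulate(state, total_time):
    """Advance the clock to each scheduled update and run the due Steps."""
    run_log = []
    while state["global_time"] < total_time:
        state["global_time"] += clock_timestep(state)
        for name in sorted(state["next_update_time"]):
            if update_condition(state, name):
                run_step(state, name)
                run_log.append((state["global_time"], name))
    return run_log


run_log = simulate(state, total_time=4.0)
```

Since the clock only ever advances to the soonest scheduled update, no Step's scheduled time is skipped in this sketch, which is the property the scheme above is designed to guarantee.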

This manual time stepping scheme highlights a guiding philosophy of models built with vivarium-core: storing simulation values in stores wherever possible. This is what makes our processes modular while still facilitating communication between processes. For example, say we wanted to dynamically modulate the time step over the course of a simulation. By storing the time step for all the relevant Steps in the same ("timestep",) store, a Process or Step only needs to modify this store for all Steps to register this change. Conversely, say we wanted to have each Step run with its own time step instead of a global time step. We could implement this by simply changing the topologies of each Step to connect to a dedicated time step store ("timestep", "process_name"), unlinking the time steps for each Step.
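The difference between a shared and a dedicated time step store comes down to where each Step's port points. The following sketch uses nested dictionaries in place of stores. The port name "timestep", the Step names, and the read_port() helper are assumptions made for illustration only.

```python
def read_port(stores, topology, step, port):
    """Follow a Step's topology path to the value its port is wired to."""
    value = stores
    for key in topology[step][port]:
        value = value[key]
    return value


# Shared layout: every Step's port points at the same ("timestep",) store.
shared_stores = {"timestep": 2.0}
shared_topology = {
    "step_a": {"timestep": ("timestep",)},
    "step_b": {"timestep": ("timestep",)},
}

# Dedicated layout: each Step points at ("timestep", "process_name").
dedicated_stores = {"timestep": {"step_a": 2.0, "step_b": 2.0}}
dedicated_topology = {
    "step_a": {"timestep": ("timestep", "step_a")},
    "step_b": {"timestep": ("timestep", "step_b")},
}

# Changing the shared store changes the time step seen by every Step.
shared_stores["timestep"] = 0.5
shared_view = {
    step: read_port(shared_stores, shared_topology, step, "timestep")
    for step in shared_topology
}

# Changing one dedicated store leaves the other Step untouched.
dedicated_stores["timestep"]["step_a"] = 0.5
dedicated_view = {
    step: read_port(dedicated_stores, dedicated_topology, step, "timestep")
    for step in dedicated_topology
}
```

Only the topology and store layout differ between the two cases; the code that reads the port is identical, which is the modularity the paragraph above describes.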

Note

The above scheme is automatically implemented for processes that inherit from PartitionedProcess when they are used to create Requester and Evolver Steps.