ecoli.analysis.single.blame

class ecoli.analysis.single.blame.DivergingNormalize(transform_log=True, within=None)[source]

Bases: Normalize

class ecoli.analysis.single.blame.SignNormalize(vmin=None, vmax=None, clip=False)[source]

Bases: Normalize

Parameters:
  • vmin (float or None) – Values within the range [vmin, vmax] from the input data will be linearly mapped to [0, 1]. If either vmin or vmax is not provided, they default to the minimum and maximum values of the input, respectively.

  • vmax (float or None) – Values within the range [vmin, vmax] from the input data will be linearly mapped to [0, 1]. If either vmin or vmax is not provided, they default to the minimum and maximum values of the input, respectively.

  • clip (bool, default: False) –

    Determines the behavior for mapping values outside the range [vmin, vmax].

    If clipping is off, values outside the range [vmin, vmax] are also transformed, resulting in values outside [0, 1]. This behavior is usually desirable, as colormaps can mark these under and over values with specific colors.

    If clipping is on, values below vmin are mapped to 0 and values above vmax are mapped to 1. Such values become indistinguishable from regular boundary values, which may cause misinterpretation of the data.

Notes

If vmin == vmax, input data will be mapped to 0.
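The vmin, vmax, and clip semantics described above can be illustrated with a small pure-Python sketch. This is not the SignNormalize implementation (its sign-specific behavior is not documented here); the helper name linear_normalize is hypothetical and only mirrors the linear mapping, the default bounds, and the clipping behavior stated in the parameter descriptions.

```python
def linear_normalize(values, vmin=None, vmax=None, clip=False):
    """Hypothetical helper: map values linearly so vmin -> 0.0 and vmax -> 1.0."""
    values = list(values)
    # Missing bounds default to the extremes of the input data.
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    if lo == hi:
        # Degenerate range: every input maps to 0, as stated in the Notes.
        return [0.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    if clip:
        # With clipping on, out-of-range values collapse onto 0 and 1.
        scaled = [min(1.0, max(0.0, s)) for s in scaled]
    return scaled


in_range = linear_normalize([0, 5, 10])
unclipped = linear_normalize([-5, 15], vmin=0, vmax=10)
clipped = linear_normalize([-5, 15], vmin=0, vmax=10, clip=True)
```

With clipping off, the out-of-range inputs stay distinguishable (they land outside [0, 1]); with clipping on, they become identical to the boundary values.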

ecoli.analysis.single.blame.blame_plot(data, topology, bulk_ids, filename='out/ecoli_sim/blame.png', selected_molecules=None, selected_processes=None, highlight_molecules=None, label_values=True, color_normalize='n')[source]

Given data from a simulation with logged updates (e.g. by running from CLI with the --log_updates flag set), generates a heatmap where the columns are processes, rows are molecules, and cell colors reflect the average rate of change in a molecule over the whole simulation due to a particular process.

Parameters:
  • data – Data from a logged ecoli simulation.

  • topology – Topology of logged ecoli simulation (e.g. sim.ecoli_experiment.topology).

  • bulk_ids – Array of bulk IDs in correct order (can get from initial_state).

  • filename – The file to save the plot to. To skip writing to file, set this to None.

  • selected_molecules – If not None, restricts the plot to the specified molecules.

  • selected_processes – If not None, restricts the plot to the specified processes.

  • highlight_molecules – A collection of molecules to highlight in red (or None).

  • label_values – Whether to numerically label the heatmap cells with their values.

  • color_normalize – Whether to normalize values within (p)rocesses, (m)olecules, or (n)either, given as 'p', 'm', or 'n'.

Returns:

matplotlib axes and figure.
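As a rough illustration of the quantity behind the heatmap, the sketch below averages per-process count changes over the simulation and then optionally rescales within processes or molecules. The input layout (a nested list indexed as deltas[t][m][p]), the helper names, and the choice to normalize by the largest absolute value are all assumptions made for this example; the actual blame_plot code may compute and normalize these values differently.

```python
def average_rates(deltas, total_time):
    """Assumed layout: deltas[t][m][p] is the change in molecule m
    caused by process p at timestep t."""
    n_mols = len(deltas[0])
    n_procs = len(deltas[0][0])
    return [
        [sum(step[m][p] for step in deltas) / total_time for p in range(n_procs)]
        for m in range(n_mols)
    ]


def normalize_within(matrix, mode="n"):
    """Scale by the largest absolute value within each (p)rocess column,
    each (m)olecule row, or (n)either. Rows are molecules, columns processes."""
    if mode == "n":
        return [row[:] for row in matrix]
    if mode == "m":
        result = []
        for row in matrix:
            peak = max(abs(v) for v in row) or 1.0  # avoid dividing by zero
            result.append([v / peak for v in row])
        return result
    if mode == "p":
        n_procs = len(matrix[0])
        peaks = [max(abs(row[p]) for row in matrix) or 1.0 for p in range(n_procs)]
        return [[v / peaks[p] for p, v in enumerate(row)] for row in matrix]
    raise ValueError("mode must be 'p', 'm', or 'n'")


# Two timesteps, two molecules, two processes.
deltas = [[[2, -4], [0, 1]], [[4, -4], [0, 3]]]
rates = average_rates(deltas, total_time=2)
by_molecule = normalize_within(rates, "m")
by_process = normalize_within(rates, "p")
```

Normalizing within molecules makes it easy to see which process dominates each row, while normalizing within processes shows which molecules each process affects most.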

ecoli.analysis.single.blame.blame_timeseries(data, topology, bulk_ids, molecules, filename=None, yscale='linear')[source]

Generates timeseries blame plots for the selected bulk molecules, assuming that bulk data is an array of counts ordered by bulk_ids, and saves them to the specified output file. Timeseries blame plots show the change in molecule counts due to each process at each timestep. For convenience, plots of the exact counts are included alongside.

Example usage:

sim = EcoliSim.from_file()
sim.build_ecoli()
sim.run()
data = sim.query()
data = {key: val['agents']['0'] for key, val in data.items()}
store_configs = sim.ecoli_experiment.get_config()
bulk_ids = store_configs['agents']['0']['bulk']['_properties']['metadata']
blame_timeseries(data, sim.topology, bulk_ids,
                ['WATER[c]', 'APORNAP-CPLX[c]', 'TRP[c]'],
                'out/ecoli_master/test_blame_timeseries.png',
                yscale="linear")
Parameters:
  • data (dict) – Data from an experiment (for experiments with cell division, ensure that bulk is a top-level field in the sub-dictionaries for each time point)

  • topology (dict) – Experiment topology (used to determine which processes are connected to bulk and how)

  • bulk_ids (list[str]) – List (or array) of bulk molecule names in the order they appear in the structured bulk Numpy array (see Bulk Molecules). Typically retrieved from simulation config metadata.

  • molecules (list[str]) – List of bulk molecule names to plot data for

  • filename (str | None) – Path to save plot to (optional)

  • yscale (str) – See matplotlib.pyplot.yscale()

Returns:

Axes and figure

Return type:

tuple[Axes, Figure]

ecoli.analysis.single.blame.extract_bulk(data, bulk_processes, bulk_ids)[source]

Returns bulk updates in the form of the array collected_data with dimensions (n_bulk_mols x n_processes), where n_processes is given by the keys that are shared by bulk_processes and data['log_update']. Shared processes are also returned in order.
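The shape of the result can be sketched as follows. The real structure of data['log_update'] is not described here, so this example assumes a simplified one (each process name maps to a list with one value per bulk molecule) and assumes the shared processes follow the order of bulk_processes; collect_shared is a hypothetical name, not part of the module.

```python
def collect_shared(bulk_processes, log_update, n_bulk_mols):
    """Hypothetical helper: build an (n_bulk_mols x n_processes) nested list."""
    # Keep only processes present in both inputs, in bulk_processes order.
    shared = [name for name in bulk_processes if name in log_update]
    # One row per bulk molecule, one column per shared process.
    collected = [
        [log_update[name][m] for name in shared] for m in range(n_bulk_mols)
    ]
    return collected, shared


bulk_processes = ["a", "b", "c"]
log_update = {"c": [1, 2], "a": [3, 4], "x": [9, 9]}
collected, shared = collect_shared(bulk_processes, log_update, n_bulk_mols=2)
```

Here "b" is dropped because it has no logged update, and "x" is dropped because it is not a bulk process.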

ecoli.analysis.single.blame.get_bulk_processes(topology)[source]
ecoli.analysis.single.blame.idx_array_from(dictionary)[source]
ecoli.analysis.single.blame.plot(params, conn, history_sql, config_sql, sim_data_paths, validation_data_paths, outdir, variant_metadata, variant_name)[source]
ecoli.analysis.single.blame.preprocess_data(data, bulk_ids, bulk_processes, molecules)[source]

Prepares raw data for blame-timeseries plot. Returns data in the form time, process, values_array where time is a numpy array of times, process is a list of process names, and values_array is a numpy array of the form (molecule x time x process).
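To make the (molecule x time x process) layout concrete, the sketch below sums over the last axis to get the net change in each molecule at each timestep. It uses plain nested lists in place of a numpy array, and net_change is a hypothetical helper introduced only for this illustration.

```python
def net_change(values_array):
    """Sum over the process axis of a (molecule x time x process) nested list."""
    return [
        [sum(per_process) for per_process in per_time]
        for per_time in values_array
    ]


# Two molecules, two timesteps, two processes.
values_array = [
    [[1, -2], [3, 0]],   # molecule 0: changes per process at t0 and t1
    [[0, 0], [5, -5]],   # molecule 1
]
totals = net_change(values_array)
```

The entry values_array[m][t][p] is the change in molecule m at timestep t attributed to process p, so totals[m][t] is the combined change across all processes.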

ecoli.analysis.single.blame.reposition_ticks(ax, x='bottom', y='left')[source]
ecoli.analysis.single.blame.sign_str(val)[source]
ecoli.analysis.single.blame.signed_stacked_bar(ax, x, y, bar_labels)[source]

Parameters:
  • ax – Axes object.

  • x – x values (1d array).

  • y – y values (len(x) columns by number of stacked bars rows).

Creates a stacked bar chart in the specified Axes, where y’s with negative values represent bars below y=0, and y’s with positive values represent bars above y=0.
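One way to get this layout is to track separate running totals for the positive and negative stacks at each x position. The sketch below computes only the bar bottoms, under that assumption; stacked_bottoms is a hypothetical helper and may not match how signed_stacked_bar is implemented.

```python
def stacked_bottoms(y):
    """y[row][col] is the value of stacked series `row` at x position `col`.
    Returns the starting height (bottom) for every bar."""
    n_cols = len(y[0])
    pos_top = [0.0] * n_cols  # current height of the stack above y=0
    neg_top = [0.0] * n_cols  # current depth of the stack below y=0
    bottoms = []
    for series in y:
        row = []
        for col, value in enumerate(series):
            if value >= 0:
                row.append(pos_top[col])
                pos_top[col] += value
            else:
                row.append(neg_top[col])
                neg_top[col] += value
        bottoms.append(row)
    return bottoms


y = [[1, -2], [3, -1], [-4, 5]]
bottoms = stacked_bottoms(y)
```

Each row of bottoms could then be passed as the bottom argument of matplotlib's Axes.bar when drawing the corresponding series, so positive bars grow upward from the positive stack and negative bars grow downward from the negative stack.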