reconstruction.ecoli.dataclasses.process.transcription
SimulationData for transcription process
TODO: add mapping of tRNA to charged tRNA if allowing more than one modified form of tRNA and separate mappings for tRNA and charged tRNA to AA
TODO: handle ppGpp and DksA-ppGpp regulation separately
- class reconstruction.ecoli.dataclasses.process.transcription.Transcription(raw_data, sim_data)[source]
Bases: object
SimulationData for the transcription process
- _apply_rnaseq_correction()[source]
Applies correction to RNAseq data for shorter genes as required when operon structure is included in the model.
- _build_attenuation(raw_data, sim_data)[source]
Load fold changes related to transcriptional attenuation.
- _build_charged_trna(raw_data, sim_data)[source]
Loads information and creates data structures necessary for charging of tRNA
Note
Requires self.rna_data, so it can't be built in translation even if some data structures would be more appropriate there.
- _build_cistron_data(raw_data, sim_data)[source]
Build cistron-associated simulation data from raw data. Cistrons are sections of RNAs that encode a specific polypeptide. A single RNA molecule may contain one or more cistrons.
- _build_mature_rna_data(raw_data, sim_data)[source]
Build mature RNA-associated simulation data from raw data.
- _build_new_gene_data(raw_data, sim_data)[source]
Load baseline values for new gene expression in all simulations.
- _build_oric_terc_coordinates(raw_data, sim_data)[source]
Builds coordinates of oriC and terC that are used when calculating genomic positions of cistrons and RNAs relative to the origin
- _build_ppgpp_regulation(raw_data, sim_data)[source]
Determine which genes are regulated by ppGpp and store the fold change in expression associated with each gene.
- Attributes set:
- ppgpp_regulated_genes (ndarray[str]): cistron ID of regulated genes
- ppgpp_fold_changes (ndarray[float]): log2 fold change for each gene in ppgpp_regulated_genes
- _ppgpp_growth_parameters: parameters for interpolate.splev to estimate growth rate from ppGpp concentration
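The spline lookup mentioned for _ppgpp_growth_parameters can be sketched as follows. This is a minimal illustration, not the model's code: the data points, the spline degree (k=1), and the helper name growth_rate_from_ppgpp are all assumptions made for the example.

```python
import numpy as np
from scipy import interpolate

# Hypothetical (ppGpp concentration, growth rate) pairs; not model data.
ppgpp_conc = np.array([20.0, 40.0, 60.0, 80.0, 100.0])
growth_rate = np.array([1.6, 1.1, 0.8, 0.55, 0.4])

# splrep returns the (knots, coefficients, degree) tuple that splev expects.
# k=1 (piecewise linear) is an assumption chosen to keep the example simple.
spline_params = interpolate.splrep(ppgpp_conc, growth_rate, k=1)


def growth_rate_from_ppgpp(ppgpp):
    """Evaluate the fitted spline at a ppGpp concentration (hypothetical helper)."""
    return float(interpolate.splev(ppgpp, spline_params))
```

Storing only the spline parameters keeps the fitted relationship as plain arrays, which can then be evaluated wherever a growth rate estimate is needed.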
- _build_transcription(raw_data, sim_data)[source]
Build transcription-associated simulation data from raw data.
- _get_relative_coordinates(coordinates)[source]
Returns the genomic coordinates of a given gene coordinate relative to the origin of replication.
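One way to express coordinates relative to the origin on a circular chromosome is sketched below. The genome length, the oriC and terC positions, the sign convention (positive on one replichore, negative on the other), and the function name are assumptions for illustration and may differ from the actual implementation.

```python
import numpy as np

# Hypothetical values for a toy circular genome; not E. coli data.
GENOME_LENGTH = 1000
ORIC = 800
TERC = 300


def relative_to_origin(coordinates):
    """Map absolute coordinates to signed distances from oriC.

    Shifting by terC before wrapping places the wrap-around point at the
    terminus, so positions on either side of oriC get opposite signs.
    """
    coordinates = np.asarray(coordinates)
    return (coordinates - TERC) % GENOME_LENGTH + TERC - ORIC


relative = relative_to_origin([800, 900, 100, 700])
```

With these toy values, a position 100 bases past oriC maps to 100, a position 100 bases before it maps to -100, and a position reached by wrapping past the end of the linear coordinate system (100) maps to 300.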
- _solve_ppgpp_km(raw_data, sim_data)[source]
Solves for general expression rates for bound and free RNAP and a KM for ppGpp binding to RNAP based on global cellular measurements. Parameters are solved for at different doubling times using a gradient descent method to minimize the difference in expression of stable RNA compared to the measured RNA in a cell. Assumes a Hill coefficient of 2 for ppGpp binding to RNAP.
- Attributes set:
- _fit_ppgpp_fc (float): log2 fold change in stable RNA expression from a fast doubling time to a slow doubling time based on the rates of bound and free RNAP expression found
- _ppgpp_km_squared (float): squared, unitless KM value, stored to limit the computation needed for the fraction bound
- ppgpp_km (float with mol / volume units): KM for ppGpp binding to RNAP
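The Hill-type binding assumed above (coefficient of 2) gives a bound fraction of the form ppGpp² / (KM² + ppGpp²), which is why a precomputed squared KM saves work. The sketch below is a standalone illustration with unitless inputs; the function name and the numbers are hypothetical.

```python
def fraction_bound(ppgpp, km):
    """Hill equation with coefficient 2: fraction of RNAP bound to ppGpp.

    Both arguments are assumed to be unitless and in the same concentration units.
    """
    km_squared = km ** 2  # could be computed once and cached
    return ppgpp ** 2 / (km_squared + ppgpp ** 2)


half = fraction_bound(50.0, 50.0)  # at ppGpp == KM, half of RNAP is bound
```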
- adjust_polymerizing_ppgpp_expression(sim_data)[source]
Adjust ppGpp expression based on fit for ribosome and RNAP physiological constraints using least squares fit for 3 conditions with different growth rates/ppGpp.
- Modifies attributes:
- exp_ppgpp (ndarray[float]): expression for each gene when RNAP is bound to ppGpp, adjusted for necessary RNAP and ribosome expression, normalized to 1
- exp_free (ndarray[float]): expression for each gene when RNAP is not bound to ppGpp, adjusted for necessary RNAP and ribosome expression, normalized to 1
Note
See docs/processes/transcription_regulation.pdf for a description of the math used in this section.
- adjust_ppgpp_expression_for_tfs(sim_data)[source]
Adjusts ppGpp-regulated expression so that expression with and without ppGpp regulation matches in the basal condition, taking into account the effect that transcription factors will have.
- charging_stoich_matrix()[source]
Creates the stoichiometric matrix from the i, j, v arrays.
Returns a 2D array with metabolites on the rows and tRNA charging reactions on the columns.
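Building a dense matrix from i, j, v (row index, column index, value) triplets can be sketched as below. The function name and the toy reaction data are hypothetical, and inferring the shape from the largest index is an assumption.

```python
import numpy as np


def stoich_matrix_from_triplets(i, j, v):
    """Fill a dense 2D array so that matrix[i[k], j[k]] == v[k] for every k."""
    i = np.asarray(i)
    j = np.asarray(j)
    # Assumption: the largest index in each array determines the shape.
    matrix = np.zeros((i.max() + 1, j.max() + 1), dtype=np.float64)
    # Note: repeated (i, j) pairs overwrite each other rather than summing.
    matrix[i, j] = v
    return matrix


# Toy example: 3 metabolites (rows) and 2 charging reactions (columns).
rows = [0, 1, 2, 0, 2]
cols = [0, 0, 0, 1, 1]
vals = [-1, -1, 1, -1, 1]
stoich = stoich_matrix_from_triplets(rows, cols, vals)
```

Negative entries mark consumed metabolites and positive entries mark produced ones in this toy example; entries not listed in the triplets stay at zero.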
- cistron_id_to_rna_indexes(cistron_id)[source]
Returns the indexes of the transcription units that contain the RNA cistron with the given ID.
- expression_from_ppgpp(ppgpp)[source]
Calculates the expression of each gene at a given concentration of ppGpp.
- Parameters:
ppgpp (float with or without mol / volume units) – concentration of ppGpp; if unitless, the value should be a concentration expressed in PPGPP_CONC_UNITS
- Returns:
normalized expression for each gene
- Return type:
ndarray[float]
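One plausible reading of this calculation is a weighted mix of the ppGpp-bound and free expression vectors, renormalized to sum to 1. The sketch below assumes that form; the function name, its arguments, and the numbers are illustrative and not taken from the class.

```python
import numpy as np


def mix_expression(exp_free, exp_ppgpp, fraction_bound):
    """Blend free and ppGpp-bound expression by the bound fraction, then normalize.

    Assumption: expression is a linear combination weighted by the fraction of
    RNAP bound to ppGpp.
    """
    mixed = fraction_bound * exp_ppgpp + (1 - fraction_bound) * exp_free
    return mixed / mixed.sum()


# Hypothetical two-gene example.
expression = mix_expression(
    exp_free=np.array([0.5, 0.5]),
    exp_ppgpp=np.array([0.2, 0.8]),
    fraction_bound=0.5,
)
```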
- fit_rna_expression(cistron_expression)[source]
Calculates the expression of RNA transcription units that best fits the given expression levels of cistrons using nonnegative least squares.
- fit_trna_expression(tRNA_cistron_expression)[source]
Calculates the expression of tRNA transcription units that best fits the given expression levels of tRNA cistrons using nonnegative least squares.
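A nonnegative least squares fit of transcription unit expression to cistron expression can be sketched with scipy.optimize.nnls. The mapping matrix, the expression values, and the variable names below are hypothetical; how the class builds its own mapping is not shown here.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical mapping: rows are cistrons, columns are transcription units (TUs).
# An entry of 1 means the TU in that column contains the cistron in that row.
cistron_tu_map = np.array([
    [1.0, 0.0],  # cistron 0 is only in TU 0
    [1.0, 1.0],  # cistron 1 is in both TUs
    [0.0, 1.0],  # cistron 2 is only in TU 1
])

# Hypothetical target expression for each cistron.
cistron_expression = np.array([2.0, 5.0, 3.0])

# Solve min ||map @ x - cistron_expression|| subject to x >= 0.
tu_expression, residual_norm = nnls(cistron_tu_map, cistron_expression)
```

The nonnegativity constraint matters because an unconstrained least squares fit could assign negative expression to a transcription unit, which has no physical meaning.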
- fraction_rnap_bound_ppgpp(ppgpp)[source]
Calculates the fraction of RNAP expected to be bound to ppGpp at a given concentration of ppGpp.
- Parameters:
ppgpp (float with or without mol / volume units) – concentration of ppGpp; if unitless, the value should be a concentration expressed in PPGPP_CONC_UNITS
- Returns:
fraction of RNAP that will be bound to ppGpp
- Return type:
- get_attenuation_stop_probabilities(trna_conc)[source]
Calculate the probability of a transcript stopping early due to attenuation.
- get_rna_fractions(ppgpp)[source]
Calculates expected RNA subgroup mass fractions based on ppGpp concentration. If ppGpp expression has not been set yet, uses default measured fractions.
- rna_id_to_cistron_indexes(rna_id)[source]
Returns the indexes of the cistrons that constitute the RNA transcription unit with the given ID.
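Lookups in both directions between cistrons and transcription units can be illustrated with a boolean mapping matrix. Everything in this sketch (the IDs, the matrix, and the helper names) is hypothetical and only shows the kind of index arrays such lookups return.

```python
import numpy as np

# Hypothetical IDs and mapping; rows are cistrons, columns are transcription units.
cistron_ids = ["cistronA", "cistronB", "cistronC"]
rna_ids = ["TU1", "TU2"]
cistron_tu_map = np.array([
    [True, False],
    [True, True],
    [False, True],
])


def tus_containing_cistron(cistron_id):
    """Indexes of the transcription units whose column is set for this cistron."""
    return np.where(cistron_tu_map[cistron_ids.index(cistron_id), :])[0]


def cistrons_in_tu(rna_id):
    """Indexes of the cistrons whose row is set for this transcription unit."""
    return np.where(cistron_tu_map[:, rna_ids.index(rna_id)])[0]
```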
- set_ppgpp_expression(sim_data)[source]
Called during the parca to determine expression of each transcription unit for ppGpp bound and free RNAP.
- Attributes set:
- exp_ppgpp (ndarray[float]): expression for each TU when RNAP is bound to ppGpp
- exp_free (ndarray[float]): expression for each TU when RNAP is not bound to ppGpp
- synth_prob_from_ppgpp(ppgpp, copy_number, balanced_rRNA_prob=True)[source]
Calculates the synthesis probability of each gene at a given concentration of ppGpp.
- Parameters:
ppgpp (float with mol / volume units) – concentration of ppGpp
copy_number (Callable[float, int]) – function that gives the expected copy number given a doubling time and gene replication coordinate
balanced_rRNA_prob (bool) – if True, set synthesis probabilities of rRNA promoters equal to one another
- Returns:
prob (ndarray[float]): normalized synthesis probability for each gene
factor (ndarray[float]): factor to adjust expression to probability for each gene
Note
copy_number should be sim_data.process.replication.get_average_copy_number but saving the function handle as a class attribute prevents pickling of sim_data without additional handling