runscripts.analysis
- runscripts.analysis.ANALYSIS_TYPES = {'multidaughter': ['experiment_id', 'variant', 'lineage_seed', 'generation'], 'multiexperiment': [], 'multigeneration': ['experiment_id', 'variant', 'lineage_seed'], 'multiseed': ['experiment_id', 'variant'], 'multivariant': ['experiment_id'], 'parca': [], 'single': ['experiment_id', 'variant', 'lineage_seed', 'generation', 'agent_id']}
Mapping of all possible analysis types to the combination of identifiers that must be unique for each subset of the data given to that analysis type as input.
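One way to read this mapping is that the listed identifiers act like group-by columns: rows that agree on all of them belong to the same input subset. The sketch below illustrates that reading with plain dictionaries. `group_rows` and the sample rows are hypothetical and are not part of `runscripts.analysis`; only the two `ANALYSIS_TYPES` entries are copied from the value documented above.

```python
from collections import defaultdict

# Two entries copied from the documented value of ANALYSIS_TYPES.
ANALYSIS_TYPES = {
    "multiseed": ["experiment_id", "variant"],
    "multivariant": ["experiment_id"],
}


def group_rows(rows, analysis_type):
    """Hypothetical helper: bucket rows by the identifiers of one analysis type."""
    keys = ANALYSIS_TYPES[analysis_type]
    groups = defaultdict(list)
    for row in rows:
        # Rows sharing the same identifier values land in the same subset.
        groups[tuple(row[k] for k in keys)].append(row)
    return dict(groups)


# Made-up rows for illustration only.
rows = [
    {"experiment_id": "exp", "variant": 0, "lineage_seed": 0},
    {"experiment_id": "exp", "variant": 0, "lineage_seed": 1},
    {"experiment_id": "exp", "variant": 1, "lineage_seed": 0},
]
multiseed_groups = group_rows(rows, "multiseed")
multivariant_groups = group_rows(rows, "multivariant")
```

Under this reading, a multiseed analysis receives one subset per (experiment_id, variant) pair, while a multivariant analysis receives one subset per experiment.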
- runscripts.analysis.FILTERS = {'agent_id': <class 'str'>, 'experiment_id': <class 'str'>, 'generation': <class 'int'>, 'lineage_seed': <class 'int'>, 'variant': <class 'int'>}
Mapping of each data filter to the data type of its values.
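A mapping like this can be used to cast raw filter values (for example, strings read from a command line) to the expected type. The sketch below shows that idea only. `coerce_filter_values` is a hypothetical helper, not a function in `runscripts.analysis`; the `FILTERS` value is copied from the entry above.

```python
# Copied from the documented value of FILTERS.
FILTERS = {
    "agent_id": str,
    "experiment_id": str,
    "generation": int,
    "lineage_seed": int,
    "variant": int,
}


def coerce_filter_values(name, raw_values):
    """Hypothetical helper: cast raw values to the type FILTERS declares."""
    return [FILTERS[name](value) for value in raw_values]


generations = coerce_filter_values("generation", ["1", "2"])
# agent_id stays a string, so leading zeros are preserved.
agent_ids = coerce_filter_values("agent_id", ["0", "01"])
```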
- runscripts.analysis.build_duckdb_filter(config)
Build a DuckDB WHERE clause from config filters.
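This reference does not spell out which config keys are read or how the clause is formatted, so the following is only a sketch of the general technique. It assumes that `config` maps each filter name to a list of allowed values and that the conditions are joined with `AND`; `sketch_where_clause` is a hypothetical stand-in, not the real function.

```python
# Copied from the documented value of FILTERS.
FILTERS = {
    "agent_id": str,
    "experiment_id": str,
    "generation": int,
    "lineage_seed": int,
    "variant": int,
}


def sketch_where_clause(config):
    """Hypothetical: turn per-filter value lists into 'col IN (...)' conditions."""
    clauses = []
    for name, value_type in FILTERS.items():
        values = config.get(name)
        if not values:
            continue  # Assumption: a missing or empty filter means no restriction.
        if value_type is str:
            # Double single quotes so each value stays one SQL string literal.
            rendered = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
        else:
            rendered = ", ".join(str(int(v)) for v in values)
        clauses.append(f"{name} IN ({rendered})")
    return " AND ".join(clauses)


where = sketch_where_clause({"experiment_id": ["exp_a"], "variant": [0, 1]})
no_filter = sketch_where_clause({})
```

Casting integer filters with `int` and quoting string filters keeps user-supplied values from changing the structure of the SQL text.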
- runscripts.analysis.build_query_strings(analysis_type, duckdb_filter, config_sql, history_sql, success_sql, outdir, conn)
Build query strings for a given analysis type.
- Parameters:
analysis_type (str) – Type of analysis (e.g., "multivariant", "single")
duckdb_filter (str) – DuckDB WHERE clause
config_sql (str) – SQL query for config data
history_sql (str) – SQL query for history data
success_sql (str) – SQL query for success data
outdir (str) – Output directory path
conn – DuckDB connection
- Returns:
Dictionary mapping filter strings to tuples of (history_query, config_query, success_query, output_dir, variant_set)
- Return type:
dict
- runscripts.analysis.filter_variant_dicts(variant_set, variant_metadata, sim_data_dict, variant_names)
Filter variant dictionaries to only include variants in the given set.
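The filtering idea can be sketched as follows, assuming the nested `{experiment_id: {variant_id: ...}}` layout documented for `parse_variant_data_dir`. `sketch_filter_variant_dicts` is hypothetical, and it leaves out `variant_names` because this reference does not say how that dictionary is filtered. The sample paths and metadata are made up.

```python
def sketch_filter_variant_dicts(variant_set, variant_metadata, sim_data_dict):
    """Hypothetical: keep only entries whose variant ID is in variant_set."""

    def keep(nested):
        return {
            exp_id: {v: item for v, item in per_variant.items() if v in variant_set}
            for exp_id, per_variant in nested.items()
        }

    return keep(variant_metadata), keep(sim_data_dict)


# Made-up inputs for illustration only.
variant_metadata = {"exp_a": {0: "baseline", 1: {"param": 2}, 2: {"param": 3}}}
sim_data_dict = {"exp_a": {0: "dir/0", 1: "dir/1", 2: "dir/2"}}
kept_metadata, kept_sim_data = sketch_filter_variant_dicts(
    {0, 2}, variant_metadata, sim_data_dict
)
```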
- runscripts.analysis.load_variant_metadata(config)
Load variant metadata from configured sources.
- Parameters:
config (dict) – Configuration dictionary
- Returns:
Tuple of (variant_metadata, sim_data_dict, variant_names)
- Raises:
KeyError – If experiment_id not in config
AssertionError – If multiple experiment IDs are configured without a proper variant_data_dir
- Return type:
tuple[dict[str, dict[int, Any]], dict[str, dict[int, str]], dict[str, str]]
- runscripts.analysis.parse_variant_data_dir(experiment_id, variant_data_dir)
For each experiment ID and corresponding variant sim data directory, load the variant metadata JSON and parse the variant sim data file names to construct mappings from experiments to variants to variant metadata and variant sim_data paths.
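- Parameters:
experiment_id – Experiment IDs
variant_data_dir – Variant sim data directories, one for each experiment ID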
- Returns:
Tuple containing three dictionaries:
( {experiment_id: {variant_id: variant_metadata, ...}, ...}, {experiment_id: {variant_id: variant_sim_data_path, ...}, ...}, {experiment_id: variant_name, ...} )
- Return type:
tuple[dict[str, dict[int, Any]], dict[str, dict[int, str]], dict[str, str]]
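To make the three returned mappings concrete, here is a single-experiment sketch that builds them from a temporary directory. The file layout is an assumption for illustration: a `metadata.json` shaped like `{variant_name: {variant_id: metadata}}` and one `<variant_id>.cPickle` file per variant. `sketch_parse_variant_dir` is hypothetical and may differ from the real parser.

```python
import json
import tempfile
from pathlib import Path


def sketch_parse_variant_dir(experiment_id, variant_data_dir):
    """Hypothetical single-experiment parser; the file layout is assumed."""
    directory = Path(variant_data_dir)
    # Assumed layout: {"<variant_name>": {"<variant_id>": <metadata>, ...}}
    raw = json.loads((directory / "metadata.json").read_text())
    variant_name, per_variant = next(iter(raw.items()))
    # JSON object keys are strings, so convert variant IDs back to int.
    metadata = {int(k): v for k, v in per_variant.items()}
    # Assumed naming: the file stem is the integer variant ID.
    sim_data = {int(p.stem): str(p) for p in sorted(directory.glob("*.cPickle"))}
    return (
        {experiment_id: metadata},
        {experiment_id: sim_data},
        {experiment_id: variant_name},
    )


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "metadata.json").write_text(
        json.dumps({"demo_variant": {"0": "baseline", "1": {"param": 2}}})
    )
    for variant_id in (0, 1):
        (Path(tmp) / f"{variant_id}.cPickle").write_bytes(b"")
    metadata, sim_data, names = sketch_parse_variant_dir("exp_a", tmp)
```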
- runscripts.analysis.run_analysis_loop(config, conn, history_sql, config_sql, success_sql, duckdb_filter, variant_metadata, sim_data_dict, variant_names)
Run the main analysis loop for all configured analysis types.
- Parameters:
config (dict) – Configuration dictionary with analysis_types and options
conn – DuckDB connection
history_sql (str) – SQL query for history data
config_sql (str) – SQL query for config data
success_sql (str) – SQL query for success data
duckdb_filter (str) – DuckDB WHERE clause for filtering data
variant_metadata (dict) – Variant metadata dictionary
sim_data_dict (dict) – Sim data dictionary
variant_names (dict) – Variant names dictionary
- Returns:
Dictionary with statistics about analyses run: {"total_runs": N, "skipped": M, "errors": K}
- Return type:
dict
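The returned statistics suggest a loop that tallies outcomes as it goes. The sketch below shows one way such a tally could work. It assumes that skipped analyses are not counted in `total_runs` and that a failed analysis counts as both a run and an error; neither detail is confirmed by this reference. `sketch_run_loop` and its `tasks` argument are hypothetical.

```python
def sketch_run_loop(tasks):
    """Hypothetical loop: tasks is a list of (should_skip, callable) pairs."""
    stats = {"total_runs": 0, "skipped": 0, "errors": 0}
    for should_skip, run in tasks:
        if should_skip:
            stats["skipped"] += 1
            continue
        stats["total_runs"] += 1
        try:
            run()
        except Exception:
            # Record the failure and continue with the remaining analyses.
            stats["errors"] += 1
    return stats


def failing_analysis():
    raise ValueError("simulated analysis failure")


stats = sketch_run_loop(
    [(False, lambda: None), (True, lambda: None), (False, failing_analysis)]
)
```

Catching exceptions per analysis lets one failing analysis be reported without stopping the others.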