klamt-lab/Modeling_Awoodii

Code for Modeling and Thermodynamic Analysis of Acetobacterium woodii

This repository hosts thermodynamics-incorporating models of the acetogenic organism Acetobacterium woodii. It combines curated SBML models, concentration and ΔG° datasets, thermodynamic validation workflows, and tooling to run large simulation campaigns.

This work was implemented by Jasmin Bauer (ORCID: 0009-0004-2014-7293; MPI for Dynamics of Complex Technical Systems), with support from Axel von Kamp (MPI for Dynamics of Complex Technical Systems) for the model implementation part.

The code is also available on Zenodo: DOI


1. Quick Start (Installation)

This project uses uv for fast, reproducible Python environment management.

Run these commands from the project root directory:

# 1. Create virtual environment (Python 3.10)
uv venv acemod -p 3.10

# 2. Activate environment
source acemod/bin/activate

# 3. Install dependencies
uv pip install -r pyproject.toml

2. Manual Step: CPLEX Installation

CPLEX is proprietary and cannot be installed automatically from PyPI. You must install your local copy into the environment. The version used in this study is 22.1.1.0.

With the acemod environment active:

# Adjust path to match your specific version/location
uv pip install <Path to cplex setup.py (e.g. ~/cplex/python/3.10/x86-64_linux/)> 

3. Manual Step: Gurobi Installation

gurobipy is installed automatically when the virtual environment is set up. The only remaining step is to obtain a Gurobi license, which can be acquired from the official Gurobi website.

The license file can be saved in your home directory.
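
Before launching a campaign, it can help to confirm that both solver bindings are importable from the active acemod environment. The following is a minimal sketch using only the standard library; the helper name `check_solvers` is hypothetical and not part of this repository. It only checks that the `cplex` and `gurobipy` packages can be found, not that a valid license is present.

```python
from importlib.util import find_spec


def check_solvers(modules=("cplex", "gurobipy")):
    """Return {module_name: bool} telling whether each module can be located."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one status line per solver binding.
    for name, available in check_solvers().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

If one of the entries is reported as missing, repeat the corresponding manual step with the environment activated.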


Repository Layout at a Glance

| Path | Purpose |
| --- | --- |
| Models/ | Curated SBML models (wildtype, TCOSA, preliminary models). Also contains the Maps folder. |
| Data/ | Concentration bounds, ΔG° datasets, cofactor/central reaction lists. |
| config/ | YAML component library (models, environments, analyses, campaigns). |
| functions/ | Python package with core solver, experiment, and campaign logic. |
| standalone_scripts/ | CLI utilities for running campaigns, plotting, diagnostics. |
| Results/ | Output directory for campaigns and analyses. |
| Scenarios/ | Output directory for stoichiometric validation results (in CNApy scenario file format). |
| Notes_Files/ | Supplementary Excel files. |
| ModelSEEDDatabase/Biochemistry/ | Files needed for model implementation, downloaded from the ModelSEEDDatabase GitHub repository. |
| ComputerClusterFiles/ | Scripts used for server calculations. |

Model Assets

Wildtype & Variants

| File | Notes |
| --- | --- |
| ECC2.sbml | E. coli model from [1] |
| AW_ext.xml | A. woodii model from [2] (not charge balanced) |
| Cautoethanogenum_V2.xml.gz | C. autoethanogenum model from [3] |
| Cautoethanogenum_V2_BiGG.xml.gz | C. autoethanogenum model with BiGG IDs |
| A_woodii.xml.gz | A. woodii base model (with heterologous pathways) |
| A_woodii_cleaned.xml.gz | Irreversible A. woodii model |
| A_woodii_basic.xml.gz | A. woodii base model (without heterologous pathways) |
| A_woodii_TCOSA.xml.gz | A. woodii TCOSA base model |
| A_woodii_TCOSA_cleaned.xml.gz | Irreversible A. woodii TCOSA model |

The remaining models are preliminary versions produced while generating the final A_woodii.xml.gz.

Under ./Models/Maps:

| File | Notes |
| --- | --- |
| MapAwoodii.json | A. woodii central metabolism for Escher Map in CNApy |
| MapAwoodii.svg | A. woodii central metabolism for standard Map in CNApy |
| MapAwoodii_basic.json | A. woodii central metabolism (no heterologous pathways) for Escher Map in CNApy |
| MapAwoodii.svg | A. woodii central metabolism (no heterologous pathways) for standard Map in CNApy |

CNApy projects

| File | Notes |
| --- | --- |
| Awoodii.cna | A. woodii CNApy project (with heterologous pathways) |
| Awoodii_basic.cna | A. woodii CNApy project (without heterologous pathways) |
| Awoodii_TCOSA.cna | A. woodii TCOSA CNApy project |

Generation Pipeline (Notebooks)

  1. Draft to cleaned wildtype

    • ModelImplementation_1_a_woodii_from_ECC2.ipynb → ECC2-based draft.
    • ModelImplementation_2_change_to_seed_formulas_charges.ipynb → updates formulas & charges from ModelSEED.
    • ModelImplementation_3_make_c_auto_BiGG.ipynb → ensures BiGG compatibility.
    • ModelImplementation_4_change_to_c_auto_biomass.ipynb → adapts biomass reaction.
    • ModelImplementation_5_make_irreversible_and_investigate.ipynb → adds InChI keys, builds irreversible cleaned model, exports central/cofactor reaction sets.
  2. TCOSA model generation and thermodynamic tuning

    • ModelImplementation_6_generate_TCOSA_models.ipynb → generates the TCOSA model and creates JSON lists of cofactor reaction IDs.
    • ModelImplementation_7_thermodynamic_bottlenecks.ipynb → thermodynamic bottleneck detection/relief.
  3. Stoichiometric validation

    • ModelValidation_stoichiometry.ipynb → stoichiometric consistency checks and flux sanity tests.
    • ./Scenarios stores the CNApy scenario files for the phenotypes checked in this notebook.

Data Assets

All files under ./Data/

Concentration files

| File | Notes |
| --- | --- |
| concentrations_awoodii_h2_co2_narrow.json | Tight concentration ranges for CO2 and H2. Only used for the thermodynamic bottleneck calculations in the notebook ModelImplementation_7_thermodynamic_bottlenecks.ipynb. A. woodii should be able to reach a positive MDF up to these concentration limits. |
| concentrations_awoodii_h2_co2_medium.json | Bounds used for metabolic engineering optimizations. |
| concentrations_awoodii_h2_co2_wide.json | Bounds used for MDFScan calculations. |
| config.yml | Config for thermodynamic bottleneck analyses in ModelImplementation_7_thermodynamic_bottlenecks.ipynb |

dG0 related files

| File | Notes |
| --- | --- |
| wildtype_ph7.0_pot-0.15_dG0.json | 'Raw' dG0 values for A. woodii from eQuilibrator (before bottleneck analysis) |
| wildtype_tcosa_ph7.0_pot-0.15_dG0.json | 'Raw' dG0 values for A. woodii TCOSA from eQuilibrator (before bottleneck analysis) |
| dG0_wildtype.json | Final dG0 values for A. woodii from eQuilibrator (after bottleneck analysis) |
| dG0_wildtype_tcosa.json | Final dG0 values for A. woodii TCOSA from eQuilibrator (after bottleneck analysis) |
| Report_dG0_wildtype.xlsx | Excel report of dG0 for A. woodii |
| Report_dG0_tcosa.xlsx | Excel report of dG0 for A. woodii TCOSA |
| Report_dG0.xlsx | Combined report of wildtype and TCOSA |

Other files

| File | Notes |
| --- | --- |
| awoodii_cofactor_reaction_IDs.json | Cofactor reaction base IDs |
| awoodii_cofactor_reactions_tcosa_model.json | Cofactor reactions of the TCOSA model |
| tcosa_variant_reactions_multiple_cofactors.json | Cofactor reactions that need special treatment in the ModelImplementation_6_... notebook, where the TCOSA model is created. |

Running Campaigns/Analyses

Configuration System

This framework uses a Modular Configuration System. You do not define settings in Python scripts; instead, you define Components in YAML files located in the config/ directory.

The YAML library under config/ follows a component pattern:

  • models.yml — SBML file references, model adjustments, subnetwork definitions.
  • environments.yml — Substrate/product bounds, concentration files, scenario tweaks.
  • analyses.yml — Experiment blocks (MDF scans, characterizations, phenotypes, parameter scans).
  • main.yml — Campaign definitions combining components.

functions/utilities.py::load_settings_components loads all YAML anchors/aliases; assemble_settings merges component lists into a run-ready dict.

A "Task" is defined by a list of component names, e.g., ["MyModel", "MyEnvironment", "MyAnalysis"].
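
To illustrate the idea of assembling a task from named components, here is a minimal sketch using plain dictionaries. It is not the repository's `assemble_settings` implementation: the component contents, the key names, and the `deep_merge` and `assemble` helpers are hypothetical, and the rule that later components override earlier ones is an assumption made for the example.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def assemble(component_library, component_names):
    """Merge the named components, in order, into one settings dict."""
    settings = {}
    for name in component_names:
        if name not in component_library:
            raise KeyError(f"Unknown component: {name}")
        settings = deep_merge(settings, component_library[name])
    return settings


# Hypothetical component library, standing in for the parsed YAML files.
library = {
    "MyModel": {"model": {"sbml_path": "Models/A_woodii_cleaned.xml.gz"}},
    "MyEnvironment": {
        "environment": {
            "concentration_file": "Data/concentrations_awoodii_h2_co2_wide.json"
        }
    },
    "MyAnalysis": {"analysis": {"type": "MDFScan"}, "model": {"irreversible": True}},
}

task = ["MyModel", "MyEnvironment", "MyAnalysis"]
settings = assemble(library, task)
print(settings)
```

Keeping components small and merging them by name means that the same model or environment block can be reused across several campaigns without copying settings.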

A. Metabolite Scan Experiment (2D Landscape)

Purpose: Generates a heatmap showing how an objective (e.g., MDF or Growth) changes as two metabolite concentrations vary.

  1. Setup Config:

    • In config/main.yml the campaign Biomass_Landscape_CO2_vs_H2_Parallel is defined. The two tasks are run separately because a single task already requires 10000 fragments to be generated and computed.
    • config/analyses.yml contains a MetaboliteScan block called Biomass_Landscape_CO2_vs_H2_Parallel with the defining metabolite_x and metabolite_y ranges and other settings.
  2. Run on server

  3. Output: For parallel execution, every single run is written as a separate output. See Results/MetaboliteScan_Biomass_WideBounds_Parallel/<Task folder>/fragments.

  4. Visualization: standalone_scripts/run_plotting_metabolite_scan_batch.py then generates a 2D heatmap (HTML) showing the feasibility regions, together with an aggregated CSV of all grid points. The plots are stored in the Results/MetaboliteScan_Biomass_WideBounds_Parallel folder.
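
As an illustration of why a single task can produce 10000 fragments, the sketch below builds a two-dimensional grid of log-spaced concentrations and treats each grid point as one independent fragment. The 100 x 100 resolution, the concentration ranges, and the function names are assumptions chosen for the example; the actual values are set in the MetaboliteScan block in config/analyses.yml.

```python
import math
from itertools import product


def log_spaced(low, high, n):
    """Return n values from low to high (inclusive), evenly spaced on a log10 scale."""
    if n < 2:
        return [low]
    step = (math.log10(high) - math.log10(low)) / (n - 1)
    return [10 ** (math.log10(low) + i * step) for i in range(n)]


def build_fragments(x_values, y_values):
    """Pair every x value with every y value; each pair is one independent run."""
    return [
        {"fragment_id": idx, "metabolite_x": x, "metabolite_y": y}
        for idx, (x, y) in enumerate(product(x_values, y_values))
    ]


# Hypothetical ranges in mol/L; not taken from the repository's config.
co2_values = log_spaced(1e-6, 1e-1, 100)
h2_values = log_spaced(1e-8, 1e-3, 100)
fragments = build_fragments(co2_values, h2_values)
print(len(fragments))  # number of grid points = len(co2_values) * len(h2_values)
```

Because the grid points do not depend on each other, they can be distributed freely across cluster jobs and aggregated afterwards.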

B. MDF Scan Experiment

Purpose: Systematically varies a constraint (like ATP Maintenance or Biomass) to find the maximum thermodynamic driving force (MDF) at each point.

  1. Setup Config:
    • In config/main.yml the campaign Growth_Scans_All_Models_WideBounds_CofactorRatios is defined
    • config/analyses.yml contains the MDFScan blocks MDF_Scan_Biomass_CofactorRatios and MDF_Scan_Biomass_CofactorRatios_Extended with their settings.
  2. Run on server
  3. Output: Results are stored in Results/Growth_Scans_All_Models_WideBounds_CofactorRatios/.
  4. Visualization: The plots are then generated in Plots_MDFScans.ipynb

It is possible to run selected scenarios through standalone_scripts/run_selected_mdf_scan_fractions_and_diagnostics.py or the full campaign through standalone_scripts/run_campaign.py.
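
For orientation, the driving force of a single reaction is the negative of its Gibbs energy change, ΔrG' = ΔrG'° + RT · ln Q, and the MDF is the largest value that the smallest driving force in the network can reach when concentrations are free to vary within their bounds. The sketch below only evaluates the smallest driving force for one fixed set of concentrations; it does not perform the optimization and is not the code used in this repository. The toy pathway, the ΔG'° values, the concentrations, and the temperature are invented for illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15          # temperature in K (assumed for this example)


def reaction_dg(dg0, stoichiometry, concentrations):
    """dG' = dG'0 + R*T*sum(nu_i * ln(c_i)); nu_i < 0 for substrates, > 0 for products."""
    ln_q = sum(nu * math.log(concentrations[met]) for met, nu in stoichiometry.items())
    return dg0 + R * T * ln_q


def min_driving_force(reactions, concentrations):
    """Smallest driving force (-dG') over all reactions at the given concentrations."""
    return min(-reaction_dg(r["dg0"], r["stoich"], concentrations) for r in reactions)


# Toy two-step pathway A -> B -> C with invented dG'0 values (kJ/mol).
reactions = [
    {"dg0": -5.0, "stoich": {"A": -1, "B": 1}},
    {"dg0": 2.0, "stoich": {"B": -1, "C": 1}},
]
concentrations = {"A": 1e-3, "B": 1e-4, "C": 1e-5}  # mol/L, invented

print(round(min_driving_force(reactions, concentrations), 2))
```

In this toy case the second step limits the pathway even though its ΔG'° is positive, because the tenfold concentration drop from B to C makes its actual ΔrG' negative. An MDF scan repeats the full optimization while a constraint such as the biomass flux is varied step by step.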

C. Characterization Experiment

Purpose: Runs a fixed sequence of optimizations (e.g., Max Product, Max MDF, Min Acetate) to profile a strain's capabilities with different substrate/product combinations.

  1. Setup Config:
    • In config/main.yml the campaigns Characterization_MetEng_swaps and Characterization_MetEng_swap1 are defined.
    • config/analyses.yml contains a Characterization block defining Selected_Characterization and Selected_Characterization_1swap.
  2. Run on server
  3. Output: Results are stored in Results/Characterization_MetEng_swaps/ and Results/Characterization_MetEng_swap1/
  4. Visualization: The plots are generated by standalone_scripts/run_reporting_meteng.py using the settings defined in reporting.yml.

It is possible to run selected scenarios through standalone_scripts/run_selected_characterization.py or the full campaign through standalone_scripts/run_campaign.py.


References

[1] EColiCore2: a reference network model of the central metabolism of Escherichia coli and relationships to its genome-scale parent model

[2] A quantitative metabolic analysis reveals Acetobacterium woodii as a flexible and robust host for formate-based bioproduction

[3] Maintenance of ATP Homeostasis Triggers Metabolic Shifts in Gas-Fermenting Acetogens

[4] eQuilibrator 3.0: a database solution for thermodynamic constant estimation

For further questions or contributions, please open an issue or submit a PR with your proposed changes. Happy modeling!
