Commit 064cc58

Introduce GTRX with thorough warnings; Introduce the MkA model
1 parent cbb9360 commit 064cc58

2 files changed

Lines changed: 28 additions & 13 deletions

doc/Command-Reference.md

Lines changed: 6 additions & 3 deletions
@@ -1,8 +1,8 @@
 ---
 layout: userdoc
 title: "Command Reference"
-author: Hector Banos, Diep Thi Hoang, Dominik Schrempf, Heiko Schmidt, Jana Trifinopoulos, Minh Bui, Thomas Wong, Nhan Ly-Trong
-date: 2025-03-28
+author: Hector Banos, Diep Thi Hoang, Dominik Schrempf, Heiko Schmidt, Jana Trifinopoulos, Minh Bui, Thomas Wong, Nhan Ly-Trong, Hiroaki Sato
+date: 2025-05-26
 docid: 19
 icon: book
 doctype: manual
@@ -295,7 +295,7 @@ The following `MODEL`s are available:
 | Protein | Mixture models: C10, ..., C60 (CAT model) ([Lartillot and Philippe, 2004]), EX2, EX3, EHO, UL2, UL3, EX_EHO, LG4M, LG4X, CF4. See [Protein models](Substitution-Models#protein-models) for more details. |
 | Codon | MG, MGK, MG1KTS, MG1KTV, MG2K, GY, GY1KTS, GY1KTV, GY2K, ECMK07/KOSI07, ECMrest, ECMS05/SCHN05 and combined empirical-mechanistic models. See [Codon models](Substitution-Models#codon-models) for more details. |
 | Binary | JC2, GTR2. See [Binary and morphological models](Substitution-Models#binary-and-morphological-models) for more details. |
-| Morphology | MK, ORDERED. See [Binary and morphological models](Substitution-Models#binary-and-morphological-models) for more details. |
+| Morphology | MK, (GTRX), ORDERED. WARNING: GTRX (which can also be invoked as GTR) can only be applied to data with non-arbitrary state labels (e.g., recoded amino acids [for practical application, see [Najle et al., 2023]; [xgrau/recoded-mixture-models]] and certain types of genomic information) and should **never** be used for general morphological characters (transformational morphological characters; for the term, see [Sereno, 2007]). See [Binary and morphological models](Substitution-Models#binary-and-morphological-models) for more details. |

 The following `FreqType`s are supported:

@@ -802,4 +802,7 @@ The first few lines of the output file example.phy.sitelh (printed by `-wslr` op
 [Strimmer and von Haeseler, 1997]: http://www.pnas.org/content/94/13/6815.long
 [Yang, 1994]: https://doi.org/10.1007/BF00160154
 [Yang, 1995]: http://www.genetics.org/content/139/2/993.abstract
+[Najle et al., 2023]: https://doi.org/10.1016/j.cell.2023.08.027
+[xgrau/recoded-mixture-models]: https://github.com/xgrau/recoded-mixture-models
+[Sereno, 2007]: https://doi.org/10.1111/j.1096-0031.2007.00161.x

doc/Substitution-Models.md

Lines changed: 22 additions & 10 deletions
@@ -1,8 +1,8 @@
 ---
 layout: userdoc
 title: "Substitution Models"
-author: Hector Banos, Cuong Cao Dang, Heiko Schmidt, Jana Trifinopoulos, Minh Bui, Nhan Ly-Trong
-date: 2024-05-14
+author: Hector Banos, Cuong Cao Dang, Heiko Schmidt, Jana Trifinopoulos, Minh Bui, Nhan Ly-Trong, Hiroaki Sato
+date: 2024-05-26
 docid: 10
 icon: book
 doctype: manual
@@ -360,16 +360,24 @@ Binary and morphological models

 The binary alignments should contain state `0` and `1`, whereas for morphological data, the valid states are `0` to `9` and `A` to `Z`.

-| Model | Explanation |
-|---------|------------------------------------------------------------------------|
-| JC2 | Jukes-Cantor type model for binary data.|
-| GTR2 | General time reversible model for binary data.|
-| MK | Jukes-Cantor type model for morphological data.|
-| ORDERED | Allowing exchange of neighboring states only.|
+| Model | Explanation |
+|------------|------------------------------------------------------------------------|
+| JC2 | Jukes-Cantor type model for binary data.|
+| GTR2 | General time reversible model for binary data.|
+| MK | Jukes-Cantor type model for morphological data with equal rates.|
+| GTRX (GTR) | General time reversible model for morphological (or rather, multistate; **see the warning below**) data with unequal rates.|
+| ORDERED | Allowing exchange of neighboring states only.|
+
+Except for `GTR2` that has unequal state frequencies, all other models have equal state frequencies. Users can change how state frequencies are modeled in morphological models by appending `+FQ`, `+F`, `+F{...}`, or `+FO`.
+
+> **WARNING**: Models with unequal rates and/or frequencies (e.g., `GTR2+FO`, `MK+FO`, `GTRX+FQ`, `GTRX+FO`) should **never** be applied to general morphological characters (transformational morphological characters; for the term, see [Sereno, 2007]) as their state labels are fundamentally arbitrary. These models are for data with non-arbitrary state labels (e.g., recoded amino acids [for practical application, see [Najle et al., 2023]; [xgrau/recoded-mixture-models]] and certain types of genomic information).

-Except for `GTR2` that has unequal state frequencies, all other models have equal state frequencies.
+> **WARNING**: If you use `GTRX` for your multistate data, because of its sometimes very great number of free parameters, please make sure your data are sufficiently large and always test for model fit.


+> **TIP**: For binary morphological characters where `0`s represent ancestral conditions and `1`s represent derived conditions, mainly neomorphic (`absent`/`present`) morphological characters (for the term, see [Sereno, 2007]), applying the `GTR2` model, with unequal state frequencies, would make sense (see e.g. [Pyron, 2017]; [Sun et al., 2018]; https://ms609.github.io/hyoliths/bayesian.html). This analytical condition is called the MkA model ([Pyron, 2017]).
+{: .tip}
+
 >**TIP**: If morphological alignments do not contain constant sites (typically the case), then [an ascertainment bias correction model (`+ASC`)](#ascertainment-bias-correction) should be applied to correct the branch lengths for the absence of constant sites.
 {: .tip}

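The `GTRX` warning added in this hunk mentions a "sometimes very great number of free parameters". As a rough illustration of how fast that number grows with the number of states, here is a minimal Python sketch. It assumes the textbook parameterization of a general time reversible model (one exchange rate per unordered pair of states, with one rate fixed to set the scale, plus `k - 1` free state frequencies when frequencies are estimated). The function name `gtr_free_parameters` is hypothetical, and the count IQ-TREE itself reports may differ.

```python
def gtr_free_parameters(num_states: int, estimate_frequencies: bool = False) -> int:
    """Count free parameters of a general time reversible model.

    Assumption (textbook parameterization, not taken from the IQ-TREE source):
    one exchange rate per unordered state pair, one of which is fixed to set
    the scale, plus num_states - 1 free frequencies if they are estimated.
    """
    if num_states < 2:
        raise ValueError("a substitution model needs at least two states")
    exchange_rates = num_states * (num_states - 1) // 2 - 1
    frequencies = num_states - 1 if estimate_frequencies else 0
    return exchange_rates + frequencies


# 36 is the largest alphabet the documentation allows (`0` to `9` plus `A` to `Z`).
for k in (2, 4, 6, 10, 36):
    equal_freq = gtr_free_parameters(k)
    free_freq = gtr_free_parameters(k, estimate_frequencies=True)
    print(f"{k:>2} states: {equal_freq:>3} rates, {free_freq:>3} with free frequencies")
```

Under these assumptions a 6-state recoded alignment already has 14 free exchange rates, and the full 36-state alphabet has 629, which is why the warning asks for sufficiently large data and a model-fit test.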
@@ -462,5 +470,9 @@ Users can fix the parameters of the model. For example, `+I{0.2}` will fix the p
 [Yang, 1995]: http://www.genetics.org/content/139/2/993.abstract
 [Yang et al., 1998]: http://mbe.oxfordjournals.org/content/15/12/1600.abstract
 [Zharkikh, 1994]: https://doi.org/10.1007/BF00160155
-
+[Sereno, 2007]: https://doi.org/10.1111/j.1096-0031.2007.00161.x
+[Pyron, 2017]: https://doi.org/10.1093/sysbio/syw068
+[Sun et al., 2018]: https://doi.org/10.1098/rspb.2018.1780
+[xgrau/recoded-mixture-models]: https://github.com/xgrau/recoded-mixture-models
+[Najle et al., 2023]: https://doi.org/10.1016/j.cell.2023.08.027
