Commit 189c8fc

add test of symmetry
1 parent e4e4724 commit 189c8fc

7 files changed

Lines changed: 212 additions & 27 deletions

File tree

_data/members.yml

Lines changed: 42 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -3,14 +3,42 @@
33

44
# Members information. Shown on About page.
55
members:
6-
- name: Olga Chernomor
6+
7+
- name: James Barbetti
78
position:
8-
text: Partition models and phylogenomic search.
9-
img: olga.jpg
10-
gscholar: https://scholar.google.com/citations?user=28f0gdQAAAAJ
9+
text: Software engineering for COVID-19 data
10+
img: james.jpg
11+
gscholar:
1112
social:
1213
- title: home #use for email address
13-
url: http://www.cibiv.at/people/olga/
14+
url: http://bqminh.github.io
15+
16+
- name: Thomas Wong
17+
position:
18+
text: ModelFinder 2
19+
img: thomas.jpg
20+
gscholar:
21+
social:
22+
- title: home #use for email address
23+
url: https://bqminh.github.io/people/wong/
24+
25+
- name: Michael Woodhams
26+
position:
27+
text: Lie Markov models.
28+
img: woodhams.jpg
29+
gscholar: https://scholar.google.com/citations?user=brh1wEkAAAAJ
30+
social:
31+
- title: home #use for email address
32+
url: http://www.utas.edu.au/profiles/staff/maths-physics/michael-woodhams
33+
34+
- name: Robert Lanfear
35+
position:
36+
text: Inspiring ideas and advice.
37+
img: rob.jpg
38+
gscholar: https://scholar.google.com/citations?user=Se6txrMAAAAJ
39+
social:
40+
- title: home #use for email address
41+
url: https://www.robertlanfear.com
1442

1543
- name: Bui Quang Minh
1644
position:
@@ -21,6 +49,15 @@ members:
2149
- title: home #use for email address
2250
url: https://bqminh.github.io
2351

52+
- name: Olga Chernomor
53+
position:
54+
text: Partition models and phylogenomic search.
55+
img: olga.jpg
56+
gscholar: https://scholar.google.com/citations?user=28f0gdQAAAAJ
57+
social:
58+
- title: home #use for email address
59+
url: http://www.cibiv.at/people/olga/
60+
2461
- name: Heiko A. Schmidt
2562
position:
2663
text: Integration of <a href="http://www.tree-puzzle.de">TREE-PUZZLE</a> features.
@@ -49,15 +86,6 @@ members:
4986
- title: envelope #use for email address
5087
url: http://www.cibiv.at/people/haeseler/
5188

52-
- name: Michael Woodhams
53-
position:
54-
text: Lie Markov models.
55-
img: woodhams.jpg
56-
gscholar: https://scholar.google.com/citations?user=brh1wEkAAAAJ
57-
social:
58-
- title: home #use for email address
59-
url: http://www.utas.edu.au/profiles/staff/maths-physics/michael-woodhams
60-
6189
- name: Diep Thi Hoang
6290
position:
6391
text: Improving ultrafast bootstrap.
@@ -67,15 +95,6 @@ members:
6795
- title: envelope #use for email address
6896
url: https://www.researchgate.net/profile/Diep_Hoang6
6997

70-
- name: Robert Lanfear
71-
position:
72-
text: Inspiring ideas and advice.
73-
img: rob.jpg
74-
gscholar: https://scholar.google.com/citations?user=Se6txrMAAAAJ
75-
social:
76-
- title: home #use for email address
77-
url: https://www.robertlanfear.com
78-
7998

8099
##### Past members ####
81100
past_members:
Lines changed: 166 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,166 @@
1+
---
2+
layout: userdoc
3+
title: "Assessing Phylogenetic Assumptions"
4+
author: M Bui
5+
date: 2021-03-11
6+
docid: 5
7+
icon: info-circle
8+
doctype: tutorial
9+
tags:
10+
- tutorial
11+
description: This guide is about evaluating the suitability of the data for phylogenetic analysis.
12+
---
13+
14+
It is important to know that phylogenetic models rely on various simplifying assumptions to
15+
ease computations. If your data severely violate these assumptions, it might
16+
cause bias in phylogenetic estimates of tree topologies and other model
17+
parameters. Some common assumptions include _treelikeness_ (all sites
18+
in the alignment have evolved under the same tree), _stationarity_ (nucleotide/amino-acid
19+
frequencies remain constant over time), _reversibility_ (substitutions are equally
20+
likely in both directions), and _homogeneity_ (substitution rates remain constant over time).
21+
22+
This document shows several ways to check some of these assumptions. You
23+
should perform these checks before doing phylogenetic analysis.
24+
25+
Likelihood mapping analysis
26+
---------------------------
27+
<div class="hline"></div>
28+
29+
Likelihood mapping ([Strimmer and von Haeseler, 1997]) is a visualisation method
30+
to display the phylogenetic information of an alignment. It visualises the _treelikeness_
31+
of all quartets in a single triangular graph and therefore renders a quick
32+
interpretation of the phylogenetic content.
33+
34+
A simple likelihood mapping analysis can be conducted with:
35+
36+
iqtree -s example.phy -lmap 2000 -n 0
37+
38+
where the `-lmap` option specifies the number of quartets of taxa that will be drawn randomly
39+
from the alignment. `-n 0` tells IQ-TREE to stop the analysis right after running the
40+
likelihood mapping. IQ-TREE will print the result in the `.iqtree` report file as well
41+
as the likelihood mapping plot `.lmap.svg` (in SVG format) and `.lmap.eps` file (in EPS
42+
figure format).
43+
44+
You can now view the likelihood mapping plot file `example.phy.lmap.svg`, which looks like this:
45+
46+
![Likelihood mapping plot.](images/example.phy.lmap.pdf)
47+
48+
It shows phylogenetic information of the alignment `example.phy`.
49+
50+
* Top sub-figure: distribution of quartets depicted by dots on the likelihood mapping plot.
51+
* Left sub-figure: percentages of quartets falling in each of the three areas. The
52+
three areas show support for one of the different groupings like (a,b)-(c,d).
53+
* Right sub-figure: percentages of quartets falling in each of the seven areas.
54+
Quartets falling into the three corners are informative and called fully-resolved quartets.
55+
Those in three rectangles are partly informative (partly resolved quartets) and those in the center are uninformative
56+
(unresolved quartets). A good data set should have a high number of fully resolved quartets
57+
and a low number of unresolved quartets.
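
The division into seven areas can be illustrated with a minimal sketch. It assumes that each quartet is summarised by three posterior weights, one per grouping, and that a quartet belongs to the area whose attractor point is nearest, as in the partition described by Strimmer and von Haeseler. The names `ATTRACTORS`, `classify_quartet` and `summarise` are hypothetical, and IQ-TREE's own implementation may differ in detail.

```python
import math
from collections import Counter

# Attractor points in weight space (p1, p2, p3), where by assumption
# p1 ~ (a,b)-(c,d), p2 ~ (a,c)-(b,d), p3 ~ (a,d)-(b,c).
# Area numbers follow the right-hand diagram in the report file.
ATTRACTORS = {
    1: (1.0, 0.0, 0.0),        # corner: (a,b)-(c,d)
    2: (0.0, 1.0, 0.0),        # corner: (a,c)-(b,d)
    3: (0.0, 0.0, 1.0),        # corner: (a,d)-(b,c)
    4: (0.5, 0.5, 0.0),        # between areas 1 and 2
    5: (0.0, 0.5, 0.5),        # between areas 2 and 3
    6: (0.5, 0.0, 0.5),        # between areas 1 and 3
    7: (1 / 3, 1 / 3, 1 / 3),  # centre
}


def classify_quartet(weights):
    """Return the area (1-7) whose attractor is nearest to the weights."""
    return min(ATTRACTORS, key=lambda area: math.dist(weights, ATTRACTORS[area]))


def summarise(quartets):
    """Percentages of fully resolved, partly resolved and unresolved quartets."""
    counts = Counter(classify_quartet(w) for w in quartets)
    total = len(quartets)
    return {
        "fully_resolved": 100.0 * sum(counts[a] for a in (1, 2, 3)) / total,
        "partly_resolved": 100.0 * sum(counts[a] for a in (4, 5, 6)) / total,
        "unresolved": 100.0 * counts[7] / total,
    }


# Made-up weights for four quartets, for illustration only.
example = [
    (0.90, 0.05, 0.05),
    (0.48, 0.48, 0.04),
    (0.34, 0.33, 0.33),
    (0.02, 0.03, 0.95),
]
print(summarise(example))
```

With these made-up weights, two quartets fall into corners, one into a rectangle and one into the centre.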
58+
59+
The meaning of these areas is also explained in the `LIKELIHOOD MAPPING STATISTICS` section of the report file `example.phy.iqtree`:
60+
61+
62+
LIKELIHOOD MAPPING STATISTICS
63+
-----------------------------
64+
65+
(a,b)-(c,d) (a,b)-(c,d)
66+
/\ /\
67+
/ \ / \
68+
/ \ / 1 \
69+
/ a1 \ / \ / \
70+
/\ /\ / \/ \
71+
/ \ / \ / /\ \
72+
/ \ / \ / 6 / \ 4 \
73+
/ \/ \ /\ / 7 \ /\
74+
/ | \ / \ /______\ / \
75+
/ a3 | a2 \ / 3 | 5 | 2 \
76+
/__________|_________\ /_____|________|_____\
77+
(a,d)-(b,c) (a,c)-(b,d) (a,d)-(b,c) (a,c)-(b,d)
78+
79+
Division of the likelihood mapping plots into 3 or 7 areas.
80+
On the left the areas show support for one of the different groupings
81+
like (a,b|c,d).
82+
On the right the right quartets falling into the areas 1, 2 and 3 are
83+
informative. Those in the rectangles 4, 5 and 6 are partly informative
84+
and those in the center (7) are not informative.
85+
.....
86+
87+
88+
The [command reference](Command-Reference#likelihood-mapping-analysis) describes
89+
more options and explains how to perform 2-, 3-, or 4-cluster likelihood mapping analysis.
90+
91+
92+
Tests of symmetry
93+
-----------------
94+
95+
IQ-TREE provides three matched-pairs tests of symmetry ([Naser-Khdour et al., 2019]) to
96+
test the three assumptions of stationarity, reversibility and homogeneity (SRH).
97+
A simple analysis:
98+
99+
iqtree2 -s example.phy -p example.nex --symtest-only
100+
101+
will perform the three tests of symmetry on every partition of the alignment
102+
and print the results to a `.symtest.csv` file. The `--symtest-only` option tells
103+
IQ-TREE to only perform the tests of symmetry and then exit.
104+
In this example the content of `example.nex.symtest.csv` looks like this:
105+
106+
```
107+
# Matched-pair tests of symmetry
108+
# This file can be read in MS Excel or in R with command:
109+
# dat=read.csv('example.nex.symtest.csv',comment.char='#')
110+
# Columns are comma-separated with following meanings:
111+
# Name: Partition name
112+
# SymSig: Number of significant sequence pairs by test of symmetry
113+
# SymNon: Number of non-significant sequence pairs by test of symmetry
114+
# SymPval: P-value for maximum test of symmetry
115+
# MarSig: Number of significant sequence pairs by test of marginal symmetry
116+
# MarNon: Number of non-significant sequence pairs by test of marginal symmetry
117+
# MarPval: P-value for maximum test of marginal symmetry
118+
# IntSig: Number of significant sequence pairs by test of internal symmetry
119+
# IntNon: Number of non-significant sequence pairs by test of internal symmetry
120+
# IntPval: P-value for maximum test of internal symmetry
121+
Name,SymSig,SymNon,SymPval,MarSig,MarNon,MarPval,IntSig,IntNon,IntPval
122+
part1,44,92,0.475639,50,86,0.722371,4,132,0.23869
123+
part2,43,93,0.142052,49,87,0.205232,5,131,0.169618
124+
part3,53,83,0.00499855,58,78,0.00164132,6,130,0.343127
125+
```
126+
127+
The three important columns are:
128+
129+
* SymPval: a small p-value (say < 0.05) indicates that the assumption of stationarity
130+
or homogeneity, or both, is rejected. In this case, partition `part3` does not comply with these
131+
two assumptions (p-value = 0.00499855), whereas the other two partitions are "good".
132+
* MarPval: a small p-value means that the assumption of stationarity is rejected. In
133+
this case, only partition `part3` does not comply with the stationarity assumption (p-value = 0.00164132).
134+
* IntPval: a small p-value means that the homogeneity assumption is rejected. In
135+
this case, no partitions are "bad" according to this test, i.e., they all comply with
136+
the homogeneity assumption.
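
As an illustration of how these columns might be screened outside IQ-TREE, here is a minimal Python sketch. The function name `flag_partitions` is hypothetical, and skipping non-numeric entries such as `NA` is an assumption of this sketch, not documented IQ-TREE behaviour. The sample text reuses the three rows shown above.

```python
import csv


def flag_partitions(csv_text, column="SymPval", cutoff=0.05):
    """Return names of partitions whose p-value in `column` is below `cutoff`."""
    # Drop blank lines and the '#' comment lines that precede the header row.
    data_lines = [
        line
        for line in csv_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    flagged = []
    for row in csv.DictReader(data_lines):
        try:
            pval = float(row[column])
        except ValueError:
            # Assumption: non-numeric entries (for example NA) are skipped.
            continue
        if pval < cutoff:
            flagged.append(row["Name"])
    return flagged


sample = """# Matched-pair tests of symmetry
Name,SymSig,SymNon,SymPval,MarSig,MarNon,MarPval,IntSig,IntNon,IntPval
part1,44,92,0.475639,50,86,0.722371,4,132,0.23869
part2,43,93,0.142052,49,87,0.205232,5,131,0.169618
part3,53,83,0.00499855,58,78,0.00164132,6,130,0.343127
"""

for col in ("SymPval", "MarPval", "IntPval"):
    print(col, flag_partitions(sample, column=col))
```

To screen a real result file, pass its contents instead of `sample`, for example `open('example.nex.symtest.csv').read()`.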
137+
138+
This small example shows that only `part3` is problematic: it violates the
139+
stationarity assumption.
140+
141+
Now you may want to perform the phylogenetic analysis excluding all "bad" partitions by:
142+
143+
iqtree2 -s example.phy -p example.nex --symtest-remove-bad
144+
145+
This will remove all "bad" partitions with SymPval < 0.05 and continue the analysis with the
146+
remaining "good" partitions. You may then compare the trees from "all" partitions
147+
and from "good" only partitions to see if there is a significant difference between them
148+
with [tree topology tests](Advanced-Tutorial#tree-topology-tests).
149+
150+
Other options can be seen when running `iqtree2 -h`:
151+
152+
```
153+
TEST OF SYMMETRY:
154+
--symtest Perform three tests of symmetry
155+
--symtest-only Do --symtest then exist
156+
--symtest-remove-bad Do --symtest and remove bad partitions
157+
--symtest-remove-good Do --symtest and remove good partitions
158+
--symtest-type MAR|INT Use MARginal/INTernal test when removing partitions
159+
--symtest-pval NUMER P-value cutoff (default: 0.05)
160+
--symtest-keep-zero Keep NAs in the tests
161+
```
162+
163+
164+
[Strimmer and von Haeseler, 1997]: http://www.pnas.org/content/94/13/6815.long
165+
[Naser-Khdour et al., 2019]: https://doi.org/10.1093/gbe/evz193
166+

doc/Command-Reference.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ layout: userdoc
33
title: "Command Reference"
44
author: Diep Thi Hoang, Dominik Schrempf, Heiko Schmidt, Jana Trifinopoulos, M Bui, Minh Bui
55
date: 2020-04-24
6-
docid: 7
6+
docid: 8
77
icon: book
88
doctype: tutorial
99
tags:

doc/Concordance-Factor.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ layout: userdoc
33
title: "Concordance Factor"
44
author: M Bui
55
date: 2020-05-08
6-
docid: 5
6+
docid: 6
77
icon: info-circle
88
doctype: tutorial
99
tags:

doc/Dating.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ layout: userdoc
33
title: "Phylogenetic Dating"
44
author: M Bui, Rob Lanfear
55
date: 2020-06-03
6-
docid: 6
6+
docid: 7
77
icon: info-circle
88
doctype: tutorial
99
tags:

doc/convertwiki.sh

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ cd $source_dir
1212

1313
file_changed=0
1414

15-
files="Front.md Home.md Quickstart.md Web-Server-Tutorial.md Tutorial.md Advanced-Tutorial.md Concordance-Factor.md Dating.md Command-Reference.md Substitution-Models.md Complex-Models.md Polymorphism-Aware-Models.md Compilation-Guide.md Frequently-Asked-Questions.md"
15+
files="Front.md Home.md Quickstart.md Web-Server-Tutorial.md Tutorial.md Advanced-Tutorial.md Assessing-Phylogenetic-Assumptions.md Concordance-Factor.md Dating.md Command-Reference.md Substitution-Models.md Complex-Models.md Polymorphism-Aware-Models.md Compilation-Guide.md Frequently-Asked-Questions.md"
1616

1717
for f in *.md workshop/*.md; do
1818
if [ "$f" == "_Footer.md" -o "$f" == "_Sidebar.md" ]; then

doc/iqtree-doc.pdf

14.7 KB
Binary file not shown.
