
Commit c504c50

Merge branch 'master' of https://github.com/iqtree/iqtree2.wiki
2 parents d1adc2e + 267213e commit c504c50

15 files changed

Lines changed: 717 additions & 134 deletions

doc/AliSim.md

Lines changed: 12 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -55,11 +55,20 @@ Sequence simulators play an important role in phylogenetics. Simulated data has
5555

5656
To use AliSim, please make sure that you download IQ-TREE version 2.2.0 or later.
5757

58-
If you use AliSim please cite the following paper(s):
58+
If you use AliSim please cite:
5959

60-
- Nhan Ly-Trong, Suha Naser-Khdour, Robert Lanfear, Bui Quang Minh, AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era, Molecular Biology and Evolution, Volume 39, Issue 5, May 2022, msac092, <https://doi.org/10.1093/molbev/msac092>
60+
- Nhan Ly-Trong, Giuseppe M.J. Barca, Bui Quang Minh (2023)
61+
AliSim-HPC: parallel sequence simulator for phylogenetics.
62+
_Bioinformatics_, Volume 39, Issue 9, btad540.
63+
<https://doi.org/10.1093/bioinformatics/btad540>
64+
65+
For the original algorithms of AliSim please cite:
66+
67+
- Nhan Ly-Trong, Suha Naser-Khdour, Robert Lanfear, Bui Quang Minh (2022)
68+
AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era.
69+
_Molecular Biology and Evolution_, Volume 39, Issue 5, msac092.
70+
<https://doi.org/10.1093/molbev/msac092>
6171

62-
- Nhan Ly-Trong, Giuseppe M.J. Barca, Bui Quang Minh, AliSim-HPC: parallel sequence simulator for phylogenetics, Bioinformatics, Volume 39, Issue 9, Sep 2023, btad540, <https://doi.org/10.1093/bioinformatics/btad540> (*for the parallel version*)
6372

6473

6574
Simulating an alignment from a tree and model

doc/Command-Reference.md

Lines changed: 47 additions & 5 deletions
@@ -28,6 +28,8 @@ sections:
2828
url: site-specific-frequency-model-options
2929
- name: Tree search parameters
3030
url: tree-search-parameters
31+
- name: Tree search for pathogen data
32+
url: tree-search-for-pathogen-data
3133
- name: Ultrafast bootstrap parameters
3234
url: ultrafast-bootstrap-parameters
3335
- name: Nonparametric bootstrap
@@ -328,17 +330,16 @@ Further options:
328330

329331
| Option | Usage and meaning |
330332
|----------|------------------------------------------------------------------------------|
331-
| `--link-exchange-rates` | Turn on linked exchangeability estimation for a profile mixture model. Note that the model must have specified `GTR20` exchangeabilities for eg.`GTR20+C20+G`. |
332-
| `--gtr20-model` | Specify the initial exchangeabilities for linked exchangeability estimation. Note that this must be used with `--link-exchange-rates.` |
333-
| `--rates-file` | Produces a nexus file with the exchangeability matrix obtained from the optimization. This file can be later used for phylogenetic inference with the use of the `-mdef` flag |
333+
| `--link-exchange` | Turn on linked exchangeability estimation for a profile mixture model. The model must specify `GTR20` exchangeabilities, e.g. `GTR20+C20+G`. This option also writes a NEXUS file `GTRPMIX.nex` containing the exchangeability matrix obtained from the optimization. This file can later be used for phylogenetic inference via the `-mdef` flag. |
334+
| `--init-exchange` | Specify the initial exchangeabilities for linked exchangeability estimation. Note that this must be used with `--link-exchange`. |
334335

335336
### Example usages:
336337

337338
* Estimate linked exchangeabilities for a protein alignment `prot.phy` under C60+G model and a guide tree `guide.treefile`, where optimization is initialized from LG exchangeabilities
338339

339-
iqtree -s prot.phy -m GTR20+C60+G --link-exchange-rates --gtr20-model LG -te guide.treefile
340+
iqtree -s prot.phy -m GTR20+C60+G --link-exchange --init-exchange LG -te guide.treefile
340341

341-
>**NOTE**: For better and faster performance, read the [recommendations](Complex-Models#linked-gtr-exchangeabilities-models) provided in the Complex Models section.
342+
>**NOTE**: For better and faster performance, read the [recommendations](Estimating-amino-acid-substitution-models#estimating-linked-exchangeabilities) provided in the Estimating amino acid substitution models section.
342343
343344

344345
Rate heterogeneity
@@ -432,6 +433,46 @@ The new IQ-TREE search algorithm ([Nguyen et al., 2015]) has several parameters
432433

433434
iqtree -s data.phy -m TEST -g constraint.tree
434435

436+
Tree search for pathogen data
437+
-----------------------------
438+
<div class="hline"></div>
439+
440+
For pathogen data such as SARS-CoV-2 virus alignments, version 2.3.4.cmaple implements
441+
the MAPLE algorithm ([De Maio et al., 2023]) that performs tree search very quickly by
442+
exploiting the low divergence of the sequences (i.e., sequences in the alignment
443+
are very similar to each other).
444+
445+
| Option | Usage and meaning |
446+
|----------|------------------------------------------------------------------------------|
447+
| `--pathogen` | Apply the CMAPLE tree search algorithm if sequence divergence is low; otherwise apply the IQ-TREE algorithm. |
448+
| `--pathogen-force` | Apply CMAPLE tree search algorithm regardless of sequence divergence. |
449+
| `-alrt` | Specify number of replicates (>=1000) to perform SH-like approximate likelihood ratio test (SH-aLRT) ([Guindon et al., 2010]). |
450+
| `-T` | Specify the number of CPU cores to use only for the SH-aLRT test. If `-T AUTO` is specified, IQ-TREE will use all available cores. NOTE: this option has no effect on tree search, which is still single-threaded. |
451+
452+
### Example usages:
453+
454+
* Infer a maximum-likelihood tree for an alignment, automatically switching to CMAPLE algorithm
455+
if sequence divergence is low:
456+
457+
iqtree2 -s data.phy --pathogen --prefix pathogen
458+
459+
It will print two output files:
460+
461+
* `pathogen.treefile`: The best approximate maximum-likelihood tree in NEWICK format.
462+
* `pathogen.log`: The log file.
463+
464+
465+
If you want to run further analyses on this tree without repeating the tree search,
466+
add `-te pathogen.treefile` to the command line of a subsequent IQ-TREE run to fix this tree topology
467+
and remove the `--pathogen` option to invoke the default IQ-TREE machinery.
468+
469+
* Infer a tree as above and additionally assign branch supports using the SH-aLRT test
470+
with 1000 replicates using 4 CPU cores:
471+
472+
iqtree2 -s data.phy --pathogen --alrt 1000 -T 4 --prefix pathogen
473+
474+
The tree `pathogen.treefile` will contain branch supports for all internal branches.
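
The follow-up run on a fixed topology described above could look like the sketch below. It is only an illustration: the helper name `reuse_pathogen_tree`, the `IQTREE` variable and the `followup` prefix are assumptions made for this example and are not part of IQ-TREE. Only the `-s`, `-te` and `--prefix` options are taken from the text above.

```shell
# Hypothetical helper (not part of IQ-TREE): rerun IQ-TREE on the topology
# produced by an earlier --pathogen run.
# IQTREE may be overridden to point at a different executable.
IQTREE="${IQTREE:-iqtree2}"

reuse_pathogen_tree() {
    # $1: alignment, $2: tree file from the --pathogen run, $3: output prefix
    # -te fixes the tree topology; --pathogen is deliberately omitted so that
    # the default IQ-TREE machinery is used.
    "$IQTREE" -s "$1" -te "$2" --prefix "$3"
}

# Example call ("followup" is an arbitrary prefix chosen for illustration):
# reuse_pathogen_tree data.phy pathogen.treefile followup
```

Using a different prefix for the second run keeps it from overwriting `pathogen.treefile` and `pathogen.log`.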
475+
435476
Ultrafast bootstrap parameters
436477
------------------------------
437478
<div class="hline"></div>
@@ -730,6 +771,7 @@ The first few lines of the output file example.phy.sitelh (printed by `-wslr` op
730771
[Adachi and Hasegawa, 1996b]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.476.8552
731772
[Anisimova and Gascuel 2006]: https://doi.org/10.1080/10635150600755453
732773
[Anisimova et al., 2011]: https://doi.org/10.1093/sysbio/syr041
774+
[De Maio et al., 2023]: https://doi.org/10.1038/s41588-023-01368-0
733775
[Felsenstein, 1985]: https://doi.org/10.2307/2408678
734776
[Flouri et al., 2015]: https://doi.org/10.1093/sysbio/syu084
735777
[Gadagkar et al., 2005]: https://doi.org/10.1002/jez.b.21026

doc/Compilation-Guide.md

Lines changed: 25 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -66,7 +66,7 @@ For IQ-TREE version 1 please use:
6666

6767
Alternatively, if you have `git` installed, you can also clone the source code from GitHub with:
6868

69-
git clone https://github.com/iqtree/iqtree2.git
69+
git clone --recursive https://github.com/iqtree/iqtree2.git
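
The `--recursive` flag presumably matters because the repository pulls in git submodules. For a checkout that was cloned earlier without it, the standard git command below usually fetches them afterwards. This is a minimal sketch: `init_submodules` is a hypothetical helper name, and the `iqtree2` directory in the example call is an assumption about where the clone lives.

```shell
# Hypothetical helper: fetch all submodules of an existing clone.
init_submodules() {
    # $1: path to the already cloned repository
    git -C "$1" submodule update --init --recursive
}

# Example call (assumes the clone lives in ./iqtree2):
# init_submodules iqtree2
```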
7070

7171
For IQ-TREE version 1 please clone:
7272

@@ -108,16 +108,15 @@ Compiling under Linux
108108

109109
This creates an executable `iqtree2` (`iqtree` for version 1). It can be copied to your system search path so that IQ-TREE can be called from the Terminal simply with the command line `iqtree2`.
110110

111+
To compile IQ-TREE under Linux with ARM processor, use either GCC 10 (but not above), or Clang 14 or above.
112+
111113
>**TIP**: The above guide typically compiles IQ-TREE with `gcc`. If you have Clang installed and want to compile with Clang, the compilation will be similar to Mac OS X like below.
112114
{: .tip}
113115

114116
Compiling under Mac OS X
115117
------------------------
116118
<div class="hline"></div>
117119

118-
>**TIP**: A ready made IQ-TREE package is provided by * [Homebrew](https://github.com/brewsci/homebrew-science/blob/master/Formula/iqtree.rb) by simply running `brew install homebrew/science/iqtree2`.
119-
{: .tip}
120-
121120
* Make sure that Clang compiler is installed, which is typically the case if you installed Xcode and the associated command line tools.
122121

123122
* If you installed cmake with Homebrew
@@ -130,13 +129,18 @@ The steps to compile IQ-TREE are similar to Linux (see above), except that you n
130129

131130
(please change `cmake` to absolute path like `/Applications/CMake.app/Contents/bin/cmake`).
132131

133-
To compile the multicore version, the default installed Clang unfortunately does not support OpenMP (which might change in the near future). However, the latest Clang 3.7 supports OpenMP, which can be downloaded from <http://clang.llvm.org>. After that you can run CMake with:
132+
* To compile IQ-TREE under Mac with ARM processor, use Clang 17 or above.
133+
134+
* If the OpenMP include or lib files cannot be found, then you can specify the location of OpenMP include or lib files, for example:
134135

135-
cmake -DIQTREE_FLAGS=omp -DCMAKE_C_COMPILER=clang-3.7 -DCMAKE_CXX_COMPILER=clang++-3.7 ..
136+
export LDFLAGS="-L/opt/homebrew/opt/libomp/lib"
136137

137-
(assuming that `clang-3.7` and `clang++-3.7` points to the installed Clang 3.7).
138+
export CPPFLAGS="-I/opt/homebrew/opt/libomp/include"
138139

140+
cmake -DCMAKE_CXX_FLAGS="$LDFLAGS $CPPFLAGS" -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ..
139141

142+
(please change the path to the installed location of your OpenMP library)
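
If libomp was installed with Homebrew, its location can be queried instead of hard-coded. The following is a minimal sketch under that assumption; `set_omp_flags` is a hypothetical helper and not part of the IQ-TREE build system.

```shell
# Hypothetical helper: export the linker and preprocessor flags for an
# OpenMP installation rooted at the given prefix.
set_omp_flags() {
    # $1: OpenMP install prefix
    export LDFLAGS="-L$1/lib"
    export CPPFLAGS="-I$1/include"
}

# Example call (assumes Homebrew is installed and provides libomp;
# `brew --prefix libomp` prints the install location):
# set_omp_flags "$(brew --prefix libomp)"
```

After the flags are exported, the `cmake` command shown above can be run unchanged.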
143+
140144
Compiling under Windows
141145
-----------------------
142146
<div class="hline"></div>
@@ -257,6 +261,19 @@ The compiled `iqtree` binary will automatically choose the proper computational
257261
IQ-TREE multicore Xeon Phi KNL version 1.6.beta for Linux 64-bit built May 7 2017
258262

259263

264+
Compiling IQ-TREE2 lib file
265+
---------------------------
266+
<div class="hline"></div>
267+
268+
Starting with version 2.3.3, you can compile IQ-TREE2 into a lib file.
269+
270+
If you want to compile the IQ-TREE2 lib file, simply run:
271+
272+
cmake -DBUILD_LIB=ON ..
273+
make -j4
274+
275+
276+
<!--
260277
Compiling with deep learning kernel for ModelFinder 2
261278
--------------------------------------------------
262279
@@ -280,7 +297,7 @@ where 1.11.0 is the version of onnxruntime at the time of writing this document.
280297
Now you will need to run cmake by additional options:
281298
282299
cmake -Donnxruntime_INCLUDE_DIRS=/usr/local/Cellar//onnxruntime/1.11.0/include/onnxruntime/core/session/ -Donnxruntime_LIBRARIES=/usr/local/Cellar//onnxruntime/1.11.0/lib/libonnxruntime.dylib ..
283-
300+
-->
284301

285302
About precompiled binaries
286303
--------------------------

doc/Complex-Models.md

Lines changed: 3 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,6 @@ sections:
1414
url: partition-models
1515
- name: Mixture models
1616
url: mixture-models
17-
- name: Linked GTR exchangeabilities models
18-
url: linked-gtr-exchangeabilities-models
1917
- name: Site-specific frequency models
2018
url: site-specific-frequency-models
2119
- name: Heterotachy models
@@ -177,14 +175,15 @@ Options for ModelFinder also work for MixtureFinder, e.g.:
177175
The `-mset HKY,GTR` option restricts the substitution model type to `HKY` and `GTR` in each iteration of adding one more class. The `-mrate E,I,G,I+G` option restricts the rate heterogeneity across sites models to `+E`, `+I`, `+G` and `+I+G`.
178176

179177
Other options for MixtureFinder:
178+
180179
| Model option | Description |
181180
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
182181
| `-qmax` | Maximum number of Q-mixture classes (default: 10). Specify a number after the option (e.g., `-qmax 5`). |
183182
| `-mrate-twice` | Whether to re-estimate the rate heterogeneity across sites model after selecting the best Q-mixture model. 1: yes, 0: no (default: 0). |
184183

185184
If you use MixtureFinder in a publication please cite:
186185

187-
> __H. Ren, T.K.F. Wong, B.Q. Minh, R. Lanfear__ (2024) MixtureFinder: Estimating DNA mixture models for phylogenetic analyses. _BioRxiv_. https://doi.org/10.1101/2024.03.20.586035
186+
> __H. Ren, T.K.F. Wong, B.Q. Minh, R. Lanfear__ (2024) MixtureFinder: Estimating DNA mixture models for phylogenetic analyses. _BioRxiv_. <https://doi.org/10.1101/2024.03.20.586035>
188187
189188

190189

@@ -204,36 +203,6 @@ Sometimes one only wants to model the changes in nucleotide or amino-acid freque
204203

205204
>**NOTE**: The amino-acid order in this file is: A R N D C Q E G H I L K M F P S T W Y V.
206205
207-
Linked GTR exchangeabilities models
208-
---------------------------------------
209-
<div class="hline"></div>
210-
211-
Starting with version 2.3.1, IQ-TREE allows the user to estimate exchangeabilities under profile mixture models.
212-
213-
### Exchangeability estimation
214-
215-
To start with, we show an example:
216-
217-
iqtree -s <alignment> -m GTR20+C60+G4 --link-exchange-rates -te <guide_tree> -me 0.99
218-
219-
In this example exchangeabilities will be estimated for a profile mixture model `C60+G4` but any profile mixture model and rates can be used. To estimate a single set of linked exchangeabilities, in the model definition the matrix `GTR20` must be specified (resp. GTR for nucleotide data) together with the flag `--link-exchange-rates`. While a guide tree is not needed, we highly recommend using a fixed tree topology to estimate exchangeabilities. Since matrix estimation can be time-consuming, we also recommend using the flag `-me 0.99` to reduce the optimization threshold for faster optimization. Simulations have shown that changing this parameter has no significant effect on exchangeability estimation.
220-
221-
The user can determine the starting exchangeabilities before optimization. Choosing adequate exchangeabilities can make estimation considerably faster. For example:
222-
223-
iqtree -s example.phy -m GTR20+C60+G4 --link-exchange-rates --gtr20-model LG -te <guide_tree> -me 0.99
224-
225-
specifies the LG matrix as the starting matrix via the flag `--gtr20-model` (the default starting matrix is POISSON, i.e. equal exchangeabilities). For this flag, the user can specify any matrix, even those matrices defined by the user via the `-mdef` flag. If the user is agnostic of the exchangeabilities, we recommend using the default matrix (although it can be time-consuming).
226-
227-
Note that the user can estimate exchangeabilities jointly with weights of the profiles, branch lengths, and rates. This can be very time-consuming. If the goal is to optimize exchange abilities, one can fix the other parameters to reasonable estimates (for eg. fixing branch lengths and rates has been shown to perform adequately for estimation of exchangeabilities)
228-
229-
There is an additional flag `--rates-file` that will produce a nexus file with the exchangeability matrix obtained from the optimization. This file can be later used for phylogenetic inference with the use of the `-mdef` flag.
230-
231-
232-
If you use this routine in a publication please cite:
233-
234-
> __H. Banos et al.__ (2024) Estimating Linked Exchangeabilities for Profile Mixture Models. _Bioraxiv.
235-
236-
237206
Here, the NEXUS file contains a `models` block to define new models. More explicitly, we define four AA profiles `Fclass1` to `Fclass4`, each containing 20 AA frequencies. Then, the frequency mixture is defined with
238207

239208
FMIX{empirical,Fclass1,Fclass2,Fclass3,Fclass4}
@@ -242,8 +211,7 @@ This means, we have five components: the first corresponds to empirical AA frequ
242211

243212
iqtree -s some_protein.aln -mdef mymodels.nex -m JTT+CF4model+G
244213

245-
The `-mdef` option specifies the NEXUS file containing user-defined models. Here, the `JTT` matrix is applied for all alignment sites and one varies the AA profiles along the alignment. One can use the NEXUS syntax to define all other profile mixture models such as `C10` to `C60`.
246-
214+
The `-mdef` option specifies the NEXUS file containing user-defined models (see below). Here, the `JTT` matrix is applied for all alignment sites and one varies the AA profiles along the alignment. One can use the NEXUS syntax to define all other profile mixture models such as `C10` to `C60`.
247215

248216
### NEXUS model file
249217

doc/Concordance-Factor.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -164,9 +164,9 @@ So, suppose that in the first step of the analysis you ran the command as above:
164164

165165
That command will have figured out for you the model of evolution, all the parameters of that model, and the branch lengths of the corresponding tree. We can re-use all of that useful information in the final step. It just takes a little bit of effort to find what you need.
166166

167-
First we'll get the model parameters we need. If you take a look at the end of the `concat.log` file you will find a little section called `ALISIM COMMAND`. You can find it like this on mac/linux (or just open the `concat.log` file in a text editor and scroll to the end:
167+
First we'll get the model parameters we need. If you take a look at the end of the `concat.iqtree` file you will find a little section called `ALISIM COMMAND`. You can find it like this on Mac/Linux (or just open the `concat.iqtree` file in a text editor and scroll to the end):
168168

169-
tail concat.log
169+
tail concat.iqtree
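
As an alternative to `tail`, the section can be located by its heading, which helps if it does not fall within the last few lines of the file. This is a sketch only: `show_alisim_command` is a hypothetical helper, and the default of 5 trailing lines is an assumption that may need adjusting.

```shell
# Hypothetical helper: print the "ALISIM COMMAND" heading and the lines after it.
show_alisim_command() {
    # $1: path to the .iqtree report
    # $2: number of lines to print after the heading (default: 5)
    grep -A "${2:-5}" "ALISIM COMMAND" "$1"
}

# Example call:
# show_alisim_command concat.iqtree
```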
170170

171171
You should see something like this:
172172

@@ -189,7 +189,7 @@ To put all of that together, we are going to change the final command of the tut
189189
# compute site concordance factor using likelihood with v2.2.2
190190
iqtree2 -te concat.treefile -s ALN_FILE --scfl 100 --prefix concord2
191191

192-
To one of these, where we add the two extra commands via `-blfix` and `-m`, to fix all the parameters we already calculated. A reminder - do NOT use the exact commandlines above. You have to replace everything after the `-m` with what you found in your own `concat.log` file:
192+
To one of these, where we add two extra options, `-blfix` and `-m`, to fix all the parameters we already calculated. A reminder: do NOT use the exact command lines above. You have to replace everything after the `-m` with what you found in your own `concat.iqtree` file:
193193

194194
# faster analysis, using pre-computed model parameters, with per-locus alignments
195195
# compute site concordance factor using likelihood with v2.2.2
