You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
From IQ-TREE 3 onwards, we provide the functionality in IQ-TREE to perform pre-calculation steps required for MCMCtree approximate likelihood dating. IQ-TREE can now generate the Hessian file which is compatible with MCMCtree. You can input this file to MCMCtree and use fast approximate likelihood dating. So, if you use this feature, please cite:
35
+
36
+
> __P. Demotte, M. Panchaksaram, N. Ly-Trong, M. dos Reis and B.Q. Minh__ (2025) IQ2MC: A New Framework to Infer Phylogenetic Time Trees Using IQ-TREE and MCMCtree.
37
+
38
+
IQ2MC workflow for time tree inference
39
+
--------------------------------------
40
+
41
+
The IQ2MC workflow follows three steps that integrate IQ-TREE and MCMCtree. The final output is a time-estimated phylogeny starting from a multiple-sequence alignment as displayed in the following figure.
42
+
43
+
44
+
*`step1`: Given an input multiple sequence alignment, IQ-TREE will infer the maximum likelihood tree using the IQ-TREE tree search algorithm. Note that, the tree estimated here should be a rooted tree or you need to manually root the tree as MCMCtree only accepts rooted trees for phylogenetic dating. In this step, IQ-TREE also estimates the best-fitted substitution model for the data if you do not specify the model. This step is optional if you provide a rooted tree, the MSA, and the substitution model for step 2.
45
+
*`step2`: For fast approximate likelihood dating, MCMC requires the gradients and the Hessian/Hessians of the branch lengths calculated at maximum likelihood estimates. Given the rooted tree with fossil/tip calibrations, the substitution model, and the MSA, IQ-TREE generates the Hessian file containing the gradients and the Hessian/Hessian and all required files to run MCMCtree for dating.
46
+
*`step3`: Now, you can directly run MCMCtree from the IQ-TREE output of step 2 and infer the time tree.
47
+
48
+

49
+
50
+
Estimating the gradients and the Hessian for MCMCtree dating
To obtain the Hessian file for MCMCtree approximate likelihood dating, you need to perform step 2 in the workflow. For this step, IQ-TREE expects a rooted tree file, the substitution model, and the multiple sequence alignment. When `--dating mcmctree` option is used as below, IQ-TREE performs the gradients and the Hessian calculation and generates the `Hessian file`. This Hessian file is compatible with MCMCtree and you can use it as an input to MCMCtree for approximate likelihood dating.
54
+
55
+
If the alignment file is called `example.phy` and the rooted tree file is called `example_tree.nwk`,
Note that, Here we generate the Hessian file for a fixed rooted tree. You can directly input the rooted tree which already contains fossil/tip calibration information added using tree editing tools such as FigTree. When using the above command, IQ-TREE generates the following files which can be used to run MCMCtree for phylogenetic dating.
60
+
61
+
*`example.mcmctree.hessian`: the hessian file which contains the gradients vector and the Hessian for approximate likelihood dating.
62
+
*`example.rooted.nwk` : the rooted tree file which is compatible with the Hessian file. It is necessary to use this tree file with MCMCtree for dating as the Hessian is calculated with respect to the ordering of taxa of this tree file.
63
+
*`example.mcmctree.ctl` : the control file that can be used directly to run MCMCtree from IQ-TREE output of step 2.
64
+
*`example.dummy.aln` : It is not necessary to use the alignment with MCMCtree under approximate likelihood dating. However, in the current format MCMCtree requires an alignment, and you can simply use this dummy alignment file as the input to MCMCtree to save compute.
65
+
66
+
67
+
You can specify more parameters in the workflow to generate the control file accurately for the analysis with IQ-TREE.
*`--mcmc-iter burnin,samplefreq,nsample` : use to set number of burin samples, sample frequency and number of MCMC samples in the control file. In the above example, burnin =20000, samplefreq = 200 and nsample = 50000
72
+
73
+
*`--mcmc-bds birth-rate,death-rate,sampling-fraction`: use to set the parameters for birth-death prior in MCMCtree. In the above example, birth-rate=1, death-rate=1 and sampling-fraction=0.5
74
+
75
+
*`--mcmc-clock <EQUAL|IND|CORR>` : use to set clock model for MCMCtree. Currently supported clocks models are EQUAL: global clock with equal rates, IND: independent rates model with independent rates across lineages and CORR: correlated clock model with auto-correlated rates across the lineages.
76
+
77
+
Using partitions and Mixture models for approximate likelihood dating.
IQ-TREE supports three partition models for approximate likelihood dating. Under the Edge-unlinked (EUL) model, IQ-TREE generates the Hessian file which contains separate gradients and Hessian for each partition. For the Edge-linked (EL) partition model, the Hessian file contains only one gradient vector and a Hessian as branches are shared across partitions.
81
+
82
+
Since IQ-TREE supports RAxML and NEXUS style partitions input file, you can use partitions defined in the following format.
Here, IQ-TREE generates the Hessian file using the `GTR+G4` model for all partitions. If you need to use different models for each partition, you need to create a more flexible NEXUS file like the following.
93
+
94
+
#nexus
95
+
begin sets;
96
+
charset part1 = 1-100;
97
+
charset part2 = 101-450;
98
+
charpartition mine = GTR+G4:part1, HKY:part2;
99
+
end;
100
+
101
+
Here, IQ-TREE uses `GTR+G4` model for partition 1, and `HKY` model for partition 2 respectively. Using `-q` and `-p` options, you can generate the Hessian file which considers `edge-linked equal branch partition models` and `edge-linked proportional branch length models` respectively.
102
+
103
+
IQ-TREE also supports mixture models for the Hessian file generation. You can simply specify DNA or Amino Acid Mixture model as following,
If you are using ModelFinder or MixtureFinder, you need to follow a two-step approach. First, you can estimate the best-fit model for the data using ModelFinder or MixtureFinder. Then, the Hessian file can be generated using `--dating mcmctree` option using the estimated models.
112
+
113
+
How to run MCMCtree
114
+
-------------------
115
+
116
+
You can directly run MCMCtree from the control file generated by IQ-TREE in step 2. The command to run MCMCtree with the control file is,
117
+
118
+
mcmctree example.mcmctree.ctl
119
+
120
+
121
+
122
+
The control file generated by IQ-TREE has the following format. You can simply edit the control file as necessary. For an example you may need to increase burin and sample frequency for MCMC convergence.
123
+
124
+
seed = -1 * The computer’s current time is used when seed < 0.
125
+
seqfile = example.dummy.phy * A dummy alignment only allow to run MCMCtree
126
+
treefile = example.rooted.nwk * Rooted newick tree with annotated fossil/tip dates
127
+
mcmcfile = example.mcmctree.log * MCMC log of parameters that can be examined in Tracer
128
+
outfile = example.mcmctree.out * Output of the summarized results of MCMCtree
129
+
ckpfile = example..mcmctree.ckp * Checkpoint file of MCMCtree
130
+
hessianfile = example.mcmctree.hessian * File with gradient and hessian matrix
131
+
132
+
133
+
checkpoint = 1 * 0: nothing; 1 : save; 2: resume
134
+
ndata = 1 * number of partitions
135
+
seqtype = 0 * 0 : nucleotides; 1: codons; 2: AAs (not required if the approximate likelihood method is used)
136
+
usedata = 2 * 0: sampling from priors with no data; 1: exact slow likelihood; 2: approximate likelihood
137
+
clock = 2 * 1: global clock with equal rates; 2: independent rates; 3: correlated rates
138
+
RootAge = <1.0 * safe constraint on root age, used if no fossil for root in the rooted tree file.
139
+
140
+
BDparas = 1,1,0.5 * birth-rate, death rate, sampling priors for sampling times
141
+
finetune = 1: 0.1 0.1 0.1 0.01 .5 * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr
142
+
print = 1 * 1: normal output; 2: verbose output
143
+
144
+
*** These parameters are used for multi-loci partitioned data (ndata > 1), see dos Reis et al.(2013)
145
+
146
+
rgene_gamma = 2 2 * alpha and beta parameter of Dirichlet-gamma prior for mean rate across loci for clock=2 or 3
147
+
sigma2_gamma = 1 10 * alpha and beta parameter of Dirichlet-gamma prior for rate variance across loci for clock=2 or 3
148
+
149
+
*** These parameters control the MCMC run
150
+
151
+
burnin = 20000
152
+
sampfreq = 200
153
+
nsample = 50000
154
+
155
+
*** Note: Total number of MCMC iterations will be burnin + (sampfreq * nsample)
156
+
157
+
*** The following parameters only needed to run MCMCtree with exact likelihood (usedata = 1), no need to change anything for approximate likelihood (usedata = 2)
158
+
159
+
model = 0 * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
160
+
alpha = 0 * 0: No rate heterogeneity across sites; otherwise: fixed alpha parameter of the Gamma distribution
161
+
ncatG = 0 * Number of rate categories for the discrete Gamma distribution
162
+
163
+
cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
164
+
165
+
kappa_gamma = 6 2 * gamma prior for kappa of the HKY model
166
+
alpha_gamma = 1 1 * alpha and beta parameter of Gamma distribution for heterogeneous rates across sites
167
+
168
+
Note that, if you generate the `hessain file` from IQ-TREE, it is necessary to use the rooted tree file generated by IQ-TREE to be used in MCMCtree. The `ckpfile` and `hessianfile` options are new and only work for the PAML release in IQ-TREE (https://github.com/iqtree/paml). If you use another MCMCtree version/release, you can simply remove those options from control file and rename the `hessian file` to `in.BV` to run MCMCtree without any errors.
Since IQ-TREE 2.0.3, we integrate the least square dating (LSD2) method to build a time tree when you have date information for tips or ancestral nodes. So if you use this feature please cite:
31
175
32
176
__Thu-Hien To, Matthieu Jung, Samantha Lycett, Olivier Gascuel__ (2016)
0 commit comments