@@ -452,236 +452,6 @@ to control the way that LSD2 treats outliers, you can do this:
452452 iqtree -s ALN_FILE --date DATE_FILE --date-options "-e 2"
453453
454454A full list of the options for LSD2 can be obtained by downloading LSD2 and
455- running `lsd2 -h`, the output of that command is reproduced here for
456- convenience:
455+ running `lsd2 -h`, the output of that command is [provided here](lsd2-help) for
456+ your convenience.
457457
458- ```
459- LSD: LEAST-SQUARES METHODS TO ESTIMATE RATES AND DATES - v.1.8
460-
461- DESCRIPTION
462- This program estimates the rate and the dates of the input phylogenies given
463- some temporal constraints.
464- It minimizes the square errors of the branch lengths under normal
465- distribution model.
466-
467- SYNOPSIS
468- ./lsd [-i inputFile] [-d inputDateFile] [-o outputFile] [-s sequenceLength]
469- [-g outgroupFile] [-f nbSamplings]
470- OPTIONS
471- -a rootDate
472- To specify the root date if there's any. If the root date is not a
473- number, but a string (ex: 2020-01-10, or b(2019,2020)) then it should
474- be put between the quotes.
475- -b varianceParameter
476- The parameter (between 0 and 1) to compute the variances in option -v. It
477- is the pseudo positive constant to add to the branch lengths
478- when calculating variances, to adjust the dependency of variances to
479- branch lengths. By default b is the maximum between median branch length
480- and 10/seqlength; but it should be adjusted based on how/whether the
481- input tree is relaxed or strict. The smaller it is the more variances
482- would be linear to branch lengths, which is relevant for strict clock.
483- The bigger it is the less effect of branch lengths on variances,
484- which might be better for relaxed clock.
485- -d inputDateFile
486- This options is used to read the name of the input date file which
487- contains temporal constraints of internal nodes
488- or tips. An internal node can be defined either by its label (given in
489- the input tree) or by a subset of tips that have it as
490- the most recent common ancestor (mrca). A date could be a real or a
491- string or format year-month-day.
492- The first line of this file is the number of temporal constraints. A
493- temporal constraint can be fixed date, or a
494- lower bound l(value), or an upper bound u(value), or an interval b(v1,v2)
495- For example, if the input tree has 4 taxa a,b,c,d, and an internal node
496- named n, then following is a possible date file:
497- 6
498- a l(2003.12)
499- b u(2007.07)
500- c 2005
501- d b(2001.2,2007.11)
502- mrca(a,b,c,d) b(2000,2001)
503- n l(2004.3)
504- If this option is omitted, and option -a, -z are also omitted, the
505- program will estimate relative dates by giving T[root]=0 and T[tips]=1.
506- -D outDateFormat
507- Specify output date format: 1 for real, 2 for year-month-day. By default
508- the program will guess the format of input dates and uses it for
509- output dates.
510- -e ZscoreOutlier
511- This option is used to estimate and exclude outlier nodes before dating
512- process.
513- LSD2 normalize the branch residus and decide a node is outlier if its
514- related residus is great than the ZscoreOutlier.
515- A normal value of ZscoreOutliercould be 3, but you can adjust it
516- bigger/smaller depending if you want to have
517- less/more outliers. Note that for now, some functionalities could not be
518- combined with outliers estimation, for example
519- estimating multiple rates, imprecise date constraints.
520- -f samplingNumberCI
521- This option calculates the confidence intervals of the estimated rate and
522- dates. The branch lengths of the esimated
523- tree are sampled samplingNumberCI times to generate a set of simulated
524- trees. To generate simulated lengths
525- for each branch, we use a Poisson distribution whose mean equals to the
526- estimated one multiplied by the sequence length, which is
527- 1000 by default if nothing was specified via option -s. Long sequence
528- length tends to give small confidence intervals. To avoid
529- over-estimate the confidence intervals in the case of very long sequence
530- length but not necessarily strict molecular clock, you
531- could use a smaller sequence length than the actual ones. Confidence
532- intervals are written in the nexus tree with label CI_height,
533- and can be visualzed with Figtree under Node bar feature.
534- -g outgroupFile
535- If your data contain outgroups, then specify the name of the outgroup
536- file here. The program will use the outgroups to root the trees.
537- If you use this combined with options -G, then the outgroups will be
538- removed. The format of this file should be:
539- n
540- OUTGROUP1
541- OUTGROUP2
542- ...
543- OUTGROUPn
544- -F
545- By default without this option, we impose the constraints that the date
546- of every node is equal or smaller then the
547- dates of its descendants, so the running time is quasi-linear. Using this
548- option we ignore this temporal constraints, and
549- the the running time becomes linear, much faster.
550- -h help
551- Print this message.
552- -i inputTreesFile
553- The name of the input trees file. It contains tree(s) in newick format,
554- each tree on one line. Note that the taxa sets of all
555- trees must be the same.
556- -j
557- Verbose mode for output messages.
558- -G
559- Use this option to remove the outgroups (given in option -g) in the
560- estimated tree. If this option is not used, the outgroups
561- will be kept and the root position in estimated on the branch defined by
562- the outgroups.
563- -l nullBlen
564- A branch in the input tree is considered informative if its length is
565- greater this value. By default it is 0.5/seq_length. Only
566- informative branches are forced to be bigger than a minimum branch length
567- (see option -u for more information about this).
568- -m samplingNumberOutlier
569- The number of dated nodes to be sampled when detecting outlier nodes.
570- This should be smaller than the number of dated nodes,
571- and is 10 by default.
572- -n datasetNumber
573- The number of trees that you want to read and analyse.
574- -o outputFile
575- The base name of the output files to write the results and the time-scale
576- trees.
577- -p partitionFile
578- The file that defines the partition of branches into multiple subsets in
579- the case that you know each subset has a different rate.
580- In the partition file, each line contains the name of the group, the
581- prior proportion of the group rate compared to the main rate
582- (selecting an appropriate value for this helps to converge faster), and a
583- list of subtrees whose branches are supposed to have the
584- same substitution rate. All branches that are not assigned to any subtree
585- form a group having another rate.
586- A subtree is defined between {}: its first node corresponds to the root
587- of the subtree, and the following nodes (if there any)
588- correspond to the tips of the subtree. If the first node is a tip label
589- then it takes the mrca of all tips as the root of the subtree.
590- If the tips of the subtree are not defined (so there's only the defined
591- root), then by
592- default this subtree is extended down to the tips of the full tree. For
593- example the input tree is
594- ((A:0.12,D:0.12)n1:0.3,((B:0.3,C:0.5)n2:0.4,(E:0.5,(F:0.2,G:0.3)n3:0.33)
595- n4:0.22)n5:0.2)root;
596- and you have the following partition file:
597- group1 1 {n1} {n5 n4}
598- group2 1 {n3}
599- then there are 3 rates: the first one includes the branches (n1,A),
600- (n1,D), (n5,n4), (n5,n2), (n2,B), (n2,C); the second one
601- includes the branches (n3,F), (n3,G), and the last one includes all the
602- remaining branches. If the internal nodes don't have labels,
603- then they can be defined by mrca of at least two tips, for example n1 is
604- mrca(A,D)
605- -q standardDeviationRelaxedClock
606- This value is involved in calculating confidence intervals to simulate a
607- lognormal relaxed clock. We multiply the simulated branch lengths
608- with a lognormal distribution with mean 1, and standard deviation q. By
609- default q is 0.2. The bigger q is, the more your tree is relaxed
610- and give you bigger confidence intervals.
611- -r rootingMethod
612- This option is used to specify the rooting method to estimate the
613- position of the root for unrooted trees, or
614- re-estimate the root for rooted trees. The principle is to search for the
615- position of the root that minimizes
616- the objective function.
617- Use -r l if your tree is rooted, and you want to re-estimate the root
618- locally around the given root.
619- Use -r a if you want to estimate the root on all branches (ignoring the
620- given root if the tree is rooted).
621- In this case, if the constrained mode is chosen (option -c), method
622- "a" first estimates the root without using the constraints.
623- After that, it uses the constrained mode to improve locally the
624- position of the root around this pre-estimated root.
625- Use -r as if you want to estimate to root using constrained mode on all
626- branches.
627- Use -r k if you want to re-estimate the root position on the same branche
628- of the given root.
629- If combined with option -g, the root will be estimated on the branche
630- defined by the outgroups.
631- -R round_time
632- This value is used to round the minimum branch length of the time scaled
633- tree. The purpose of this is to make the minimum branch length
634- a meaningful time unit, such as day, week, year ... By default this value
635- is 365, so if the input dates are year, the minimum branch
636- length is rounded to day. The rounding formula is round(R*minblen)/R.
637- -s sequenceLength
638- This option is used to specify the sequence length when estimating
639- confidence intervals (option -f). It is used to generate
640- integer branch lengths (number of substitutions) by multiplying this with
641- the estimated branch lengths. By default it is 1000.
642- -S minSupport
643- Together with collapsing internal short branches (see option -l), users
644- can also collapse internal branches having weak support values (if
645- provided in the input tree) by using this option. The program will
646- collapse all internal branches having support <= the specifed value.
647- -t rateLowerBound
648- This option corresponds to the lower bound for the estimating rate. It is
649- 1e-10 by default.
650- -u minBlen
651- By default without this option, lsd2 forces every branch of the time
652- scaled tree to be greater than 1/(seq_length*rate) where rate is
653- an pre-estimated median rate. This value is rounded to the number of days
654- or weeks or years, depending on the rounding parameter -R.
655- By using option -u, the program will not estimate the minimum branch
656- length but use the specified value instead.
657- -U minExBlen
658- Similar to option -u but applies for external branches if specified. If
659- it's not specified then the minimum branch length of external
660- branches is set the same as the one of internal branch.
661- -v variance
662- Use this option to specify the way you want to apply variances for the
663- branch lengths. Variances are used to recompense big errors on
664- long estimated branch lengths. The variance of the branch Bi is Vi =
665- (Bi+b) where b is specified by option -b.
666- If variance=0, then we don't use variance. If variance=1, then LSD uses
667- the input branch lengths to calculate variances.
668- If variance=2, then LSD runs twice where the second time it calculates
669- the variances based on the estimated branch
670- lengths of the first run. By default variance=1.
671- -V
672- Get the actual version.
673- -w givenRte
674- This option is used to specify the name of the file containing the
675- substitution rates.
676- In this case, the program will use the given rates to estimate the dates
677- of the nodes.
678- This file should have the following format
679- RATE1
680- RATE2
681- ...
682- where RATEi is the rate of the tree i in the inputTreesFile.
683- -z tipsDate
684- To specify the tips date if they are all equal. If the tips date is not a
685- number, but a string (ex: 2020-01-10, or b(2019,2020))
686- then it should be put between the quotes.
687- ```
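
The `-d inputDateFile` entry in the removed help text describes a small plain-text format: a first line giving the number of temporal constraints, then one line per node with its constraint. As a minimal sketch of that layout (not part of LSD2 or IQ-TREE), the Python snippet below rebuilds the example date file from the help text. The function name `format_lsd2_date_file` is hypothetical, and the single-space separator is an assumption taken from how the help example is printed.

```python
from typing import Mapping


def format_lsd2_date_file(constraints: Mapping[str, str]) -> str:
    """Return date-file text: a count line, then one 'node constraint' line each.

    Hypothetical helper for illustration only. Keys are tip labels, internal
    node labels, or mrca(...) expressions; values are constraint strings such
    as a fixed date, l(value), u(value), or b(v1,v2), copied through unchanged.
    """
    lines = [str(len(constraints))]  # first line: number of temporal constraints
    for node, constraint in constraints.items():
        # Assumption: node and constraint are separated by a single space.
        lines.append(f"{node} {constraint}")
    return "\n".join(lines) + "\n"


# The six constraints from the example in the help text.
example = {
    "a": "l(2003.12)",
    "b": "u(2007.07)",
    "c": "2005",
    "d": "b(2001.2,2007.11)",
    "mrca(a,b,c,d)": "b(2000,2001)",
    "n": "l(2004.3)",
}

date_file_text = format_lsd2_date_file(example)
print(date_file_text, end="")
```

The sketch does not validate the constraint strings, so malformed entries are only caught when LSD2 reads the file.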