|
1 | 1 | --- |
2 | 2 | layout: userdoc |
3 | 3 | title: "Advanced Tutorial" |
4 | | -author: Jana Trifinopoulos, Minh Bui |
5 | | -date: 2021-05-19 |
| 4 | +author: Jana Trifinopoulos, Minh Bui, Qin Liu |
| 5 | +date: 2025-05-15 |
6 | 6 | docid: 4 |
7 | 7 | icon: info-circle |
8 | 8 | doctype: tutorial |
@@ -30,6 +30,8 @@ sections: |
30 | 30 | url: user-defined-substitution-models |
31 | 31 | - name: Inferring site-specific rates |
32 | 32 | url: inferring-site-specific-rates |
| 33 | + - name: Trimming alignment sites by likelihood |
| 34 | + url: trimming-alignment-sites-by-likelihood |
33 | 35 | --- |
34 | 36 |
|
35 | 37 | Advanced tutorial |
@@ -521,30 +523,53 @@ This will print an output file `example.phy.mlrate` that looks like: |
521 | 523 | 10 0.00001 |
522 | 524 |
|
523 | 525 |
|
524 | | -Robust phylogenetics analysis using trimmed log-likelihood method |
525 | | ----------------------- |
| 526 | +Trimming alignment sites by likelihood |
| 527 | +-------------------------------------- |
526 | 528 | <div class="hline"></div> |
527 | 529 |
|
528 | | -Phylogenetic inference can be highly sensitive to fast-evolving, saturated or erroneous sites in a sequence alignment. To address this issue, IQ-TREE implements the `trimmed log-likelihood` method - a robust and dynamic approach that improves tree inference by selectively down-weighting problematic sites. |
| 530 | +Phylogenetic inference can be highly sensitive to fast-evolving, saturated or |
| 531 | +erroneous sites in a sequence alignment. To address this issue, IQ-TREE |
| 532 | +implements the **trimmed log-likelihood** method, a robust approach that
| 533 | +improves tree inference by excluding the worst-fitting sites during the search.
529 | 534 |
|
530 | | -This method works by dynamically excluding a user-defined proportion of sites with the lowest log-likelihood values during the tree search. As the search progresses, the likelihood of each site is recalculated at each step using current tree and model parameters. This ensures that site removal is always conditional on the current model, tree topology and branch lengths, avoiding circularity. |
| 535 | +This method works by dynamically excluding a user-defined proportion of sites |
| 536 | +with the lowest log-likelihood values during the tree search. As the search |
| 537 | +progresses, the likelihood of each site is recalculated at each step using |
| 538 | +the current tree and model parameters. This ensures that site removal is always
| 539 | +conditional on the current model, tree topology and branch lengths, avoiding |
| 540 | +circularity. |
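
The trimming step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not IQ-TREE's implementation: the function name `trimmed_log_likelihood`, the rounding rule for the number of retained sites, and the example values are all assumptions made for this sketch.

```python
def trimmed_log_likelihood(site_lnl, retain):
    """Sum the highest `retain` fraction of per-site log-likelihoods.

    site_lnl: per-site log-likelihoods under the current tree and model.
    retain:   proportion of sites to keep, in (0, 1].
    Returns (trimmed log-likelihood, 0-based indices of excluded sites).
    """
    if not 0.0 < retain <= 1.0:
        raise ValueError("retain must be in (0, 1]")
    # Assumption: the number of retained sites is rounded to the nearest integer.
    n_keep = round(retain * len(site_lnl))
    # Rank sites from best-fitting (highest log-likelihood) to worst-fitting.
    order = sorted(range(len(site_lnl)), key=lambda i: site_lnl[i], reverse=True)
    kept = sorted(order[:n_keep])
    excluded = sorted(order[n_keep:])
    return sum(site_lnl[i] for i in kept), excluded


# Hypothetical per-site values; the third site fits the current tree poorly.
lnl, dropped = trimmed_log_likelihood([-2.0, -3.5, -40.0, -1.5, -2.5], retain=0.8)
```

Because the ranking is recomputed whenever the tree or model parameters change, a site excluded at one step of the search may be retained at a later step.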
531 | 541 |
|
532 | | -To use the trimmed log-likelihood method, please make sure that IQ-TREE version 3.0 or later is installed. In the command-line interface, the method is invoked using the option `--robust-phy`. Although it is referred to here as the trimmed log-likelihood method, IQ-TREE uses the name `--robust-phy` to reflect the broader goal of improving the robustness of phylogenetic inference. |
| 542 | +To use the trimmed log-likelihood method, please make sure that IQ-TREE version |
| 543 | +3.0 or later is installed. In the command-line interface, the method is invoked |
| 544 | +using the option `--robust-phy`. Although it is referred to here as the trimmed |
| 545 | +log-likelihood method, IQ-TREE uses the name `--robust-phy` to reflect the |
| 546 | +broader goal of improving the robustness of phylogenetic inference. |
533 | 547 |
|
534 | | -You can run the trimmed log-likelihood method from the command line by specifying the alignment, a substitution model, and the proportion of sites to retain: |
| 548 | +You can run the trimmed log-likelihood method from the command line by |
| 549 | +specifying the alignment, a substitution model, and the proportion of sites to |
| 550 | +retain: |
535 | 551 |
|
536 | | - iqtree3 -s <MY_ALIGNMENT> --robust-phy <PROPORTION_TO_RETAIN> -m <MODEL> |
| 552 | +``` |
| 553 | +iqtree3 -s <MY_ALIGNMENT> --robust-phy <PROPORTION_TO_RETAIN> -m <MODEL> |
| 554 | +``` |
537 | 555 |
|
538 | | -Additional options are available to assist downstream analysis. For instance, IQ-TREE can write site log-likelihoods to a `.sitelh` file, allowing users to identify the excluded sites by examining their log-likelihood values. |
| 556 | +Additional options are available to assist downstream analysis. For instance, |
| 557 | +IQ-TREE can write site log-likelihoods to a `.sitelh` file, allowing users to |
| 558 | +identify the excluded sites by examining their log-likelihood values. |
539 | 559 |
|
540 | | -For example, for a dataset `data.phy`, if users apply a `JC` model and trim `2%` of sites (i.e., retain `98%` of sites) and wish to generate an output that includes the site log-likelihoods, the corresponding command would be: |
| 560 | +For example, to analyse the dataset `data.phy` under the `GTR+G` model, trim
| 561 | +`2%` of sites (i.e., retain `98%` of sites) and write the site log-likelihoods
| 562 | +to a `.sitelh` file, the corresponding command would be:
541 | 563 |
|
542 | | - iqtree3 -s <data.phy> --robust-phy <0.98> -m <JC> -wsl |
| 564 | +``` |
| 565 | +iqtree3 -s data.phy --robust-phy 0.98 -m GTR+G -wsl |
| 566 | +``` |
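
As a rough sketch of the downstream step, the following Python snippet lists the sites with the lowest log-likelihoods. It assumes that the `.sitelh` file has a header line holding the number of trees and the number of sites, followed by one line with a label and one value per site; please check this against your own output file. The function names `read_site_lnl` and `lowest_sites` are hypothetical helpers, not part of IQ-TREE.

```python
def read_site_lnl(text):
    """Parse site log-likelihoods from the text of a .sitelh file.

    Assumed layout: first line "<n_trees> <n_sites>", second line
    "<label> <lnL_1> <lnL_2> ...". Only the first tree is read.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_sites = int(lines[0].split()[1])
    values = [float(x) for x in lines[1].split()[1:]]
    if len(values) != n_sites:
        raise ValueError("number of values does not match the header")
    return values


def lowest_sites(site_lnl, fraction):
    """Return 1-based positions of the `fraction` of sites with the lowest lnL."""
    # Assumption: the number of sites is rounded to the nearest integer.
    n_low = round(fraction * len(site_lnl))
    order = sorted(range(len(site_lnl)), key=lambda i: site_lnl[i])
    return sorted(i + 1 for i in order[:n_low])


# Usage with the example above (2% of sites trimmed); the output file name
# data.phy.sitelh is assumed from the alignment name:
# with open("data.phy.sitelh") as handle:
#     print(lowest_sites(read_site_lnl(handle.read()), 0.02))
```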
543 | 567 |
|
544 | 568 | If you use the trimmed log-likelihood method in a publication, please cite: |
545 | 569 |
|
546 | | -> __Liu, Qin, Bui Quang Minh, Robert Lanfear, Michael A. Charleston, Shane A. Richards, and Barbara R. Holland__ Robust Phylogenetics. _bioRxiv_ (2025): 2025-04. |
547 | | - <https://doi.org/10.1101/2025.04.01.646540> |
| 570 | +> __Q. Liu, B.Q. Minh, R. Lanfear, M.A. Charleston,__ |
| 571 | +> __S.A. Richards, and B.R. Holland__ (2025) Robust Phylogenetics. |
| 572 | +> _bioRxiv_. <https://doi.org/10.1101/2025.04.01.646540> |
548 | 573 |
|
549 | 574 | Where to go from here? |
550 | 575 | ---------------------- |
|