Commit 5390dc0

revise mutation rule
1 parent 18141ac commit 5390dc0

1 file changed

Lines changed: 17 additions & 20 deletions

doc/AliSim.md

@@ -558,40 +558,37 @@ This example simulates a new alignment under the Juke-Cantor model from the inpu
558558

559559
Pre-define mutations
560560
----------------------------
561-
AliSim allows users to pre-define mutations that occur at some specific branches along the tree. Those mutations could be generated by running [VGsim](https://github.com/genomics-HSE/vgsim) with the option `-writeMutations`. One can specify those mutations via either a separate file (i.e., using the output file from VGsim) or an input tree file.
561+
AliSim allows users to enforce pre-defined mutations that must occur at specific nodes of the tree. Those mutations can be generated, for example, by running [VGsim](https://github.com/genomics-HSE/vgsim) with the option `-writeMutations`.
562+
562563

563-
### Pre-define mutations via a separate file
564564
Given a tree file `tree_example.nwk`
565565

566-
(T1:0.2,(T2:0.3,T4:0.1)I1:0.4,T3:0.1);
566+
(T1:0.2,(T2:0.3,T4:0.1)Node5:0.4,T3:0.1);
567567

568-
One can specify some predefined mutations in a separate file `mutations.txt` as in the following.
568+
and a mutations file `mutations.txt` such as the following:
569569

570-
I1 C39G,T17A,G25C
571-
T2 C25A,A5G
570+
Node5 C39G,T17A,G25C
571+
T2 C25A,A5G
572572

573-
Each line starts with a `<node name>`, followed by a tab `\t`, and ends up with a `<list_of_mutations>`. Mutations in the list are separated by a comma `,`. The above file `mutations.txt` specifies:
573+
Each line starts with a taxon name or an internal node name, followed by whitespace and a comma-separated list of mutations. Each mutation is denoted
574+
by a character state at the parent node, followed by a position number
575+
**starting from index 0** and
576+
the character state to be fixed at the current node. For the above example,
574577

575-
* Three mutations `C39G` (i.e., C is substituted by G at site 39), `T17A`, and `G25C` occur along the branch connecting (the internal) node `I1` and its parent node (i.e., the root node);
576-
* Two mutations `C25A` and `A5G` occur along the branch connecting node `T2` and its parent node (i.e., node `I1').
578+
* AliSim will enforce the 40th, 18th, and 26th positions of
579+
the sequence at internal node `Node5` to be `G`, `A`, and `C`, respectively.
580+
* AliSim will enforce the 26th and 6th positions of
581+
the sequence at the taxon `T2` to be `A` and `G`, respectively.
577582

578-
Note that the site indexes, by default, start from 0 (to make AliSim compatible with VGsim's output). One can use the option `--index-from-one` to set the site indexes starting from 1.
583+
> NOTE: Site indexes start from 0 (to make AliSim compatible with VGsim's output). If you want them to start from 1, use the option `--index-from-one`.
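
To make the file format and the 0-based indexing concrete, here is a minimal Python sketch that reads the example `mutations.txt` content and reports which 1-based position each rule fixes. It is an illustration only, not AliSim's own parser: the function name `parse_mutations`, its `index_from_one` parameter (mirroring the `--index-from-one` option), and the restriction to single-letter character states are assumptions made for this example.

```python
import re

# Assumption: each state is a single letter (e.g. DNA), as in the example above.
MUTATION_RE = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def parse_mutations(text, index_from_one=False):
    """Map each node name to a list of (parent_state, zero_based_site, new_state).

    Hypothetical helper for illustration; not part of AliSim.
    """
    offset = 1 if index_from_one else 0
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Node name and mutation list are separated by whitespace.
        node, mutation_list = line.split(None, 1)
        parsed = []
        for token in mutation_list.split(","):
            match = MUTATION_RE.match(token.strip())
            if match is None:
                raise ValueError(f"unrecognised mutation {token!r} for node {node!r}")
            parent_state, site, new_state = match.groups()
            # Store sites as 0-based indexes regardless of the input convention.
            parsed.append((parent_state, int(site) - offset, new_state))
        rules[node] = parsed
    return rules


example = "Node5 C39G,T17A,G25C\nT2 C25A,A5G\n"
rules = parse_mutations(example)
for node, mutations in rules.items():
    for parent_state, site, new_state in mutations:
        print(f"{node}: position {site + 1} (index {site}) is fixed to {new_state}")
```

With the default 0-based convention, `C39G` at `Node5` refers to index 39, that is, the 40th position of the sequence, which matches the bullet points above.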
579584
580585

581586
The following command
582587

583588
iqtree2 --alisim example_mutations -t tree_example.nwk -m JC --mutation mutations.txt
584589

585-
will simulate an alignment with 4 sequences (i.e., T1, T2, T3, and T4) under the [Jukes-Cantor model](http://doi.org/10.1016/B978-1-4832-3211-9.50009-7) where sites 5, 17, 25, and 39 are substituted according to the above pre-defined mutations.
586-
587-
### Pre-define mutations via the input tree file
588-
Another option to specify mutations is using the input tree. One can specify a list of mutations that occur at each branch using the syntax `[&mutations={<list_of_mutations>}]`. To reproduce the above example (in *Pre-define mutations via a separate file* section), one can specify the tree file `tree_mutations.nwk` as in the following.
589-
590-
(T1:0.2,(T2[&mutations={C25A,A5G}]:0.3,T4:0.1)I1[&mutations={C39G,T17A,G25C}]:0.4,T3:0.1);
591-
592-
Then execute AliSim by:
593-
594-
iqtree2 --alisim example_mutations -t tree_mutations.nwk -m JC
590+
will simulate an alignment with 4 sequences (i.e., T1, T2, T3, and T4) under the [Jukes-Cantor model](http://doi.org/10.1016/B978-1-4832-3211-9.50009-7)
591+
and the above mutation rule.
595592

596593

597594
Parallel sequence simulations
