doc/AliSim.md
Lines changed: 17 additions & 20 deletions
@@ -558,40 +558,37 @@ This example simulates a new alignment under the Jukes-Cantor model from the inpu
558
558
559
559
Pre-define mutations
560
560
----------------------------
561
-
AliSim allows users to pre-define mutations that occur at some specific branches along the tree. Those mutations could be generated by running [VGsim](https://github.com/genomics-HSE/vgsim) with the option `-writeMutations`. One can specify those mutations via either a separate file (i.e., using the output file from VGsim) or an input tree file.
561
+
AliSim allows users to enforce pre-defined mutations that must occur at specific nodes of the tree. Such mutations can be generated, for example, by running [VGsim](https://github.com/genomics-HSE/vgsim) with the option `-writeMutations`.
562
+
562
563
563
-
### Pre-define mutations via a separate file
564
564
Given a tree file `tree_example.nwk`
565
565
566
-
(T1:0.2,(T2:0.3,T4:0.1)I1:0.4,T3:0.1);
566
+
(T1:0.2,(T2:0.3,T4:0.1)Node5:0.4,T3:0.1);
567
567
568
-
One can specify some predefined mutations in a separate file `mutations.txt` as in the following.
568
+
and a mutations file `mutations.txt` like the one below:
569
569
570
-
I1 C39G,T17A,G25C
571
-
T2 C25A,A5G
570
+
Node5 C39G,T17A,G25C
571
+
T2 C25A,A5G
572
572
573
-
Each line starts with a `<node name>`, followed by a tab `\t`, and ends up with a `<list_of_mutations>`. Mutations in the list are separated by a comma `,`. The above file `mutations.txt` specifies:
573
+
Each line starts with a taxon name or an internal node name, followed by whitespace and a comma-separated list of mutations. Each mutation is denoted
574
+
by a character state at the parent node, followed by a position number
575
+
**starting from index 0** and
576
+
the character state to be fixed at the current node. For the above example,
574
577
575
-
* Three mutations `C39G` (i.e., C is substituted by G at site 39), `T17A`, and `G25C` occur along the branch connecting (the internal) node `I1` and its parent node (i.e., the root node);
576
-
* Two mutations `C25A` and `A5G` occur along the branch connecting node `T2` and its parent node (i.e., node `I1`).
578
+
* AliSim will enforce the 40th, 18th, and 26th positions of
579
+
the sequence at internal node `Node5` to be `G`, `A`, and `C`, respectively.
580
+
* AliSim will enforce the 26th and 6th positions of
581
+
the sequence at the taxon `T2` to be `A` and `G`, respectively.
577
582
578
-
Note that the site indexes, by default, start from 0 (to make AliSim compatible with VGsim's output). One can use the option `--index-from-one` to set the site indexes starting from 1.
583
+
> NOTE: Site indexes start from 0 (to make AliSim compatible with VGsim's output). If you want them to start from 1, use the option `--index-from-one`.
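
To make the file format and the index convention concrete, here is a minimal Python sketch that parses the `mutations.txt` example above. It is not part of AliSim: the names `parse_mutations`, `MUTATION_RE`, and the `index_from_one` parameter are hypothetical, and the parser only assumes the format as described here (a node name, whitespace, then a comma-separated list of mutations with single-character states).

```python
import re
from typing import Dict, List, Tuple

# Hypothetical helper for illustration only; AliSim does not ship this code.
# One mutation: <parent state><site index><new state>, e.g. "C39G".
MUTATION_RE = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")

# (state at parent node, 0-based site index, state enforced at this node)
Mutation = Tuple[str, int, str]


def parse_mutations(text: str, index_from_one: bool = False) -> Dict[str, List[Mutation]]:
    """Parse '<node name><whitespace><m1,m2,...>' lines into a dictionary.

    If index_from_one is True, site numbers in the text are treated as
    1-based and converted to 0-based indexes.
    """
    result: Dict[str, List[Mutation]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # node name, then the mutation list
        if len(parts) != 2:
            raise ValueError(f"Missing mutation list: {line!r}")
        node, mutation_list = parts
        mutations: List[Mutation] = []
        for token in mutation_list.split(","):
            match = MUTATION_RE.match(token.strip())
            if match is None:
                raise ValueError(f"Malformed mutation: {token!r}")
            parent_state, position, new_state = match.groups()
            site = int(position) - (1 if index_from_one else 0)
            mutations.append((parent_state, site, new_state))
        result[node] = mutations
    return result


example = "Node5\tC39G,T17A,G25C\nT2\tC25A,A5G\n"
parsed = parse_mutations(example)
for node, muts in parsed.items():
    for parent_state, site, new_state in muts:
        # site is 0-based, so the human-readable position is site + 1
        print(f"{node}: position {site + 1} (index {site}) {parent_state} -> {new_state}")
```

With the default 0-based convention, `C39G` at `Node5` maps to index 39, i.e. the 40th position of the sequence, which matches the explanation above. Passing `index_from_one=True` mirrors the effect described for `--index-from-one`, where the same position would be written as `C40G`.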
will simulate an alignment with 4 sequences (i.e., T1, T2, T3, and T4) under the [Jukes-Cantor model](http://doi.org/10.1016/B978-1-4832-3211-9.50009-7) where sites 5, 17, 25, and 39 are substituted according to the above pre-defined mutations.
586
-
587
-
### Pre-define mutations via the input tree file
588
-
Another option to specify mutations is using the input tree. One can specify a list of mutations that occur at each branch using the syntax `[&mutations={<list_of_mutations>}]`. To reproduce the above example (in *Pre-define mutations via a separate file* section), one can specify the tree file `tree_mutations.nwk` as in the following.
will simulate an alignment with 4 sequences (i.e., T1, T2, T3, and T4) under the [Jukes-Cantor model](http://doi.org/10.1016/B978-1-4832-3211-9.50009-7)