Different output between vg snarls and vg deconstruct -a #4867

@AliceLaurent

1. What were you trying to do?
I was trying to compute the snarls of a small graph by running both vg snarls and vg deconstruct -a.

2. What did you want to happen?
I expected vg snarls and vg deconstruct -a to report the same snarls.

3. What actually happened?
The two outputs differ: two snarls are "missing" from the vg deconstruct output.

Output of vg snarls:
{"end": {"node_id": "16"}, "end_self_reachable": true, "start": {"node_id": "1"}, "start_end_reachable": true, "start_self_reachable": true}
{"directed_acyclic_net_graph": true, "end": {"node_id": "15"}, "end_self_reachable": true, "parent": {"end": {"node_id": "16"}, "start": {"node_id": "1"}}, "start": {"node_id": "11"}, "start_end_reachable": true}
{"directed_acyclic_net_graph": true, "end": {"node_id": "8"}, "end_self_reachable": true, "parent": {"end": {"node_id": "16"}, "start": {"node_id": "1"}}, "start": {"node_id": "2"}, "start_end_reachable": true}

Output of vg deconstruct:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT inv
ref 100 >1>16 CTTGACTTAGGCCAATACCTTTTTGTGTCTTGACCCCTGCAAATACGTTAGTGGTTCGTGACCGACTTCAGGGTCCCTGACCT CTCCCAAAGGTATTGGCCTAAGTCAACAAAACACAAACTAACGTATTTGCAGGGGTGTGACGAACCAGGTCAGGGACCCTGAAGTCGT 60 . AC=1;AF=1;AN=1;AT=>1>2>3>4>8>10>9>11>13>14>15>16,>1>2<5<6<4<3<2<7<9<10<12<15<14<13>15>16;NS=1;LV=0;RC=ref;RS=100;RD=183 GT 1

I also got "[vg deconstruct] Using flat processing" on stderr.
I have seen this with several graphs, and I would like to understand whether there is a reason for it.
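To make the difference concrete, here is a minimal Python sketch that compares the snarl boundaries in the two outputs above. It assumes the VCF ID column encodes a snarl as two oriented node IDs (as in ">1>16" above) and ignores orientation when matching; the helper names are only for illustration and are not part of vg.

```python
import json
import re

# Boundary fields copied from the vg snarls JSON above (other fields trimmed).
SNARLS_JSON = """\
{"end": {"node_id": "16"}, "start": {"node_id": "1"}}
{"end": {"node_id": "15"}, "start": {"node_id": "11"}}
{"end": {"node_id": "8"}, "start": {"node_id": "2"}}
"""

# ID column of the VCF records above (only one record was emitted).
VCF_IDS = [">1>16"]


def snarl_key(start, end):
    """Orientation-insensitive key for a snarl's two boundary nodes."""
    return tuple(sorted((int(start), int(end))))


def snarls_from_json(text):
    """Collect boundary keys from one-JSON-object-per-line snarl output."""
    keys = set()
    for line in text.splitlines():
        if line.strip():
            rec = json.loads(line)
            keys.add(snarl_key(rec["start"]["node_id"], rec["end"]["node_id"]))
    return keys


def snarls_from_vcf_ids(ids):
    """Collect boundary keys from IDs shaped like '>1>16' (assumed format)."""
    keys = set()
    for vcf_id in ids:
        m = re.fullmatch(r"[<>](\d+)[<>](\d+)", vcf_id)
        if m:
            keys.add(snarl_key(m.group(1), m.group(2)))
    return keys


missing = sorted(snarls_from_json(SNARLS_JSON) - snarls_from_vcf_ids(VCF_IDS))
print(missing)  # [(2, 8), (11, 15)]
```

With the data above, the two snarls absent from the VCF are the children of the top-level snarl between nodes 1 and 16.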

5. What data and command can the vg dev team use to make the problem happen?

Here is the GFA graph I used to produce this output:

H VN:Z:1.0
S 1 CTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAACCTGGGTAAAC
S 2 T
S 3 TGACTTAGGC
S 4 CAATACCTTT
S 5 GG
S 6 G
S 7 TTGTGTTTTG
S 8 TTGTGTCTTG
S 9 ATACGTTAGT
S 10 ACCCCTGCAA
S 11 GGTTCGTGAC
S 12 GGTTCGTCAC
S 13 CGACTTCAGG
S 14 GTCCCTGACC
S 15 T
S 16 CGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCGCGAGTAGGCG
L 2 + 3 + 0M
L 9 - 10 - 0M
L 12 - 15 - 0M
L 13 + 14 + 0M
L 10 + 9 + 0M
L 4 - 3 - 0M
L 11 + 13 + 0M
L 8 + 10 + 0M
L 3 - 2 - 0M
L 4 + 8 + 0M
L 9 + 11 + 0M
L 15 - 14 - 0M
L 3 + 4 + 0M
L 13 - 15 + 0M
L 5 - 6 - 0M
L 6 - 4 - 0M
L 14 + 15 + 0M
L 10 - 12 - 0M
L 2 - 7 - 0M
L 1 + 2 + 0M
L 14 - 13 - 0M
L 7 - 9 - 0M
L 15 + 16 + 0M
L 2 + 5 - 0M
P ref 1+,2+,3+,4+,8+,10+,9+,11+,13+,14+,15+,16+ ,,,,,,,,,,,
P inv 1+,2+,5-,6-,4-,3-,2-,7-,9-,10-,12-,15-,14-,13-,15+,16+ ,,,,,,,,,,,,,,,
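As a quick check on the input, the following Python sketch lists the nodes that each path in the GFA above visits in both orientations. It is only an observation about the graph, not a diagnosis of the vg behaviour; the function names are illustrative, and the parser reads only the P lines and splits on any whitespace.

```python
# The two P lines from the GFA above (overlap column omitted).
GFA_PATH_LINES = [
    "P ref 1+,2+,3+,4+,8+,10+,9+,11+,13+,14+,15+,16+",
    "P inv 1+,2+,5-,6-,4-,3-,2-,7-,9-,10-,12-,15-,14-,13-,15+,16+",
]


def paths_from_gfa(lines):
    """Map each path name to its list of (node_id, orientation) steps."""
    paths = {}
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "P":
            # Each step looks like "12-": node ID followed by + or -.
            paths[fields[1]] = [(s[:-1], s[-1]) for s in fields[2].split(",")]
    return paths


def both_orientations(steps):
    """Return node IDs that appear with both + and - in one path."""
    seen = {}
    for node, orient in steps:
        seen.setdefault(node, set()).add(orient)
    return sorted((n for n, o in seen.items() if len(o) == 2), key=int)


result = {name: both_orientations(steps)
          for name, steps in paths_from_gfa(GFA_PATH_LINES).items()}
print(result)  # {'ref': [], 'inv': ['2', '15']}
```

The inv path enters the inverted region through node 2 and leaves it through node 15, traversing each of those two nodes once forward and once in reverse.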

The commands I used are:
vg convert -f -W graph.gfa > graph.vg
vg snarls graph.vg > snarl_output.snarls
vg view -R snarl_output.snarls > snarls.json
vg deconstruct -a -p ref graph.vg > snarls.vcf
(I also tried with -r vg_snarls_output.snarls.)

6. What does running vg version say?

vg version v1.73.0 "Ducky"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Using HTSlib headers 101990, library 1.19.1-29-g3cfe8769
Built by anovak@courtyard.gi.ucsc.edu
