reference panel chr1 is not fully imported  #32

@roman-tremmel

I downloaded the hg19 reference panel with the parameters All and vcf, like this:

get1KGGRCh37.sh All 20 vcf

Then I used the command line to score the GWAS data. Before scoring, the reference panel is imported.

REF=~/PascalX/resource/All.1KG.GRCh37
GENE=~/PascalX/resource/gene_GRCh37.tsv
pascalx  -g False -w 10000 -m 0.05 -n True -p 20 ${GENE} ${REF} ${OUT} genescoring -sh False -cr 0 -cp 1 ${IN}

This command produced All.1KG.GRCh37.chr*.db files for all chromosomes, which can then be used for scoring. For chr1, however, the following error interrupts the import function after 2 hours. Of note, the same error occurs when using a Python script instead of the command-line function.

Reference panel data not imported. Trying to import...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "~/anaconda3/envs/pascal/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 270, in _import_reference_thread_vcf
    counter[int(geno[2])] += 1
ValueError: invalid literal for int() with base 10: '|'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_pascal.py", line 4, in <module>
    Scorer.load_refpanel("~/PascalX/resource/All.1KG.GRCh37",parallel=10, chrlist=[1])
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/genescorer.py", line 95, in load_refpanel
    self._ref.set_refpanel(filename=filename,parallel=parallel,keepfile=keepfile,qualityT=qualityT,SNPonly=SNPonly,chrlist=chrlist)
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 120, in set_refpanel
    self._import_reference(chrs=NF,parallel=parallel,keepfile=keepfile,qualityT=qualityT,SNPonly=SNPonly,regEx=regEx,nobar=nobar)
  File "~/anaconda3/envs/pascal/lib/python3.9/site-packages/PascalX-0.0.4-py3.9-linux-x86_64.egg/PascalX/refpanel.py", line 365, in _import_reference
    r.get()
  File "~/anaconda3/envs/pascal/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
ValueError: invalid literal for int() with base 10: '|'
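
One possible reading of the ValueError, not confirmed against the PascalX source: the failing line indexes the genotype string by character position (`geno[2]`), which works for a field like `0|1` but lands on the separator when an allele index has two digits, as in `10|1`. The sketch below scans a VCF for sample fields that do not match the single-character `a|b` form, so the offending records can be located. The function names `is_simple_genotype` and `find_unusual_genotypes` are hypothetical helpers written for this illustration, and the "single-character alleles" expectation is an assumption inferred from the traceback.

```python
import gzip
import re

# Assumption: the importer expects genotypes of the form "a|b" (or "a/b")
# where each allele is a single character, e.g. "0|1" or ".|.".
SIMPLE_GT = re.compile(r"^[0-9.][|/][0-9.]$")


def is_simple_genotype(sample_field):
    """Return True if the GT part of a VCF sample field looks like 'a|b'."""
    # The genotype is taken to be the first colon-separated entry.
    gt = sample_field.split(":", 1)[0]
    return bool(SIMPLE_GT.match(gt))


def find_unusual_genotypes(vcf_path, limit=10):
    """Collect up to `limit` records containing a non-'a|b' genotype.

    Returns a list of (chrom, pos, id, first_offending_sample_field).
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    hits = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            # Columns 0-8 are the fixed VCF columns; samples start at 9.
            for sample in fields[9:]:
                if not is_simple_genotype(sample):
                    hits.append((fields[0], fields[1], fields[2], sample))
                    break  # one report per record is enough
            if len(hits) >= limit:
                break
    return hits
```

Running `find_unusual_genotypes` on the chr1 VCF written by the download script (its exact file name is not shown here) would list the first few records that could trip the character-based indexing. If it returns an empty list, this explanation does not apply and the cause lies elsewhere.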
