Genome Nexus VEP is a small REST wrapper around the Ensembl Variant Effect Predictor (VEP) command line interface. It exposes the following endpoints to interface with VEP:
GET /vep/human/hgvs/{variant}
POST /vep/human/hgvs
Each endpoint expects variant(s) to be in HGVS format. See the implementation here.
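As a rough sketch of how a client might address these two endpoints: the base URL `http://localhost:8080`, the helper names, and the JSON-array body for the POST endpoint are all assumptions made for illustration and are not taken from this README, so check the implementation for the actual request format. The one point worth showing is that HGVS strings contain characters such as `:` and `>` that should be percent-encoded when placed in a URL path.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of a locally running Genome Nexus VEP instance (hypothetical).
BASE_URL = "http://localhost:8080"


def hgvs_get_url(variant, base_url=BASE_URL):
    """Build the GET URL for one HGVS variant, percent-encoding the path segment."""
    return f"{base_url}/vep/human/hgvs/{urllib.parse.quote(variant, safe='')}"


def hgvs_post_request(variants, base_url=BASE_URL):
    """Build (but do not send) a POST request for several variants.

    Assumption: the request body is a JSON array of HGVS strings.
    """
    return urllib.request.Request(
        f"{base_url}/vep/human/hgvs",
        data=json.dumps(list(variants)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    print(hgvs_get_url("7:g.55249071T>C"))
    request = hgvs_post_request(["7:g.55249071T>C"])
    print(request.get_method(), request.full_url)
    # To send it against a running server:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Building the request separately from sending it keeps the URL and body easy to inspect before a server is available.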
Make sure you have the following installed:
- Java version: 21
- Maven version: >= 3.6.3
- Docker
Database mode is the preferred way to use Genome Nexus VEP and provides the same functionality as the public Ensembl REST API.
- Download the core database for the Ensembl data version you wish to install. The URL containing the data files should be of the format `https://ftp.ensembl.org/pub/release-XXX/mysql/homo_sapiens_core_XXX_<ASSEMBLY_VERSION>/`.
- Follow the installation instructions to set up your database.
- Point the VEP at your database in your application properties.
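The URL pattern for the core database can be filled in mechanically. The following is a minimal sketch; the function name is hypothetical, and the release and assembly numbers in the usage line are illustrative values only, not a recommendation of a particular data version.

```python
def core_database_url(release, assembly_version):
    """Fill in the Ensembl FTP URL pattern for the human core database.

    `release` replaces XXX and `assembly_version` replaces <ASSEMBLY_VERSION>
    in the pattern given above.
    """
    name = f"homo_sapiens_core_{release}_{assembly_version}"
    return f"https://ftp.ensembl.org/pub/release-{release}/mysql/{name}/"


if __name__ == "__main__":
    # Illustrative values only; use the release and assembly you actually install.
    print(core_database_url(112, 38))
```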
- Download the `homo_sapiens_core_XXX_<ASSEMBLY_VERSION>` SQL files from the Genome Nexus S3 Bucket.
- Make sure you change your `my.cnf` file to support a larger packet size:
      [mysqld]
      # Other configurations....
      max_allowed_packet=1G
- If you make a configuration change, restart the MySQL server.
- Add the data to the database:
      mysql -u <username> -p homo_sapiens_core_XXX_<ASSEMBLY_VERSION> < homo_sapiens_core_XXX_<ASSEMBLY_VERSION>.sql
Cache mode is intended for users who cannot support the database. However, the functionality of VEP is limited if you choose to use cache mode. You will not be able to annotate variants whose coordinates are non-genomic, and you will not be able to annotate HGVSg inversions and duplications.
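To make the cache-mode restrictions concrete, here is a small heuristic pre-check that a client could run before submitting a variant. It is only a sketch: the function name is hypothetical, and the string matching (a `:g.` prefix for genomic coordinates, `inv` or `dup` in the description) is a simplification of HGVS syntax and not how Genome Nexus VEP or VEP itself decides what it can annotate.

```python
import re

# "<reference>:g.<description>" is treated as an HGVS variant in genomic coordinates.
_GENOMIC = re.compile(r"^[^:]+:g\..+$")


def cache_mode_supported(variant):
    """Heuristically guess whether a variant falls within the cache-mode limits.

    Returns False for non-genomic coordinates (e.g. c. or p. notation) and
    for genomic inversions or duplications.
    """
    v = variant.strip()
    if not _GENOMIC.match(v):
        return False
    description = v.split(":g.", 1)[1]
    return re.search(r"inv|dup", description) is None


if __name__ == "__main__":
    for example in ["7:g.55249071T>C", "ENST00000275493.2:c.2369C>T", "7:g.100_200dup"]:
        print(example, cache_mode_supported(example))
```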
- Download the VEP cache file and FASTA file for the Ensembl data version you wish to install. Follow Ensembl's installation instructions.
- Place both your VEP cache file and the FASTA file in the plugin-data directory.
- Set the `fasta-filename` property in your application properties to the name of the installed FASTA file and set `mode` to cache.
- Download the SQLite database corresponding to the data version pointed to by your application properties. The URL containing the database should be of the format `https://ftp.ensembl.org/pub/release-XXX/`.
- Download the PolyPhen_SIFT Perl Module corresponding to the data version pointed to by your application properties. The URL containing the file should be of the format `https://github.com/Ensembl/VEP_plugins/blob/release/XXX/PolyPhen_SIFT.pm`.
- Place both your installed database and the PolyPhen_SIFT Perl Module in the plugin-data directory.
- Set the `polyphen-sift-filename` property in your application properties to the name of the installed database file.
- Download the prediction score file corresponding to your assembly version (`AlphaMissense_hg19.tsv.gz` for GRCh37 or `AlphaMissense_hg38.tsv.gz` for GRCh38).
- Place the file in the plugin-data directory.
- Run `tabix -s 1 -b 2 -e 2 -f <PREDICTION_SCORE_FILE>`.
- Set the `alpha-missense-filename` property in your application properties to the name of the installed file (not the generated tabix index).
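The AlphaMissense steps above can be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, `plugin-data` is assumed to be reachable relative to the current working directory, and `tabix` is assumed to be on the `PATH`. The tabix arguments are exactly those listed in the step above.

```python
import subprocess

# Score file name per assembly, as listed in the steps above.
SCORE_FILES = {
    "GRCh37": "AlphaMissense_hg19.tsv.gz",
    "GRCh38": "AlphaMissense_hg38.tsv.gz",
}


def tabix_command(score_file):
    """Return the tabix invocation from the README as an argument list."""
    return ["tabix", "-s", "1", "-b", "2", "-e", "2", "-f", score_file]


def index_score_file(assembly, plugin_data_dir="plugin-data"):
    """Index the score file in place and return the name to configure.

    Assumes the score file was already downloaded into `plugin_data_dir`.
    The returned name is the score file itself, not the generated index.
    """
    score_file = SCORE_FILES[assembly]
    subprocess.run(tabix_command(score_file), cwd=plugin_data_dir, check=True)
    return score_file


if __name__ == "__main__":
    print(" ".join(tabix_command(SCORE_FILES["GRCh38"])))
```

The value returned by `index_score_file` is what would go into `alpha-missense-filename`.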
- Run `./scripts/init_vep.sh <tag for ensemblorg/ensembl-vep image>` to install and run a VEP Docker image, specifying the tag you wish to use. This will also generate a script to be used by the application, `./scripts/vep`, which should not be modified.
  - To check that the VEP command is working, run the following:
        ./scripts/vep \
          --database \
          --host=host.docker.internal \
          --port=3306 \
          --user=<db-username> \
          --password=<db-password> \
          --fork=1 \
          --format=hgvs \
          --input_data="7:g.55249071T>C" \
          --output_file=STDOUT \
          --warning_file=STDERR \
          --everything \
          --hgvsg \
          --no_stats \
          --xref_refseq \
          --json
- Set your VEP configuration in application-dev.yaml.
  - Make sure you have `host.docker.internal` set as the VEP host (if running `./scripts/vep`).
- Run `mvn spring-boot:run`.
- Make sure the VEP version is correct in the Dockerfile.
- Set your VEP configuration in application-prod.yaml. IMPORTANT: Make sure `vep.version` is the same as the version used in the Dockerfile. This version will be attached to all responses from the server.
- Run `docker build <DOCKER_ARGS> .` to build the production image.
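Because a mismatch between `vep.version` and the Dockerfile is easy to miss, a small pre-build check may help. The sketch below rests on assumptions that this README does not confirm: that the Dockerfile pulls VEP with a `FROM ensemblorg/ensembl-vep:<tag>` line, and that the tag is either the version itself or the version prefixed with `release_`. The function names are hypothetical; adjust the comparison to whatever convention the Dockerfile actually uses.

```python
import re

# Assumption: the Dockerfile contains a line such as
#   FROM ensemblorg/ensembl-vep:<tag>
_FROM_VEP = re.compile(r"^\s*FROM\s+ensemblorg/ensembl-vep:(\S+)", re.IGNORECASE | re.MULTILINE)


def image_tag(dockerfile_text):
    """Return the ensembl-vep image tag from Dockerfile text, or None if absent."""
    match = _FROM_VEP.search(dockerfile_text)
    return match.group(1) if match else None


def version_matches(dockerfile_text, vep_version):
    """Compare the configured vep.version with the Dockerfile image tag.

    Assumption: a tag of the form "release_<version>" counts as a match.
    """
    tag = image_tag(dockerfile_text)
    if tag is None:
        return False
    prefix = "release_"
    bare = tag[len(prefix):] if tag.startswith(prefix) else tag
    return bare == str(vep_version)


if __name__ == "__main__":
    # Illustrative Dockerfile content and version, not taken from the repository.
    sample = "FROM ensemblorg/ensembl-vep:release_112.0\n"
    print(image_tag(sample), version_matches(sample, "112.0"))
```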