Commit 9fdafa1

update README.md
1 parent 3b70046 commit 9fdafa1

3 files changed

Lines changed: 14 additions & 10 deletions

README.md

Lines changed: 10 additions & 6 deletions
@@ -1,13 +1,17 @@
 # Genomics Extension for SQLite
 
-![build](https://github.com/mlin/GenomicSQLite/workflows/build/badge.svg?branch=main)
+**("GenomicSQLite")**
 
-(**GenomicSQLite** for short!) Adds to the [ubiquitous](https://www.sqlite.org/mostdeployed.html) embedded RDBMS:
+This [SQLite3 loadable extension](https://www.sqlite.org/loadext.html) adds features to the [ubiquitous](https://www.sqlite.org/mostdeployed.html) embedded RDBMS supporting applications in genome bioinformatics:
 
 * genomic range indexing for overlap queries & joins
-* streaming storage compression using [Zstandard](https://facebook.github.io/zstd/) (also available [standalone](https://github.com/mlin/sqlite_zstd_vfs))
-* pre-tuned settings for "omics" scale datasets
+* streaming storage compression (also available [standalone](https://github.com/mlin/sqlite_zstd_vfs))
+* pre-tuned settings for "big data"
+
+Notice: this project is not associated with the SQLite developers.
 
-Together, these make SQLite a [viable file format](https://www.sqlite.org/appfileformat.html) for storage, transport, and basic analysis of genomic data.
+## Under construction
+
+![build](https://github.com/mlin/GenomicSQLite/workflows/build/badge.svg?branch=main)
 
-Notice: this project is *not* associated with the SQLite developers.
+The extension isn't quite ready for general use. The repo is public to facilitate "soft launch" preparations.

loaders/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 # GenomicSQLite example loaders
 
-These programs exemplify loading common genomics formats into SQLite databases, with the Genomics Extension providing compression and genomic range indexing. They're used in the extension's test suite, and may perhaps become useful for new applications. Improvement/addition pull requests welcome!
+These programs exemplify loading common genomics formats into SQLite databases, with the Genomics Extension providing compression and genomic range indexing. They're used in the extension's test suite, and may perhaps become useful for new applications. Pull requests welcome!
 
-* `vcf_into_sqlite:`: loads VCF/gVCF/pVCF into a highly detailed schema representing all fields in SQL columns.
+* `vcf_into_sqlite:`: loads VCF/gVCF/pVCF into an exhaustively detailed schema, representing all fields in SQL columns.
 * `vcf_lines_into_sqlite`: more simply loads VCF with each text line stored alongside bare-essential genomic range columns for indexing.
 * `sam_into_sqlite`: loads SAM/BAM/CRAM with a main table for the alignment details and cross-referenced tables for QNAME, SEQ, QUAL, & tags.
 
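
For readers unfamiliar with the "text line plus genomic range columns" layout that the `vcf_lines_into_sqlite` entry describes, the idea can be sketched roughly as follows. This is a minimal illustration using only Python's standard `sqlite3` module, not the loader's actual code or schema: the table name `vcf_lines`, the columns `chrom`, `start_pos`, `end_pos`, `line`, and both function names are hypothetical. It also uses an ordinary B-tree index in place of the extension's genomic range indexing, and it derives the end position from the REF length only (ignoring any `END` value in INFO).

```python
import sqlite3


def load_vcf_lines(conn, vcf_text):
    """Store each VCF record line next to minimal range columns.

    Hypothetical schema for illustration; not the loader's real one.
    Positions are converted to zero-based, half-open [start_pos, end_pos).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vcf_lines ("
        "chrom TEXT NOT NULL, start_pos INTEGER NOT NULL, "
        "end_pos INTEGER NOT NULL, line TEXT NOT NULL)"
    )
    rows = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        start = pos - 1  # VCF POS is one-based
        rows.append((chrom, start, start + len(ref), line))
    conn.executemany("INSERT INTO vcf_lines VALUES (?, ?, ?, ?)", rows)
    # Plain B-tree index: a stand-in, NOT the extension's range index.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS vcf_lines_range "
        "ON vcf_lines (chrom, start_pos)"
    )
    return len(rows)


def overlapping(conn, chrom, qbeg, qend):
    """Return stored lines whose range overlaps half-open [qbeg, qend)."""
    cur = conn.execute(
        "SELECT line FROM vcf_lines "
        "WHERE chrom = ? AND start_pos < ? AND end_pos > ? "
        "ORDER BY start_pos",
        (chrom, qend, qbeg),
    )
    return [row[0] for row in cur]
```

With a connection from `sqlite3.connect(":memory:")`, calling `load_vcf_lines(conn, text)` and then `overlapping(conn, "chr1", 201, 300)` would return the original text lines of the records that intersect that interval. Keeping the raw line intact means no VCF field is lost, while the three range columns are all the query needs.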

loaders/vcf_into_sqlite.cc

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 /*
- * vcf_into_sqlite: load VCF/gVCF/pVCF into a GenomicSQLite database with a detailed schema
- * unpacking genotypes and all QC fields
+ * vcf_into_sqlite: load VCF/gVCF/pVCF into a GenomicSQLite database with an exhaustively detailed
+ * schema, representing all fields in SQL columns.
  * - if there are individual genotypes:
  * - they & FORMAT fields go into a separate table keyed by (variant,sample)
  * - sample names go into a dimension table referred to by integer ID elsewhere
