Commit a1c947c

BackgroundProducer adaptive sleep
1 parent 218f26d commit a1c947c

2 files changed

Lines changed: 36 additions & 7 deletions

README.md

Lines changed: 26 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Genomics Extension for SQLite

-**("GenomicSQLite")**
+## ("GenomicSQLite")

 This [SQLite3 loadable extension](https://www.sqlite.org/loadext.html) adds features to the [ubiquitous](https://www.sqlite.org/mostdeployed.html) embedded RDBMS supporting applications in genome bioinformatics:

@@ -10,8 +10,32 @@ This [SQLite3 loadable extension](https://www.sqlite.org/loadext.html) adds feat

 Notice: this project is not associated with the SQLite developers.

+### Use cases
+
+The extension makes SQLite an efficient foundation for:
+
+1. Integrative genomics data warehouse
+
+    * BED, GFF/GTF, FASTA, FASTQ, SAM, VCF, ...
+
+    * One file, zero administration, portable between platforms and languages
+
+2. Slicing & basic analysis with indexed SQL queries, joins, & aggregations
+
+3. Transactional storage engine for API services, incremental reanalysis, real-time basecalling & metagenomics, ...
+
+4. Experimental new data models, before dedicated storage format & tooling are warranted, if ever.
+
+### Contraindications
+
+1. Huge numerical arrays: see [HDF5](https://www.hdfgroup.org/solutions/hdf5/), [Zarr](https://zarr.readthedocs.io/en/stable/), [Parquet](https://parquet.apache.org/), [Arrow](https://arrow.apache.org/). <small>SQLite's [BLOB I/O](https://www.sqlite.org/c3ref/blob_open.html) leaves the door open for mash-ups!</small>
+
+2. Parallel SQL analytics / OLAP: see [Spark](https://spark.apache.org/), [DuckDB](https://duckdb.org/), many commercial products. <small>Some bases can be covered with a sharding harness for a pool of threads with their own SQLite connections...</small>
+
+3. Streaming: SQLite I/O, while often highly sequential in practice, relies on randomly seeking throughout the database file.
+
 ## Under construction

 ![build](https://github.com/mlin/GenomicSQLite/workflows/build/badge.svg?branch=main)

-The extension isn't quite ready for general use. The repo is public to facilitate "soft launch" preparations.
+The extension isn't quite ready for general use. The repo is public while we work on packaging and documentation.

loaders/common.hpp

Lines changed: 10 additions & 5 deletions
@@ -169,11 +169,15 @@ template <class Item> class BackgroundProducer {
             if (ok) {
                 p_.store(++p, memory_order_release);
                 if (p - max(c_.load(memory_order_acquire), 1LL) == R_ - 1) {
+                    // Ring is full; enter semi-busy wait with adaptive sleep. Goal is to keep the
+                    // consumer well-fed, without excessively wasteful busy-loop, using atomics for
+                    // coordination. Underlying assumption: producer is usually faster than the
+                    // consumer, while the ring provides a buffer if it has the occasional hiccup.
                     auto t_spin = chrono::high_resolution_clock::now();
+                    long long spin = 0;
                     do {
-                        // assumption -- producer is usually faster than the consumer, and ringsize
-                        // provides a buffer if that's occasionally not the case
-                        this_thread::sleep_for(chrono::nanoseconds(10000));
+                        this_thread::sleep_for(
+                            chrono::nanoseconds(10000 + 990000 * min(spin++, 100LL) / 100));
                     } while (!stop_.load(memory_order_relaxed) &&
                              (p - max(c_.load(memory_order_acquire), 1LL) == R_ - 1));
                     p_blocked_ += chrono::high_resolution_clock::now() - t_spin;
@@ -194,8 +198,8 @@ template <class Item> class BackgroundProducer {

     virtual ~BackgroundProducer() { abort(); }

-    // advance to next item for consumption, return false when item stream has ended successfully,
-    // or throw an exception.
+    // advance to next item for consumption & return true, return false when item stream has ended
+    // with success, or throw an exception.
     bool next() {
         if (!worker_) {
             while (ring_.size() < R_) {
@@ -205,6 +209,7 @@ template <class Item> class BackgroundProducer {
         }
         long long p = p_.load(memory_order_acquire), c = c_.load(memory_order_relaxed);
         if (c == p) {
+            // Ring is empty; enter busy wait
             auto t_spin = chrono::high_resolution_clock::now();
             while (c == p && !stop_.load(memory_order_relaxed)) {
                 this_thread::yield();
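
To make the new sleep schedule in the first loaders/common.hpp hunk easier to reason about, here is a minimal standalone sketch of the same arithmetic. The helper names `adaptive_sleep_interval` and `total_requested_sleep` are hypothetical and do not appear in the commit; only the expression `10000 + 990000 * min(spin, 100LL) / 100` is taken from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of loaders/common.hpp): the interval passed to
// sleep_for on the spin-th consecutive iteration of the ring-full wait loop.
inline std::chrono::nanoseconds adaptive_sleep_interval(long long spin) {
    assert(spin >= 0);
    // 10 us base plus a linear ramp of 990 us spread over the first 100
    // iterations; from iteration 100 onward the interval stays at 1 ms.
    return std::chrono::nanoseconds(10000 + 990000 * std::min(spin, 100LL) / 100);
}

// Hypothetical helper: sum of the requested intervals over the first
// `iterations` passes through the loop (actual sleeps may run longer).
inline std::chrono::nanoseconds total_requested_sleep(long long iterations) {
    std::chrono::nanoseconds total{0};
    for (long long spin = 0; spin < iterations; ++spin) {
        total += adaptive_sleep_interval(spin);
    }
    return total;
}
```

Under these assumptions the interval starts at 10 µs, grows by 9.9 µs per iteration, and reaches the 1 ms cap after roughly 50 ms of requested sleep. A briefly stalled consumer is therefore still polled at fine granularity, while a long stall costs the producer about one wake-up per millisecond rather than one every 10 µs.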
