Column-oriented analytics inside SQLite.
Sub-second OLAP queries on local files — aggregations, scans, and time-series workloads at warehouse speed, without a warehouse.
Free managed instance → · Docs · Website · Blog
Data:
Vector ·
Sync ·
Columnar ·
JS
AI:
AI ·
Agent ·
Memory ·
MCP
Need analytics at cloud scale? SQLite-Columnar runs OLAP locally; pair it with SQLite Cloud to persist large datasets, run queries across regions, and share dashboards. Free tier available.
sqlite-columnar brings column-oriented analytics to SQLite as a self-contained
loadable extension. It lets applications keep the operational simplicity of
SQLite while adding a storage and execution path built for analytical scans,
aggregations, and grouped summaries over wide datasets.
It does not patch SQLite's pager, btree, parser, VDBE, or shell. Build it as a normal extension, load it into SQLite, and create columnar virtual tables for the parts of your application that behave more like analytics than OLTP.
Traditional SQLite tables are row-oriented, which is excellent for point lookups, small updates, and transactional application state. Analytical workloads are different: they often read a few columns across many rows, compute aggregates, group by dimensions, and filter by ranges. In those cases, reading entire rows means paying I/O and CPU cost for data the query never uses.
sqlite-columnar stores each column separately, tracks chunk-level metadata,
and provides specialized aggregate helpers that avoid generic row
materialization for common analytical queries.
On the included 10 million row variance benchmark, sqlite-columnar shows
large median speedups over standard row-oriented SQLite for operations that
benefit from columnar layout and precomputed metadata:
sum(v1)withcolumnar_sum: 130,583x fasteravg(v3)withcolumnar_avg: 129,317x fastercount(v1)withcolumnar_count: 125,194x faster- grouped
sumby dimension: 6.04x faster - grouped
countby dimension: 14.13x faster - grouped
sum/avg/countby dimension: 6.42x faster - clustered range filter on
ts: 248.89x faster - clustered range filter plus grouped
sum/avg/count: 273.20x faster
These numbers are workload-specific. They are strongest when queries scan a small subset of columns, use aggregate metadata, group over low-cardinality dimensions, or filter on clustered/range-friendly columns. See BENCHMARK.md for the full dataset, commands, timings, and interpretation.
sqlite-columnar is useful when an embedded application needs analytical
queries without moving data into a separate database server.
Good fits include:
- embedded dashboards over local event, telemetry, or product analytics data
- time-series rollups where queries filter by timestamp ranges
- IoT and edge analytics over wide sensor records
- desktop or mobile apps with local reporting and summary views
- feature stores or ML preprocessing jobs that scan a few feature columns at a time
- audit logs and observability data where users aggregate by service, region, status, or time bucket
- SaaS tenant-local analytics where a single-file SQLite database is still the preferred deployment model
- ETL validation workloads that repeatedly compute counts, sums, min/max, and grouped quality checks
Row-oriented SQLite remains the better default for highly transactional
workloads, point lookups, and frequent single-row updates. sqlite-columnar is
intended for the analytical tables in the same application.
Each columnar virtual table owns shadow tables for rowids, column values, global stats, chunk zone maps, dirty chunks, and table-level metadata.
columnar_analyze() builds the metadata used by specialized analytical
functions. After the initial bootstrap, analyze is incremental: inserts,
updates, and deletes mark touched chunks dirty, and later analyze calls rebuild
only those chunks. If metadata says stats are valid and there are no dirty
chunks, analyze returns immediately.
Range-filtered helpers use chunk min/max summaries to skip rowid ranges that cannot match a filter. Grouped helpers perform hash aggregation in C over only the required column shadow tables.
From this directory:
makeBy default the build uses the bundled sqlite/ directory for sqlite3ext.h.
To build against a different SQLite checkout or amalgamation directory, pass
SQLITE_SRC:
make SQLITE_SRC=/path/to/sqliteThis produces columnar.dylib on macOS or columnar.so on Linux.
Tagged releases build loadable extension binaries for Linux, Linux musl,
macOS, Windows, Android, iOS, and iOS Simulator. macOS release assets are
Developer ID signed and notarized ZIP archives. Other platforms are published
as release archives. Each asset contains the platform binary plus README.md,
API.md, BENCHMARK.md, and GIT_COMMIT. Release assets also include
SHA256SUMS for archive verification.
.load ./columnar
CREATE VIRTUAL TABLE sales USING columnar(
id INTEGER,
region TEXT,
amount REAL
);
INSERT INTO sales VALUES
(1, 'north', 10.0),
(2, 'north', 20.0),
(3, 'south', 5.0);
SELECT columnar_analyze('sales');
SELECT columnar_sum('sales', 'amount');
SELECT k, "sum", "avg", "count"
FROM columnar_group_sum_avg_count('sales', 'region', 'amount')
ORDER BY k;See API.md for the complete SQL API reference with examples for every function and table-valued helper.
The benchmark suite is built against the bundled sqlite/sqlite3.c by default:
make benchmarks
build/columnar-analytics-bench ./columnar 10000000 256BENCHMARK.md documents the main analytical benchmark: schema, dataset shape, load/analyze costs, storage size, query timings, speedups, and the interpretation of where columnar storage helps most. Use it as the baseline for evaluating changes and for understanding which query patterns benefit from this extension.
Use make smoke-bench for small correctness-oriented benchmark runs.
Use make variance-bench to run the repeatable performance-variance suite. It
builds multiple deterministic datasets, warms each query once, repeats each
measurement, verifies row-store/columnar result hashes, and reports median and
p95 timings:
make variance-bench VARIANCE_REPEATS=9 \
VARIANCE_DATASETS="small:10000:64 medium:50000:128 wide:50000:512"make test SQLITE3=/path/to/sqlite3The test target runs the SQL smoke suite plus a native robustness suite. The robustness suite checks rollback and savepoint behavior, simulated process death during an uncommitted transaction, unusual table/column names and mixed SQLite storage classes, and automatic result equivalence between each specialized columnar query helper and the matching ordinary SQLite query.
This project is licensed under the Elastic License 2.0. For production or managed service use, contact SQLite Cloud, Inc for a commercial license.
Don't want to run analytics infrastructure yourself? SQLite Cloud is the managed companion to SQLite-Columnar — local-first analytics with cloud persistence, multi-region replication, and shared dashboards.
SQLite-Columnar is one piece of a larger ecosystem that turns SQLite into a runtime for intelligent, distributed data:
Data layer
- sqlite-vector — ANN vector search inside SQLite
- sqlite-sync — Offline-first CRDT sync across devices
- sqlite-columnar — Column-oriented analytics for OLAP queries (you are here)
- sqlite-js — Custom SQLite functions written in JavaScript
AI layer
- sqlite-ai — On-device LLM inference and embeddings
- sqlite-agent — Autonomous AI agents running inside SQLite
- sqlite-memory — Persistent, searchable memory for agents
- sqlite-mcp — Call MCP tools directly from SQL queries
Managed platform
- SQLite Cloud — Hosted SQLite with sync, auth, edge functions, and analytics. Free tier →
Built by SQLite AI. Questions? Contact us.