Commit d54c954

update compression tuning docs
1 parent 51dc522 commit d54c954

2 files changed

Lines changed: 9 additions & 6 deletions

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ endif()
  FetchContent_Declare(
  sqlite_zstd_vfs
  GIT_REPOSITORY https://github.com/mlin/sqlite_zstd_vfs.git
- GIT_TAG e1624c6
+ GIT_TAG 5b67de8
  )
  FetchContent_MakeAvailable(sqlite_zstd_vfs)
  include_directories(${sqlite_zstd_vfs_SOURCE_DIR}/src)

docs/guide.md

Lines changed: 8 additions & 5 deletions
@@ -95,13 +95,16 @@ Afterwards, all the usual SQLite3 API operations are available through the retur

  The aforementioned tuned settings can be further adjusted. Some bindings (e.g. C/C++) receive these options as the text of a JSON object with keys and values, while others admit individual arguments to the Open routine.

+ * **threads = -1**: worker thread budget for compression, sort, and prefetching/decompression operations; -1 to match up to 8 host processors.
+ * **inner_page_KiB = 16**: [SQLite page size](https://www.sqlite.org/pragma.html#pragma_page_size) for new databases, any of {1, 2, 4, 8, 16, 32, 64}. Larger pages are more compressible, but increase random I/O cost.
+ * **outer_page_KiB = 32**: compression layer page size for new databases, any of {1, 2, 4, 8, 16, 32, 64}. <br/>
+ The default configuration (inner_page_KiB, outer_page_KiB) = (16,32) balances random access speed and compression. Try setting them to (8,16) to prioritize random access, or (64,2) to prioritize compression <small>(if compressed database will be <4TB)</small>.
+ * **zstd_level = 6**: Zstandard compression level for newly written data (-7 to 22)
  * **unsafe_load = false**: set true to disable write transaction safety (see advice on bulk-loading below). <br/>
- **❗ A database opened unsafely is liable to be corrupted if the application fails or crashes.**
+ **❗ A database written to unsafely is liable to be corrupted if the application crashes, or if there's a concurrent attempt to modify it.**
  * **page_cache_MiB = 1024**: database cache size. Use a large cache to avoid repeated decompression in successive and complex queries.
- * **threads = -1**: worker thread budget for compression and sort operations; -1 to match up to 8 host processors.
- * **zstd_level = 6**: Zstandard compression level for newly written data (-5 to 22)
- * **inner_page_KiB = 16**: [SQLite page size](https://www.sqlite.org/pragma.html#pragma_page_size) for new databases, any of {1, 2, 4, 8, 16, 32, 64}. Larger pages are more compressible, but increase random I/O amplification.
- * **outer_page_KiB = 32**: compression layer page size for new databases, any of {1, 2, 4, 8, 16, 32, 64}. The default configuration (inner_page_KiB, outer_page_KiB) = (16,32) balances access speed and compression. Try setting them to (8,16) to prioritize access speed, or (64,1) to prioritize compression.
+ * **immutable = false**: set true to slightly reduce overhead reading from a database file that won't be modified by this or any concurrent program, guaranteed.
+ * **force_prefetch = false**: set true to enable background prefetching/decompression even if inner_page_KiB &lt; 16 (enabled by default only &ge; that, as it can be counterproductive below; YMMV)

  The connection's potential memory usage can usually be budgeted as roughly the page cache size, plus the size of any uncommitted write transaction (unless unsafe_load), plus some safety factor. ❗However, this can *multiply by (threads+1)* during queries whose results are at least that large and must be re-sorted. That includes index creation, when the indexed columns total such size.

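As a rough illustration of the option list and memory note in the docs/guide.md hunk above, the Python sketch below builds the JSON text form of the options from the defaults listed in the new version and estimates the memory budget that the last paragraph describes. The helper names `build_config_json` and `estimate_peak_memory_mib`, the range checks, and the multiplicative `safety_factor` are assumptions made for illustration; they are not part of the library's API, and the guide only says "some safety factor" without fixing its form.

```python
import json

# Defaults as listed in the updated guide text (new side of the diff).
DEFAULTS = {
    "threads": -1,
    "inner_page_KiB": 16,
    "outer_page_KiB": 32,
    "zstd_level": 6,
    "unsafe_load": False,
    "page_cache_MiB": 1024,
    "immutable": False,
    "force_prefetch": False,
}
PAGE_SIZES_KIB = {1, 2, 4, 8, 16, 32, 64}


def build_config_json(**overrides):
    """Hypothetical helper: merge overrides into the defaults, validate, return JSON text."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    cfg = {**DEFAULTS, **overrides}
    for key in ("inner_page_KiB", "outer_page_KiB"):
        if cfg[key] not in PAGE_SIZES_KIB:
            raise ValueError(f"{key} must be one of {sorted(PAGE_SIZES_KIB)}")
    # Range taken from the new text of the guide (-7 to 22).
    if not -7 <= cfg["zstd_level"] <= 22:
        raise ValueError("zstd_level must be between -7 and 22")
    return json.dumps(cfg)


def estimate_peak_memory_mib(cfg, uncommitted_txn_mib, effective_threads,
                             safety_factor=1.25, large_resort=False):
    """Hypothetical estimate following the guide's budgeting note.

    effective_threads is the resolved thread count (the guide says -1 matches
    up to 8 host processors); safety_factor is an assumed multiplier.
    """
    base = cfg["page_cache_MiB"]
    if not cfg["unsafe_load"]:
        # Uncommitted write transactions count toward the budget unless unsafe_load.
        base += uncommitted_txn_mib
    base *= safety_factor
    if large_resort:
        # Large re-sorted results (including index creation) can multiply usage.
        base *= effective_threads + 1
    return base


if __name__ == "__main__":
    text = build_config_json(inner_page_KiB=8, outer_page_KiB=16)
    print(text)
    print(estimate_peak_memory_mib(json.loads(text), 256, 8, large_resort=True))
```

The JSON text is the form the guide says C/C++ style bindings receive; bindings that take individual arguments would pass the same keys directly.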
