Commit e65d222

feat(btree): Add BTree global index reader with async on-demand block loading (#229)
Implement BTree index file format support (block, footer, meta, SST file, var_len encoding) and global index scanner for evaluating predicates against BTree global indexes to produce row ID ranges.

Key features:
- Async on-demand data block reading via FileRead trait instead of loading entire file into memory
- Scanner-level BTreeIndexReader cache for reader reuse across evaluations
- AND predicate grouping by field_id to minimize file opens
- Between pattern detection (GtEq/Gt + LtEq/Lt merged into single range_query)
- Row ID predicate extraction from data predicates into row ranges
- Support for point lookup, range, IN, NOT IN, IS NULL, IS NOT NULL queries
- Zstd block compression/decompression support
- File-level pruning using BTreeIndexMeta (first_key, last_key, has_nulls)
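The Between pattern detection and file-level pruning described above can be illustrated with a minimal sketch. It is not the code from this commit: the `Cmp` enum, `FileMeta` struct, and the functions `merge_range`, `may_match`, and `may_contain_null` are hypothetical names, and keys are assumed to be plain `i64` values, whereas the real index most likely compares serialized keys.

```rust
use std::ops::Bound;

/// Hypothetical comparison predicates on one indexed field (not the crate's types).
#[derive(Debug, Clone, PartialEq)]
enum Cmp {
    Gt(i64),
    GtEq(i64),
    Lt(i64),
    LtEq(i64),
}

/// Hypothetical per-file metadata, mirroring the fields named in the commit message.
struct FileMeta {
    first_key: i64,
    last_key: i64,
    has_nulls: bool,
}

/// Pick the more restrictive of two lower bounds.
fn tighter_lower(a: Bound<i64>, b: Bound<i64>) -> Bound<i64> {
    use Bound::*;
    match (a, b) {
        (Unbounded, x) | (x, Unbounded) => x,
        (Included(x), Included(y)) => Included(x.max(y)),
        (Excluded(x), Excluded(y)) => Excluded(x.max(y)),
        (Included(i), Excluded(e)) | (Excluded(e), Included(i)) => {
            if i > e { Included(i) } else { Excluded(e) }
        }
    }
}

/// Pick the more restrictive of two upper bounds.
fn tighter_upper(a: Bound<i64>, b: Bound<i64>) -> Bound<i64> {
    use Bound::*;
    match (a, b) {
        (Unbounded, x) | (x, Unbounded) => x,
        (Included(x), Included(y)) => Included(x.min(y)),
        (Excluded(x), Excluded(y)) => Excluded(x.min(y)),
        (Included(i), Excluded(e)) | (Excluded(e), Included(i)) => {
            if i < e { Included(i) } else { Excluded(e) }
        }
    }
}

/// Fold an AND-ed group of comparisons on the same field into one range,
/// so a single range query can replace several separate lookups.
fn merge_range(preds: &[Cmp]) -> (Bound<i64>, Bound<i64>) {
    let mut lo = Bound::Unbounded;
    let mut hi = Bound::Unbounded;
    for p in preds {
        match *p {
            Cmp::Gt(v) => lo = tighter_lower(lo, Bound::Excluded(v)),
            Cmp::GtEq(v) => lo = tighter_lower(lo, Bound::Included(v)),
            Cmp::Lt(v) => hi = tighter_upper(hi, Bound::Excluded(v)),
            Cmp::LtEq(v) => hi = tighter_upper(hi, Bound::Included(v)),
        }
    }
    (lo, hi)
}

/// Return false when the file's [first_key, last_key] span cannot overlap the range,
/// which means the file can be skipped without opening it.
fn may_match(meta: &FileMeta, range: &(Bound<i64>, Bound<i64>)) -> bool {
    let entirely_below = match range.1 {
        Bound::Included(hi) => hi < meta.first_key,
        Bound::Excluded(hi) => hi <= meta.first_key,
        Bound::Unbounded => false,
    };
    let entirely_above = match range.0 {
        Bound::Included(lo) => lo > meta.last_key,
        Bound::Excluded(lo) => lo >= meta.last_key,
        Bound::Unbounded => false,
    };
    !(entirely_below || entirely_above)
}

/// An IS NULL predicate only needs files that recorded at least one null.
fn may_contain_null(meta: &FileMeta) -> bool {
    meta.has_nulls
}
```

Under these assumptions, `merge_range(&[Cmp::GtEq(10), Cmp::Lt(20)])` yields the half-open range from 10 (inclusive) to 20 (exclusive), and a file whose keys span 20 to 30 is pruned because no key in it can be below 20.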
1 parent dd71c58 commit e65d222

28 files changed

Lines changed: 4631 additions & 13 deletions

crates/paimon/Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -55,6 +55,8 @@ pretty_assertions = "1"
 serde_avro_fast = { version = "2.0.2", features = ["snappy", "zstandard"] }
 indexmap = "2.5.0"
 roaring = "0.11"
+crc32fast = "1"
+zstd = "0.13"
 arrow-array = { workspace = true }
 arrow-buffer = { workspace = true }
 arrow-cast = { workspace = true }
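The hunk above adds `crc32fast`, a crate that computes the standard CRC-32 (IEEE) checksum. This part of the diff does not show how the checksum is used, so the following is only a sketch under stated assumptions: `crc32_ieee` is a slow bitwise reference version of the same checksum written with the standard library alone, and `verify_block` assumes a hypothetical layout in which a block payload is followed by a 4-byte little-endian CRC trailer. Neither name comes from the commit.

```rust
/// Bitwise reference implementation of CRC-32 (IEEE), using the reflected
/// polynomial 0xEDB88320. Much slower than a table-driven or SIMD version,
/// but it produces the same checksum value.
fn crc32_ieee(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical block layout (an assumption, not taken from the commit):
/// payload bytes followed by a 4-byte little-endian CRC-32 of the payload.
/// Returns the payload when the checksum matches, otherwise None.
fn verify_block(block: &[u8]) -> Option<&[u8]> {
    if block.len() < 4 {
        return None;
    }
    let (payload, trailer) = block.split_at(block.len() - 4);
    let stored = u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]);
    if crc32_ieee(payload) == stored {
        Some(payload)
    } else {
        None
    }
}
```

A reader following this layout would call `verify_block` on each raw block before decompressing or parsing it, so that corruption is detected before the payload is interpreted.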