
Commit b52ff23

Rename ParallelStoreReader to BlockStoreReader (#44)
* Rename ParallelStoreReader to BlockStoreReader
1 parent 4d32d0a commit b52ff23

13 files changed

Lines changed: 320 additions & 234 deletions

File tree

docs/api/readers.md

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 ::: obspec_utils.readers.BufferedStoreReader
 ::: obspec_utils.readers.EagerStoreReader
-::: obspec_utils.readers.ParallelStoreReader
+::: obspec_utils.readers.BlockStoreReader

docs/design/caching.md

Lines changed: 19 additions & 19 deletions
@@ -75,7 +75,7 @@ During data access:

 **Store-level caching is appropriate here because:**

-- Chunks may be re-read by different computations
+- Blocks may be re-read by different computations
 - Cache should be shared across all consumers of the store
 - Lifecycle is independent of any single reader

@@ -102,7 +102,7 @@ reader.close()

 **Characteristics:**

-- Fetches file using parallel `get_ranges()` for speed
+- Fetches file using concurrent `get_ranges()` for speed
 - Caches in `BytesIO` buffer
 - Cache is isolated to this reader instance
 - Memory freed on `close()` or context manager exit
@@ -113,26 +113,26 @@ reader.close()
 - Small-to-medium files that fit in memory
 - When you'll read most of the file anyway

-### Reader-Level: ParallelStoreReader
+### Reader-Level: BlockStoreReader

-`ParallelStoreReader` uses chunk-based LRU caching:
+`BlockStoreReader` uses block-based LRU caching:

 ```python
-from obspec_utils.readers import ParallelStoreReader
+from obspec_utils.readers import BlockStoreReader

-reader = ParallelStoreReader(
+reader = BlockStoreReader(
     store, "file.nc",
-    chunk_size=256 * 1024,  # 256 KB chunks
-    max_cached_chunks=64,   # Up to 64 chunks cached
+    block_size=256 * 1024,  # 256 KB blocks
+    max_cached_blocks=64,   # Up to 64 blocks cached
 )

-# Chunks fetched on demand via get_ranges()
+# Blocks fetched on demand via get_ranges()
 data = reader.read(1000)
 ```

 **Characteristics:**

-- Bounded memory usage: `chunk_size * max_cached_chunks`
+- Bounded memory usage: `block_size * max_cached_blocks`
 - LRU eviction when cache is full
 - Good for sparse/random access patterns
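
Note on the hunk above: with the values shown, the bound `block_size * max_cached_blocks` works out to 256 KiB × 64 = 16 MiB. The sketch below illustrates one way a block-based LRU cache can behave. It is not the `BlockStoreReader` implementation, which this diff does not show; `BlockCache`, its `fetch` callable, and `fetch_count` are hypothetical names introduced only for illustration, and an in-memory `bytes` object stands in for the remote file.

```python
from collections import OrderedDict


class BlockCache:
    """Hypothetical LRU cache of fixed-size blocks (illustration only)."""

    def __init__(self, fetch, block_size=256 * 1024, max_cached_blocks=64):
        self._fetch = fetch               # callable: (start, end) -> bytes, end exclusive
        self.block_size = block_size
        self.max_cached_blocks = max_cached_blocks
        self._blocks = OrderedDict()      # block index -> bytes, least recently used first
        self.fetch_count = 0              # number of fetches that missed the cache

    def _get_block(self, index):
        if index in self._blocks:
            self._blocks.move_to_end(index)    # mark as most recently used
            return self._blocks[index]
        start = index * self.block_size
        data = self._fetch(start, start + self.block_size)
        self.fetch_count += 1
        self._blocks[index] = data
        if len(self._blocks) > self.max_cached_blocks:
            self._blocks.popitem(last=False)   # evict the least recently used block
        return data

    def read(self, offset, length):
        """Return up to `length` bytes starting at `offset`."""
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        data = b"".join(self._get_block(i) for i in range(first, last + 1))
        skip = offset - first * self.block_size
        return data[skip:skip + length]


payload = bytes(range(256)) * 16          # 4096-byte stand-in for a remote object
cache = BlockCache(lambda s, e: payload[s:e], block_size=1024, max_cached_blocks=2)
assert cache.read(1000, 100) == payload[1000:1100]   # spans blocks 0 and 1
assert cache.read(10, 5) == payload[10:15]           # block 0 is served from the cache
```

Memory stays bounded because at most `max_cached_blocks` entries of at most `block_size` bytes are held at once, which is the property the renamed parameters describe.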

@@ -152,8 +152,8 @@ from obspec_utils.readers import BufferedStoreReader
 reader = BufferedStoreReader(store, "file.nc", buffer_size=1024 * 1024)

 # Sequential reads benefit from buffering
-while chunk := reader.read(4096):
-    process(chunk)
+while block := reader.read(4096):
+    process(block)
 ```

 **Characteristics:**
@@ -354,8 +354,8 @@ results = dask.compute(*tasks)
 |---------------|-------------------|
 | Parse HDF5/NetCDF file | `EagerStoreReader` |
 | Sequential streaming | `BufferedStoreReader` |
-| Sparse random access | `ParallelStoreReader` |
-| Unknown pattern, large file | `ParallelStoreReader` |
+| Sparse random access | `BlockStoreReader` |
+| Unknown pattern, large file | `BlockStoreReader` |
 | Small file, repeated access | `EagerStoreReader` |

 ### Should I use store-level caching?
@@ -372,7 +372,7 @@ results = dask.compute(*tasks)

 ### SplittingReadableStore

-`SplittingReadableStore` accelerates `get()` by splitting large requests into parallel `get_ranges()`:
+`SplittingReadableStore` accelerates `get()` by splitting large requests into concurrent `get_ranges()`:

 ```python
 from obspec_utils.wrappers import SplittingReadableStore
@@ -384,25 +384,25 @@ fast_store = SplittingReadableStore(
 )
 ```

-This extracts the parallel fetching logic from `EagerStoreReader` into a composable wrapper. It composes naturally with `CachingReadableStore`:
+This extracts the concurrent fetching logic from `EagerStoreReader` into a composable wrapper. It composes naturally with `CachingReadableStore`:

 ```python
 from obspec_utils.wrappers import CachingReadableStore, SplittingReadableStore

-# Compose: fast parallel fetches + caching
+# Compose: fast concurrent fetches + caching
 store = S3Store(bucket="my-bucket")
 store = SplittingReadableStore(store)  # Split large fetches
 store = CachingReadableStore(store)    # Cache results

-# First get(): parallel fetch -> cache
+# First get(): concurrent fetch -> cache
 # Second get(): served from cache
 ```

 **Characteristics:**

 - Only affects `get()` and `get_async()` - range requests pass through unchanged
 - Requires `head()` support to determine file size (falls back to single request otherwise)
-- Tuned for cloud storage (12 MB blocks, 18 concurrent requests by default)
+- Tuned for cloud storage (12 MB chunks, 18 concurrent requests by default)
+- Tuned for cloud storage (12 MB blocks, 18 concurrent requests by default)

 ## Summary
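
Note on the `SplittingReadableStore` text above: splitting a whole-object `get()` into block-sized range requests comes down to simple offset arithmetic once the object size is known (the docs say this comes from `head()`). The sketch below shows that arithmetic only. `split_ranges` is a hypothetical helper, not a function from `obspec_utils`; the 12 MB default is taken from the documentation text in this diff, and the 18-request concurrency limit is not modelled.

```python
def split_ranges(size, block_size=12 * 1024 * 1024):
    """Return (start, end) pairs, end exclusive, that together cover [0, size)."""
    return [
        (start, min(start + block_size, size))
        for start in range(0, size, block_size)
    ]


ranges = split_ranges(30 * 1024 * 1024)   # a 30 MB object
# Three requests: two full 12 MB blocks and a 6 MB tail.
assert len(ranges) == 3
```

The resulting pairs are contiguous and non-overlapping, so the fetched pieces can be concatenated in order to rebuild the object.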

docs/index.md

Lines changed: 6 additions & 6 deletions
@@ -9,10 +9,10 @@ Utilities for interacting with object storage, based on [obspec](https://github.
 - **`obspec_utils.protocols`**: Minimal protocols ([`ReadableStore`][obspec_utils.protocols.ReadableStore],
 [`ReadableFile`][obspec_utils.protocols.ReadableFile]) defining read-only interfaces for object storage access
 - **`obspec_utils.readers`**: File-like interfaces ([`BufferedStoreReader`][obspec_utils.readers.BufferedStoreReader],
-[`EagerStoreReader`][obspec_utils.readers.EagerStoreReader], [`ParallelStoreReader`][obspec_utils.readers.ParallelStoreReader])
+[`EagerStoreReader`][obspec_utils.readers.EagerStoreReader], [`BlockStoreReader`][obspec_utils.readers.BlockStoreReader])
 for reading from object stores
 - **`obspec_utils.stores`**: Alternative store implementations (e.g., [`AiohttpStore`][obspec_utils.stores.AiohttpStore] for generic HTTP access)
-- **`obspec_utils.wrappers`**: Composable store wrappers for caching, tracing, and parallel fetching
+- **`obspec_utils.wrappers`**: Composable store wrappers for caching, tracing, and concurrent fetching
 - **`obspec_utils.registry`**: [`ObjectStoreRegistry`][obspec_utils.registry.ObjectStoreRegistry] for managing multiple stores and resolving URLs

 ## Design Philosophy
@@ -84,7 +84,7 @@ data = await store.get_range_async(path, start=0, end=1000)
 The file handlers provide file-like interfaces (read, seek, tell) for reading from object stores. They work with **any** [`ReadableStore`][obspec_utils.protocols.ReadableStore] implementation:

 ```python
-from obspec_utils.readers import BufferedStoreReader, EagerStoreReader, ParallelStoreReader
+from obspec_utils.readers import BufferedStoreReader, EagerStoreReader, BlockStoreReader

 # Works with obstore
 from obstore.store import S3Store
@@ -104,9 +104,9 @@ reader.seek(0) # Seek back to start
 eager_reader = EagerStoreReader(store, "file.bin")
 data = eager_reader.readall()

-# Parallel reader uses get_ranges() for efficient multi-chunk fetching with LRU cache
-parallel_reader = ParallelStoreReader(store, "file.bin", chunk_size=256*1024)
-data = parallel_reader.read(1000)
+# Block reader uses get_ranges() for efficient multi-chunk fetching with LRU cache
+block_reader = BlockStoreReader(store, "file.bin", chunk_size=256*1024)
+data = block_reader.read(1000)
 ```

 ## Contributing

src/obspec_utils/obspec.py

Lines changed: 6 additions & 4 deletions
@@ -2,13 +2,14 @@

 New code should import from:
 - `obspec_utils.protocols` for ReadableStore, ReadableFile
-- `obspec_utils.readers` for BufferedStoreReader, EagerStoreReader, ParallelStoreReader
+- `obspec_utils.readers` for BufferedStoreReader, EagerStoreReader, BlockStoreReader
 """

 import warnings

 from obspec_utils.protocols import ReadableFile, ReadableStore
 from obspec_utils.readers import (
+    BlockStoreReader,
     BufferedStoreReader,
     EagerStoreReader,
     ParallelStoreReader,
@@ -17,15 +18,16 @@
 warnings.warn(
     "Importing from obspec_utils.obspec is deprecated. "
     "Please use 'from obspec_utils.protocols import ReadableStore, ReadableFile' "
-    "and 'from obspec_utils.readers import BufferedStoreReader, EagerStoreReader, ParallelStoreReader' instead.",
+    "and 'from obspec_utils.readers import BufferedStoreReader, EagerStoreReader, BlockStoreReader' instead.",
     DeprecationWarning,
     stacklevel=2,
 )

 __all__ = [
-    "ReadableFile",
-    "ReadableStore",
+    "BlockStoreReader",
     "BufferedStoreReader",
     "EagerStoreReader",
     "ParallelStoreReader",
+    "ReadableFile",
+    "ReadableStore",
 ]

src/obspec_utils/protocols/_protocols.py

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ class ReadableFile(Protocol):


 The `obspec_utils` readers ([`BufferedStoreReader`][obspec_utils.readers.BufferedStoreReader],
 [`EagerStoreReader`][obspec_utils.readers.EagerStoreReader],
-[`ParallelStoreReader`][obspec_utils.readers.EagerStoreReader]) all implement this protocol,
+[`BlockStoreReader`][obspec_utils.readers.BlockStoreReader]) all implement this protocol,
 allowing them to be used interchangeably wherever a [`ReadableFile`][obspec_utils.protocols.ReadableFile] is expected.

 !!! Warning

src/obspec_utils/readers/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -6,9 +6,10 @@

 from obspec_utils.readers._buffered import BufferedStoreReader
 from obspec_utils.readers._eager import EagerStoreReader
-from obspec_utils.readers._parallel import ParallelStoreReader
+from obspec_utils.readers._block import BlockStoreReader, ParallelStoreReader

 __all__ = [
+    "BlockStoreReader",
     "BufferedStoreReader",
     "EagerStoreReader",
     "ParallelStoreReader",
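
Note on the hunk above: `ParallelStoreReader` is still imported from `_block` and exported, so the old name keeps working after the rename. This diff does not show how `_block.py` defines it. One common way to keep a renamed class importable is a thin subclass that emits a `DeprecationWarning`; the sketch below assumes that approach, and both class bodies and the constructor defaults are stand-ins written for illustration rather than the library's code.

```python
import warnings


class BlockStoreReader:
    """Stand-in for the real reader; only the constructor is modelled here."""

    def __init__(self, store, path, block_size=256 * 1024, max_cached_blocks=64):
        self.store = store
        self.path = path
        self.block_size = block_size
        self.max_cached_blocks = max_cached_blocks


class ParallelStoreReader(BlockStoreReader):
    """Old name, kept so existing imports continue to work (assumed approach)."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "ParallelStoreReader is deprecated; use BlockStoreReader instead.",
            DeprecationWarning,
            stacklevel=2,   # attribute the warning to the caller's line
        )
        super().__init__(*args, **kwargs)


# Record warnings so the deprecation can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    reader = ParallelStoreReader(None, "file.nc")

assert isinstance(reader, BlockStoreReader)
assert caught[0].category is DeprecationWarning
```

A plain alias (`ParallelStoreReader = BlockStoreReader`) would also preserve imports, at the cost of giving callers no signal that the name has changed.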
