Commit 6e96152

Refactor: improve sub-module organization (#40)
* Refactor: improve sub-module organization
* Improve cross-references
1 parent e4cf262 commit 6e96152

40 files changed

Lines changed: 2695 additions & 2398 deletions

docs/api/aiohttp.md

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-::: obspec_utils.aiohttp.AiohttpStore
+::: obspec_utils.stores.AiohttpStore

docs/api/cache.md

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-::: obspec_utils.cache.CachingReadableStore
+::: obspec_utils.wrappers.CachingReadableStore

docs/api/obspec.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

docs/api/protocols.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+::: obspec_utils.protocols.ReadableStore
+::: obspec_utils.protocols.ReadableFile

docs/api/readers.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+::: obspec_utils.readers.BufferedStoreReader
+::: obspec_utils.readers.EagerStoreReader
+::: obspec_utils.readers.ParallelStoreReader

docs/api/splitting.md

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-::: obspec_utils.splitting.SplittingReadableStore
+::: obspec_utils.wrappers.SplittingReadableStore

docs/api/tracing.md

Lines changed: 3 additions & 3 deletions
@@ -1,3 +1,3 @@
-::: obspec_utils.tracing.TracingReadableStore
-::: obspec_utils.tracing.RequestTrace
-::: obspec_utils.tracing.RequestRecord
+::: obspec_utils.wrappers.TracingReadableStore
+::: obspec_utils.wrappers.RequestTrace
+::: obspec_utils.wrappers.RequestRecord

docs/design/caching.md

Lines changed: 9 additions & 10 deletions
@@ -86,7 +86,7 @@ During data access:
 `EagerStoreReader` loads the entire file into memory on construction:

 ```python
-from obspec_utils.obspec import EagerStoreReader
+from obspec_utils.readers import EagerStoreReader

 # File is fully loaded into memory
 reader = EagerStoreReader(store, "file.nc")
@@ -118,7 +118,7 @@ reader.close()
 `ParallelStoreReader` uses chunk-based LRU caching:

 ```python
-from obspec_utils.obspec import ParallelStoreReader
+from obspec_utils.readers import ParallelStoreReader

 reader = ParallelStoreReader(
     store, "file.nc",
@@ -147,7 +147,7 @@ data = reader.read(1000)
 `BufferedStoreReader` provides read-ahead buffering for sequential access:

 ```python
-from obspec_utils.obspec import BufferedStoreReader
+from obspec_utils.readers import BufferedStoreReader

 reader = BufferedStoreReader(store, "file.nc", buffer_size=1024 * 1024)

@@ -173,7 +173,7 @@ while chunk := reader.read(4096):
 `CachingReadableStore` wraps any store to cache full objects:

 ```python
-from obspec_utils.cache import CachingReadableStore
+from obspec_utils.wrappers import CachingReadableStore

 # Wrap the store with caching
 cached_store = CachingReadableStore(
@@ -241,7 +241,7 @@ For workloads requiring cross-worker cache sharing, consider:

 ```python
 import pickle
-from obspec_utils.cache import CachingReadableStore
+from obspec_utils.wrappers import CachingReadableStore

 # Main process: create and populate cache
 cached_store = CachingReadableStore(store, max_size=256 * 1024 * 1024)
@@ -291,7 +291,7 @@ When each worker processes a distinct set of files, per-worker caching works wel

 ```python
 from concurrent.futures import ProcessPoolExecutor
-from obspec_utils.cache import CachingReadableStore
+from obspec_utils.wrappers import CachingReadableStore

 def process_files(cached_store, file_paths):
     """Each worker gets its own cache, processes its own files."""
@@ -327,7 +327,7 @@ With Dask, the cached store is serialized to each worker:
 ```python
 import dask
 from dask.distributed import Client
-from obspec_utils.cache import CachingReadableStore
+from obspec_utils.wrappers import CachingReadableStore

 client = Client()

@@ -375,7 +375,7 @@ results = dask.compute(*tasks)
 `SplittingReadableStore` accelerates `get()` by splitting large requests into parallel `get_ranges()`:

 ```python
-from obspec_utils.splitting import SplittingReadableStore
+from obspec_utils.wrappers import SplittingReadableStore

 fast_store = SplittingReadableStore(
     store,
@@ -387,8 +387,7 @@ fast_store = SplittingReadableStore(
 This extracts the parallel fetching logic from `EagerStoreReader` into a composable wrapper. It composes naturally with `CachingReadableStore`:

 ```python
-from obspec_utils.splitting import SplittingReadableStore
-from obspec_utils.cache import CachingReadableStore
+from obspec_utils.wrappers import CachingReadableStore, SplittingReadableStore

 # Compose: fast parallel fetches + caching
 store = S3Store(bucket="my-bucket")

docs/index.md

Lines changed: 20 additions & 16 deletions
@@ -4,27 +4,31 @@ Utilities for interacting with object storage, based on [obspec](https://github.

 ## Background

-`obspec-utils` provides helpful utilities for working with object storage in Python, built on top of obspec and obstore. The library includes:
+`obspec-utils` provides helpful utilities for working with object storage in Python, built on top of obspec and obstore. The library is organized into subpackages:

-1. **ObjectStoreRegistry**: A registry for managing multiple object stores, allowing you to resolve URLs to the appropriate store and path. This is particularly useful when working with datasets that span multiple storage backends or buckets.
-
-2. **ReadableStore Protocol**: A minimal protocol defining the read-only interface required for object storage access. This allows alternative backends (like aiohttp) to be used instead of obstore.
-
-3. **File Handlers**: Wrappers around obstore's file reading capabilities that provide a familiar file-like interface.
+- **`obspec_utils.protocols`**: Minimal protocols ([`ReadableStore`][obspec_utils.protocols.ReadableStore],
+[`ReadableFile`][obspec_utils.protocols.ReadableFile]) defining read-only interfaces for object storage access
+- **`obspec_utils.readers`**: File-like interfaces ([`BufferedStoreReader`][obspec_utils.readers.BufferedStoreReader],
+[`EagerStoreReader`][obspec_utils.readers.EagerStoreReader], [`ParallelStoreReader`][obspec_utils.readers.ParallelStoreReader])
+for reading from object stores
+- **`obspec_utils.stores`**: Alternative store implementations (e.g., [`AiohttpStore`][obspec_utils.stores.AiohttpStore] for generic HTTP access)
+- **`obspec_utils.wrappers`**: Composable store wrappers for caching, tracing, and parallel fetching
+- **`obspec_utils.registry`**: [`ObjectStoreRegistry`][obspec_utils.registry.ObjectStoreRegistry] for managing multiple stores and resolving URLs

 ## Design Philosophy

 The library is designed around **protocols rather than concrete classes**. The `ObjectStoreRegistry` accepts any object that implements the `ReadableStore` protocol, which means:

-- **obstore classes** (S3Store, HTTPStore, GCSStore, etc.) work out of the box
-- **Custom implementations** (like the included `AiohttpStore`) can be used as alternatives
+- **obstore classes** ([S3Store][obstore.store.S3Store], [HTTPStore][obstore.store.HTTPStore],
+[GCSStore][obstore.store.GCSStore], etc.) work out of the box
+- **Custom implementations** (like the included [`AiohttpStore`][obspec_utils.stores.AiohttpStore]) can be used as alternatives
 - **The Zarr/VirtualiZarr layer doesn't care** which backend you use - it just needs something satisfying the protocol

 This is particularly useful when:

-- obstore's HTTPStore (designed for WebDAV/S3-like semantics) isn't ideal for your use case
+- obstore's [HTTPStore][obstore.store.HTTPStore] (designed for WebDAV/S3-like semantics) isn't ideal for your use case
 - You need generic HTTPS access to THREDDS, NASA data servers, or other HTTP endpoints
-- You want to use a different HTTP library like aiohttp
+- You want to use a different HTTP library like [aiohttp](https://docs.aiohttp.org/en/stable/)

 ## Getting started

@@ -38,7 +42,7 @@ python -m pip install obspec-utils

 ### ObjectStoreRegistry

-The `ObjectStoreRegistry` allows you to register object stores and resolve URLs to the appropriate store:
+The [`ObjectStoreRegistry`][obspec_utils.registry.ObjectStoreRegistry] allows you to register object stores and resolve URLs to the appropriate store:

 ```python
 from obstore.store import S3Store
@@ -55,11 +59,11 @@ store, path = registry.resolve("s3://my-bucket/my-data/file.nc")

 ### Using Alternative HTTP Backends

-For generic HTTPS access where obstore's HTTPStore may not be ideal, you can use the `AiohttpStore`:
+For generic HTTPS access where obstore's HTTPStore may not be ideal, you can use the [`AiohttpStore`][obspec_utils.stores.AiohttpStore]:

 ```python
 from obspec_utils.registry import ObjectStoreRegistry
-from obspec_utils.aiohttp import AiohttpStore
+from obspec_utils.stores import AiohttpStore

 # Create an aiohttp-based store for a THREDDS server
 store = AiohttpStore(
@@ -77,18 +81,18 @@ data = await store.get_range_async(path, start=0, end=1000)

 ### File Handlers

-The file handlers provide file-like interfaces (read, seek, tell) for reading from object stores. They work with **any** ReadableStore implementation:
+The file handlers provide file-like interfaces (read, seek, tell) for reading from object stores. They work with **any** [`ReadableStore`][obspec_utils.readers.ReadableStore] implementation:

 ```python
-from obspec_utils.obspec import BufferedStoreReader, EagerStoreReader, ParallelStoreReader
+from obspec_utils.readers import BufferedStoreReader, EagerStoreReader, ParallelStoreReader

 # Works with obstore
 from obstore.store import S3Store
 store = S3Store(bucket="my-bucket")
 reader = BufferedStoreReader(store, "path/to/file.bin", buffer_size=1024*1024)

 # Also works with AiohttpStore or any ReadableStore
-from obspec_utils.aiohttp import AiohttpStore
+from obspec_utils.stores import AiohttpStore
 store = AiohttpStore("https://example.com/data")
 reader = BufferedStoreReader(store, "file.bin")

mkdocs.yml

Lines changed: 9 additions & 6 deletions
@@ -18,13 +18,16 @@ nav:
     - "Protocols": "design/protocols.md"
     - "Caching": "design/caching.md"
   - "API":
+    - Protocols: "api/protocols.md"
+    - Readers: "api/readers.md"
+    - Stores:
+      - AiohttpStore: "api/aiohttp.md"
+    - Wrappers:
+      - Caching: "api/cache.md"
+      - Splitting: "api/splitting.md"
+      - Tracing: "api/tracing.md"
+    - Registry: "api/registry.md"
     - Typing: "api/typing.md"
-    - Aiohttp Store Adapters: "api/aiohttp.md"
-    - Caching: "api/cache.md"
-    - Splitting: "api/splitting.md"
-    - Obspec File Readers: "api/obspec.md"
-    - Store Registries: "api/registry.md"
-    - Tracing: "api/tracing.md"

 watch:
   - src/obspec_utils
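
For downstream code that still imports from the old module paths, the renames visible in this diff can be summarized as a lookup table. The sketch below is a hypothetical migration helper, not part of `obspec-utils`: the names `MOVED` and `rewrite_import` are assumptions introduced here, the table covers only the classes whose moves appear in the hunks above, and only simple single-line `from ... import ...` statements are handled.

```python
import re

# class name -> (old module, new module), taken from the hunks in this commit.
# Hypothetical helper for illustration; not shipped with obspec-utils.
MOVED = {
    "AiohttpStore": ("obspec_utils.aiohttp", "obspec_utils.stores"),
    "CachingReadableStore": ("obspec_utils.cache", "obspec_utils.wrappers"),
    "SplittingReadableStore": ("obspec_utils.splitting", "obspec_utils.wrappers"),
    "TracingReadableStore": ("obspec_utils.tracing", "obspec_utils.wrappers"),
    "RequestTrace": ("obspec_utils.tracing", "obspec_utils.wrappers"),
    "RequestRecord": ("obspec_utils.tracing", "obspec_utils.wrappers"),
    "BufferedStoreReader": ("obspec_utils.obspec", "obspec_utils.readers"),
    "EagerStoreReader": ("obspec_utils.obspec", "obspec_utils.readers"),
    "ParallelStoreReader": ("obspec_utils.obspec", "obspec_utils.readers"),
}

_IMPORT = re.compile(r"^(\s*)from\s+(obspec_utils\.\w+)\s+import\s+(.+?)\s*$")


def rewrite_import(line: str) -> str:
    """Rewrite one `from obspec_utils.X import A, B` line to the new layout.

    Lines that do not match, and names that did not move, are left alone.
    Aliased (`A as B`) and parenthesized imports are not handled.
    """
    match = _IMPORT.match(line)
    if match is None:
        return line
    indent, old_module, names = match.groups()
    by_module = {}  # target module -> names, in first-seen order
    for name in (n.strip() for n in names.split(",")):
        old, new = MOVED.get(name, (old_module, old_module))
        # Only redirect a name when it is imported from its known old home.
        target = new if old == old_module else old_module
        by_module.setdefault(target, []).append(name)
    return "\n".join(
        f"{indent}from {module} import {', '.join(members)}"
        for module, members in by_module.items()
    )
```

Applying `rewrite_import` to a line that already uses the new paths returns it unchanged, so the helper can be run repeatedly over the same file.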
