Commit eccc957

Add CLAUDE.md via symlink to AGENTS.md with updated dependencies (#446)
1 parent 3b84263 commit eccc957

3 files changed

Lines changed: 218 additions & 1 deletion

.github/workflows/xskillscore_testing.yml

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ jobs:
           micromamba install numpy==1.24
       - name: Run tests
         run: |
-          pytest -n 4 --cov=xskillscore --cov-report=xml --verbose
+          pytest -n auto --cov=xskillscore --cov-report=xml --verbose
       - name: Upload coverage to codecov
         uses: codecov/codecov-action@v5
         with:

AGENTS.md

Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
# CLAUDE.md

This file provides guidance to AI coding agents when working with code in this repository.

## Project Overview

**xskillscore** is a Python package for computing forecast verification metrics using xarray. It provides both deterministic and probabilistic forecast verification metrics designed to work with multi-dimensional labeled arrays, with support for Dask parallel computing.

It was originally developed to parallelize forecast metrics for multi-model-multi-ensemble forecasts in the SubX project.

**Related Projects**: [climpred](https://github.com/pangeo-data/climpred) is a key consumer of xskillscore, providing higher-level prediction skill assessment workflows.

## Development Commands

### Testing

Run full test suite:
```bash
pytest -n auto --cov=xskillscore --cov-report=xml --verbose
```

Run tests for a single file:
```bash
pytest xskillscore/tests/test_deterministic.py
```

Run a specific test:
```bash
pytest xskillscore/tests/test_deterministic.py::test_pearson_r -v
```

Run tests with specific markers:
```bash
pytest -m "not slow"      # Skip slow tests
pytest -m "not network"   # Skip tests requiring network
```

### Doctests

Run doctests on all modules:
```bash
python -m pytest --doctest-modules xskillscore --ignore xskillscore/tests
```

### Code Quality

Run pre-commit checks:
```bash
pre-commit run --all-files
```

Linting and formatting (via ruff):
```bash
ruff check --fix .
ruff format .
```

Type checking:
```bash
mypy xskillscore
```

### Documentation

Build documentation:
```bash
cd docs
make html
```

Test notebooks in documentation:
```bash
cd docs
nbstripout source/*.ipynb
make -j4 html
```

### Installation

Install in development mode:
```bash
pip install -e .
```

Install with test dependencies:
```bash
pip install -e ".[test]"
```

Install with all dependencies:
```bash
pip install -e ".[complete]"
```

## Architecture

### Core Module Structure

The `xskillscore/core/` directory contains the main implementation:

- **deterministic.py**: Deterministic forecast metrics (pearson_r, rmse, mae, mse, etc.)
- **probabilistic.py**: Probabilistic metrics (crps_*, brier_score, rps, rank_histogram, etc.)
- **comparative.py**: Comparative tests (sign_test, halfwidth_ci_test)
- **stattests.py**: Statistical tests (multipletests)
- **contingency.py**: Contingency table class and categorical metrics
- **resampling.py**: Resampling and bootstrapping utilities
- **accessor.py**: xarray accessor (`ds.xs.metric()`) for convenient API
- **utils.py**: Shared utilities for preprocessing dimensions, weights, and broadcasting
- **np_deterministic.py**: NumPy implementations of deterministic metrics
- **np_probabilistic.py**: NumPy implementations of probabilistic metrics
- **types.py**: Type definitions

### Key Design Patterns

1. **xarray.apply_ufunc Pattern**: All metrics use `xr.apply_ufunc` to:
   - Apply NumPy implementations to xarray objects
   - Handle broadcasting automatically
   - Enable Dask parallelization with `dask="parallelized"`
   - Preserve attributes with the `keep_attrs` parameter

2. **Dimension Preprocessing**: Metrics follow this pattern:
   ```python
   dim, axis = _preprocess_dims(dim, a)  # Convert dim to list and axis tuple
   a, b = xr.broadcast(a, b, exclude=dim)  # Broadcast arrays
   a, b, new_dim, weights = _stack_input_if_needed(a, b, dim, weights)  # Stack multi-dims
   weights = _preprocess_weights(a, dim, new_dim, weights)  # Normalize weights
   ```

3. **Separation of xarray and NumPy logic**:
   - High-level functions in `deterministic.py`/`probabilistic.py` handle xarray objects
   - Low-level functions in `np_deterministic.py`/`np_probabilistic.py` contain pure NumPy logic
   - This enables easier testing and reuse

4. **Optional Weights**: Most metrics support an optional `weights` parameter matching the dimensions being reduced.

5. **Member Dimension Convention**: Probabilistic metrics use `member_dim="member"` by default for ensemble dimensions.
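
The first and third patterns above can be sketched together as follows. This is a minimal illustration, not the package's actual code: `_np_rmse` and `rmse_sketch` are hypothetical names, and the choice of `input_core_dims` with `axis=-1` is one assumed way to wire a NumPy function into `xr.apply_ufunc`.

```python
import numpy as np
import xarray as xr


def _np_rmse(a, b, axis):
    """Low-level, pure NumPy logic (hypothetical helper)."""
    return np.sqrt(np.mean((a - b) ** 2, axis=axis))


def rmse_sketch(a, b, dim, keep_attrs=False):
    """High-level wrapper (hypothetical) that handles xarray objects."""
    return xr.apply_ufunc(
        _np_rmse,
        a,
        b,
        # apply_ufunc moves core dims to the last axis, hence axis=-1 below.
        input_core_dims=[[dim], [dim]],
        kwargs={"axis": -1},
        dask="parallelized",  # only takes effect for Dask-backed inputs
        output_dtypes=[float],
        keep_attrs=keep_attrs,
    )


a = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]), dims=("time", "x"))
b = a + 1.0
result = rmse_sketch(a, b, dim="time")  # reduces "time", keeps "x"
```

Keeping the NumPy function separate means it can be unit-tested on plain arrays without constructing any xarray objects.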

### xarray Accessor

Users can access metrics via the `.xs` accessor on xarray Datasets:
```python
ds = xr.Dataset({"a": a_dataarray, "b": b_dataarray})
result = ds.xs.pearson_r("a", "b", dim="time")
```

The accessor handles converting string variable names to actual DataArrays.
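
A rough sketch of how such a string-to-DataArray lookup can work is shown below. It uses xarray's public `register_dataset_accessor` decorator, but the accessor name `demo`, the class `DemoAccessor`, and its `mae` method are hypothetical and are not taken from `accessor.py`.

```python
import xarray as xr


@xr.register_dataset_accessor("demo")  # hypothetical accessor name
class DemoAccessor:
    def __init__(self, ds):
        self._ds = ds

    def _resolve(self, obj):
        # Strings are looked up as variable names; other objects pass through.
        return self._ds[obj] if isinstance(obj, str) else obj

    def mae(self, a, b, dim):
        a, b = self._resolve(a), self._resolve(b)
        return abs(a - b).mean(dim)


ds = xr.Dataset(
    {"a": ("time", [1.0, 2.0, 3.0]), "b": ("time", [2.0, 2.0, 5.0])}
)
result = ds.demo.mae("a", "b", dim="time")
```

Passing a DataArray instead of a string would also work here, because `_resolve` leaves non-string arguments untouched.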

### Testing Infrastructure

- **conftest.py**: Centralized pytest fixtures for test data (times, lats, lons, members, etc.)
- Test fixtures provide consistent test data across test modules
- Fixtures include regular data, NaN-masked data, dask-chunked data, and 1D timeseries
- Use `np.random.seed(42)` in doctests for deterministic examples

## Important Considerations

### Temporal Metrics

Some metrics are specifically designed for temporal dimensions:
- `effective_sample_size()`, `pearson_r_eff_p_value()`, `spearman_r_eff_p_value()`
- These raise warnings if applied to non-"time" dimensions
- They account for autocorrelation and should only be used on time series
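
To illustrate why autocorrelation matters, the sketch below uses one widely cited estimate of the effective sample size (Bretherton et al., 1999), based on the lag-1 autocorrelation of both series. The function names are hypothetical, and whether this matches the library's implementation in every detail (including the cap at `n`) is an assumption.

```python
import numpy as np


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a 1-D series."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


def effective_sample_size_sketch(a, b):
    """n_eff = n * (1 - r_a * r_b) / (1 + r_a * r_b), capped at n."""
    n = a.size
    rho = lag1_autocorr(a) * lag1_autocorr(b)
    n_eff = n * (1.0 - rho) / (1.0 + rho)
    # Assumption: cap at n so negative autocorrelation cannot inflate the count.
    return min(n_eff, float(n))


rng = np.random.default_rng(42)
# Cumulative sums of noise (random walks) are strongly autocorrelated.
walk_a = np.cumsum(rng.standard_normal(200))
walk_b = np.cumsum(rng.standard_normal(200))
n_eff_walk = effective_sample_size_sketch(walk_a, walk_b)
```

For strongly autocorrelated series such as these, the effective sample size is far below the nominal 200 points, which is what makes a p-value based on the raw sample size overconfident.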

### NumPy Version Compatibility

The codebase supports both numpy<2.0 and numpy>=2.0. When using NumPy functions:
- Use try/except for imports that changed between versions
- Example: `trapezoid` (new) vs `trapz` (old)
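
A minimal sketch of that try/except import, assuming the function is aliased to a single local name so the rest of the module does not depend on the installed NumPy version:

```python
import numpy as np

try:
    from numpy import trapezoid  # available in NumPy >= 2.0
except ImportError:
    from numpy import trapz as trapezoid  # fallback for NumPy < 2.0

x = np.linspace(0.0, 1.0, 5)
y = 2.0 * x
# The trapezoidal rule is exact for a straight line.
area = trapezoid(y, x)
```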

### Dimension Handling

- `dim=None` means reduce over all dimensions
- `dim` can be a string or list of strings
- When multiple dimensions are provided, they are stacked into a single dimension internally
- The `member` dimension in probabilistic forecasts is special and should not be included in `dim`
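
The first three rules can be sketched on plain NumPy arrays. The helpers `normalize_dim` and `reduce_mean` are hypothetical illustrations and are not the package's `_preprocess_dims` or `_stack_input_if_needed`.

```python
import numpy as np


def normalize_dim(dim, all_dims):
    """Hypothetical helper: return the dimensions to reduce as a list."""
    if dim is None:
        return list(all_dims)  # None means every dimension
    if isinstance(dim, str):
        return [dim]
    return list(dim)


def reduce_mean(values, all_dims, dim=None):
    """Average over `dim` by stacking the reduced axes into a single axis."""
    dims = normalize_dim(dim, all_dims)
    axes = [all_dims.index(d) for d in dims]
    n_keep = values.ndim - len(axes)
    # Move the reduced axes to the end, then flatten them into one axis.
    moved = np.moveaxis(values, axes, list(range(n_keep, values.ndim)))
    stacked = moved.reshape(moved.shape[:n_keep] + (-1,))
    return stacked.mean(axis=-1)


data = np.arange(24.0).reshape(2, 3, 4)
dims = ["time", "lat", "lon"]
over_all = reduce_mean(data, dims)                        # dim=None
over_time = reduce_mean(data, dims, dim="time")           # single string
over_space = reduce_mean(data, dims, dim=["lat", "lon"])  # stacked pair
```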

### NaN Handling

- Most metrics support a `skipna` parameter (default: False)
- Probabilistic metrics use `_keep_nans_masked()` to preserve NaN patterns from inputs
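
The effect of a `skipna` switch can be sketched with plain NumPy. The function `mae_sketch` is a hypothetical stand-in, assuming the flag simply selects between a NaN-propagating and a NaN-skipping reduction.

```python
import numpy as np


def mae_sketch(a, b, skipna=False):
    """Mean absolute error; NaNs propagate unless skipna=True."""
    mean = np.nanmean if skipna else np.mean
    return mean(np.abs(a - b))


a = np.array([1.0, np.nan, 3.0])
b = np.array([2.0, 2.0, 5.0])
propagated = mae_sketch(a, b)            # NaN, because one input is missing
skipped = mae_sketch(a, b, skipna=True)  # averages the two valid pairs
```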

### Dask Support

All metrics support Dask arrays via `dask="parallelized"` in `xr.apply_ufunc`. No special handling is needed when adding new metrics.

## Python Support

- Minimum Python version: 3.9
- Supported versions: 3.9, 3.10, 3.11, 3.12, 3.13

## Key Dependencies

- xarray >= 2023.4.0 (core data structure)
- numpy >= 1.25
- scipy >= 1.10
- dask[array] >= 2023.4.0 (parallel computing)
- properscoring (probabilistic metrics)
- xhistogram >= 0.3.2 (histogram computations)
- statsmodels (statistical tests)

Optional acceleration:
- bottleneck (faster NaN operations)
- numba >= 0.57 (JIT compilation)

## Contributing Workflow

1. Create a new branch for your feature
2. Make changes and add tests in `xskillscore/tests/`
3. Add docstring examples (they are tested via doctest)
4. Run `pre-commit run --all-files` before committing
5. Ensure tests pass: `pytest -n auto`
6. Ensure doctests pass: `python -m pytest --doctest-modules xskillscore --ignore xskillscore/tests`
7. Update CHANGELOG.rst if appropriate
8. Submit PR to main branch

Note: CI includes tests on multiple Python versions, doctest validation, and notebook execution in docs.

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
AGENTS.md
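
Git records a symbolic link as a blob whose content is the link's target path, which is why this one-line diff contains only `AGENTS.md`. The sketch below reproduces that arrangement in a temporary directory; the directory and the file contents are illustrative assumptions, and creating symlinks may need extra privileges on Windows.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as repo:
    agents = os.path.join(repo, "AGENTS.md")
    claude = os.path.join(repo, "CLAUDE.md")
    with open(agents, "w") as fh:
        fh.write("# CLAUDE.md\n")
    # A relative target keeps the link valid wherever the repository is cloned.
    os.symlink("AGENTS.md", claude)
    link_target = os.readlink(claude)
    with open(claude) as fh:
        first_line = fh.readline()  # read through the link
```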
