Don't merge: Test dace gpu cached #2652

Draft
edopao wants to merge 69 commits into GridTools:main from edopao:test-dace_gpu_cached

edopao commented Jun 12, 2026

Contributor

No description provided.

egparedes and others added 30 commits May 27, 2026 15:25
…g in fingerprinters

- Add `fingerprint` (alias for `sorting_sets_fingerprinter`) and
  `versioned_fingerprint` (includes BUILD_CACHE_VERSION_ID) to `utils.py`
- Fix `skipping_fields_node_fingerprinter` to pass the reducer dict as a
  positional arg (not keyword) to `CustomPicklingFingerprinter.from_reducers`
- Fix `custom_overriden_pickler` in `eve/utils.py` to use `pickle._Pickler`
  (pure Python) instead of the C-extension `pickle.Pickler` so that
  `reducer_override` is called for built-in types like `dict` and `set`
  (the C-extension fast path bypasses `reducer_override` for built-in types)
- Update `test_cached_with_hashing` to use a module-level function instead
  of a lambda (lambdas can't be pickled, now that `cache_key` fingerprints `self`)
- Add tests for `fingerprint`, `versioned_fingerprint`, and `cache_key` behavior
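
The `reducer_override` behaviour described above can be checked with a small sketch. It assumes CPython, where `pickle._Pickler` is the private pure-Python implementation; `RecordingPickler` and `types_seen` are hypothetical names used only for illustration and are not part of gt4py.

```python
import io
import pickle


class RecordingPickler(pickle._Pickler):  # private pure-Python pickler (CPython)
    """Hypothetical helper: records every type offered to reducer_override."""

    def __init__(self, file):
        super().__init__(file, protocol=4)
        self.seen = []

    def reducer_override(self, obj):
        self.seen.append(type(obj))
        return NotImplemented  # fall back to the regular pickling logic


def types_seen(obj):
    """Pickle `obj` and return the types that reached reducer_override."""
    pickler = RecordingPickler(io.BytesIO())
    pickler.dump(obj)
    return pickler.seen


seen = types_seen({"a": {1, 2}})
```

With the pure-Python pickler, `dict` and `set` both appear in `seen`. The standard library documentation notes that the C-accelerated `pickle.Pickler` may skip `reducer_override` for exact instances of such built-in types, which is the behaviour this commit works around.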

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…on stage classes

Re-add the `fingerprinter` module-level alias (= `semantic_fingerprinter`) and
the `fingerprint` computed property on all four stage dataclasses
(`DSLFieldOperatorDef`, `FOASTOperatorDef`, `DSLProgramDef`, `PASTProgramDef`).
These were removed when the old `FingerprintedABC`/`FingerprintedMixin` system
was dropped in the 'More refactoring' commit but are still needed by existing
tests and callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Re-add the `fingerprint()` method to the `ir.Node` base class that was
removed along with the `FingerprintedABC`/`FingerprintedMixin` system.

The method is needed by:
- `ffront.lowering_utils` (uses `itir.Expr.fingerprint()` to generate
  unique variable names)
- `ffront.foast_to_gtir` (uses `itir.Expr.fingerprint()` to generate
  unique SymRef names for conditionals)
- `iterator_tests/test_ir.py` (tests that fingerprinting ignores location/type)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
havogt and others added 27 commits June 10, 2026 08:16
…ashing

Replace the pickle-stream content hash (custom picklers + reducer_override
+ MetadataBasedPicklingMixin) with a structural Merkle-style fingerprinter:
an iterative post-order walk that decomposes objects via per-type handlers
and combines child digests bottom-up. Public API (stable_fingerprinter,
semantic_fingerprinter, skipping_fields_node_fingerprinter, cache_key
formula, BUILD_CACHE_VERSION_ID salt) is unchanged.

Fixes:
- RecursionError on deep inputs: the pure-Python pickler exhausted the
  recursion limit at ITIR depth ~74; the iterative walk has no depth limit.
- Local enum classes (e.g. DSL constants in
  test_constant_closure_vars_with_enums) raised TypeError under the
  <locals> guard; enum classes are now fingerprinted by member content.
- Dicts with non-orderable keys (e.g. Dimension) raised TypeError;
  unordered containers now sort the child *digests*, not the values.
- OrderedDicts with different orders collided (false cache hit).
- Object-graph sharing leaked into the hash via the pickle memo
  (value-equal inputs could fingerprint differently).

Cleanup:
- MetadataBasedPicklingMixin and the metadata-based __getstate__ factory
  are removed; field opt-out (gt4py_metadata(fingerprint=False)) is read
  directly from dataclass/datamodel field metadata, so fingerprinting no
  longer alters how classes are actually pickled.
- eve helpers merge_dispatchers, PurePickler (private pickle._Pickler)
  and pickle_reducer_factory are removed (now unused).
- ADR 0023 rewritten; the pickle-based design is recorded under
  alternatives considered.
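
A minimal sketch of the Merkle-style idea described in this commit, under stated assumptions: it uses `hashlib.blake2b` from the standard library instead of xxhash64, handles only a few built-in containers, and the names `fingerprint`, `_decompose`, `_digest` and `Dim` are hypothetical rather than the gt4py implementation.

```python
import dataclasses
import hashlib
from collections import OrderedDict


def _digest(tag, parts):
    """Hash a tag plus length-prefixed byte parts into an 8-byte digest."""
    hasher = hashlib.blake2b(digest_size=8)
    hasher.update(tag.encode())
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big"))
        hasher.update(part)
    return hasher.digest()


def _decompose(obj):
    """Return (tag, children, ordered) for containers, None for atoms."""
    if isinstance(obj, OrderedDict):
        return "odict", list(obj.items()), True
    if isinstance(obj, dict):
        return "dict", list(obj.items()), False
    if isinstance(obj, (list, tuple)):
        return type(obj).__name__, list(obj), True
    if isinstance(obj, (set, frozenset)):
        return type(obj).__name__, list(obj), False
    return None


def fingerprint(root):
    """Iterative post-order walk that combines child digests bottom-up."""
    digests = []  # stack of finished child digests
    work = [("visit", root)]
    while work:
        action, payload = work.pop()
        if action == "visit":
            parts = _decompose(payload)
            if parts is None:
                tag = "atom:" + type(payload).__qualname__
                digests.append(_digest(tag, [repr(payload).encode()]))
                continue
            tag, children, ordered = parts
            work.append(("combine", (tag, len(children), ordered)))
            work.extend(("visit", child) for child in reversed(children))
        else:
            tag, count, ordered = payload
            start = len(digests) - count
            own = digests[start:]
            del digests[start:]
            if not ordered:
                own.sort()  # sort the child digests, not the values
            digests.append(_digest(tag, own))
    return digests[0].hex()


@dataclasses.dataclass(frozen=True)
class Dim:  # stand-in for a hashable but non-orderable key
    name: str
```

Because unordered containers sort digests rather than values, `fingerprint({Dim("I"): 1, Dim("K"): 2})` matches the same dict built in the opposite order, while two `OrderedDict`s with different orders get different fingerprints. The explicit work stack means nesting depth is bounded by memory, not by the interpreter's recursion limit.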
…gt4py into fix/add-step-state-to-caches

# Conflicts:
#	src/gt4py/eve/utils.py
#	tests/eve_tests/unit_tests/test_utils.py

…bras

Separate the traversal scheme from the reduction logic:
- TreeLeaf/TreeNode: carrier-agnostic one-level decomposition vocabulary
- tree_cata: generic iterative post-order fold (explicit stack, id-based
  memoization, cycle back references), reusable with any result type
- fingerprinting becomes tree_cata instantiated with xxhash64 digest
  algebras over the per-type decomposition handler registry

Fingerprint values are unchanged (verified byte-identical on samples).
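
The separation of traversal from reduction can be sketched as a generic fold that takes the reduction functions as parameters. This is an illustrative simplification, not the `tree_cata` from the PR: it omits id-based memoization and cycle back references, and `fold_tree`, `list_children`, `on_atom` and `on_composite` are hypothetical names.

```python
def fold_tree(root, children, on_atom, on_composite):
    """Iterative post-order fold with an explicit stack.

    `children(node)` returns a sequence for composites or None for atoms;
    `on_atom` and `on_composite` supply the reduction for each case.
    """
    results = []
    work = [(False, root)]
    while work:
        expanded, node = work.pop()
        kids = children(node)
        if kids is None:
            results.append(on_atom(node))
        elif expanded:
            # All children are reduced: pop their results and combine them.
            start = len(results) - len(kids)
            child_results = results[start:]
            del results[start:]
            results.append(on_composite(node, child_results))
        else:
            # Revisit this node after its children have been processed.
            work.append((True, node))
            work.extend((False, kid) for kid in reversed(kids))
    return results[0]


def list_children(node):
    return node if isinstance(node, list) else None


tree = [1, [2, 3], [[4]]]
atom_count = fold_tree(tree, list_children, lambda _: 1, lambda _, rs: sum(rs))
depth = fold_tree(tree, list_children, lambda _: 0, lambda _, rs: 1 + max(rs, default=0))
```

The same traversal yields an atom count or a nesting depth depending only on the two reducers passed in; a fingerprinter would pass digest-producing reducers instead.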
- Resolve the qualified name through sys.modules and require identity with
  the object itself, rejecting shadowed/reassigned/deleted globals that the
  string-based locals guard cannot catch.
- Fingerprint defaultdicts including their default_factory.
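
The identity check on by-name references can be sketched as follows. The helper name `resolves_to_itself` and the `shadow_demo` module are hypothetical and only illustrate the idea of resolving `__module__`/`__qualname__` through `sys.modules` and comparing the result with the object itself.

```python
import sys
import types

_MISSING = object()


def resolves_to_itself(obj):
    """True only if looking obj up by qualified name yields obj itself."""
    module_name = getattr(obj, "__module__", None)
    qualname = getattr(obj, "__qualname__", None)
    if not isinstance(module_name, str) or not isinstance(qualname, str):
        return False
    if "<locals>" in qualname:  # string-based guard for local definitions
        return False
    target = sys.modules.get(module_name, _MISSING)
    for part in qualname.split("."):
        target = getattr(target, part, _MISSING)
    return target is obj  # identity, not equality


# Demonstration with a throwaway module whose global gets reassigned.
demo = types.ModuleType("shadow_demo")
sys.modules["shadow_demo"] = demo


def original():
    return 1


original.__module__ = "shadow_demo"
original.__qualname__ = "original"
demo.original = original
before = resolves_to_itself(original)

demo.original = lambda: 2  # the name now points at a different object
after = resolves_to_itself(original)
```

The qualified name still contains no `<locals>` after the reassignment, so a purely string-based guard would accept it; the identity comparison is what rejects the stale reference.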
NumPy ufuncs (e.g. np.cbrt in DSL closure variables) only pickle through
their copyreg.dispatch_table reducer; calling __reduce_ex__ directly
raises. Consult the dispatch table first, like pickle.Pickler does.
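
The lookup order can be sketched without NumPy. The `Handle` class below is a hypothetical stand-in for an object whose `__reduce_ex__` raises but which has a reducer registered through `copyreg.pickle`; `reduce_like_pickle` is likewise an illustrative name.

```python
import copyreg


def reduce_like_pickle(obj, protocol=4):
    """Consult copyreg.dispatch_table before falling back to __reduce_ex__."""
    reducer = copyreg.dispatch_table.get(type(obj))
    if reducer is not None:
        return reducer(obj)
    return obj.__reduce_ex__(protocol)


class Handle:
    """Hypothetical type that can only be reduced via the dispatch table."""

    def __init__(self, name):
        self.name = name

    def __reduce_ex__(self, protocol):
        raise TypeError("cannot reduce Handle directly")


def _reduce_handle(handle):
    return Handle, (handle.name,)


copyreg.pickle(Handle, _reduce_handle)  # registers the reducer globally

constructor, args = reduce_like_pickle(Handle("cbrt"))
```

Here `constructor` is `Handle` and `args` is `("cbrt",)`, whereas calling `Handle("cbrt").__reduce_ex__(4)` directly raises `TypeError`.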
Propagate the rename through docstrings, the tree_cata doctest, tests
and ADR 0023; cache _class_tag with functools.cache.

atom_reducer/composite_reducer/cycle_reducer (formerly leaf_alg/node_alg/
cycle_alg), with matching parameter and prose updates in tests and ADR.

Extractor (formerly ObjectDecomposer), CompositeContent (formerly
ObjectDecomposition), AtomicContent (formerly DecompositionAtom),
AtomicReducer (formerly AtomReducer), with matching extract/atomic_reducer
parameter names and prose updates in docstrings, tests and ADR 0023.

make_fingerprinter() now takes an Extractor instead of a mapping of
per-type overrides; the new make_extractor(overrides) builds extractors
from the default rules, and the module-level 'extract' is the default
Extractor. skipping_fields_node_fingerprinter returns extractor
overrides (return_extractors=) for composition via make_extractor.

Deconstructor (formerly Extractor), Deconstruction (formerly
CompositeContent), EmptyDeconstruction (formerly AtomicContent),
Collapser (formerly AtomicReducer), CompositeCollapser (formerly
CompositeReducer), with matching deconstruct/collapser parameter and
factory names (make_deconstructor, return_deconstructors, ...).

Also reconcile with the reshaped driver API: cycles are now collapsed
by the regular collapser as back-reference EmptyDeconstructions when
allow_cycles=True (the CycleCollapser alias and the fingerprint cycle
collapser are gone); fingerprint digests are unchanged.

Deconstruction becomes the base Collection of deconstructed pieces with
EmptyDeconstruction (terminal, no pieces) and
OrderInsensitiveDeconstruction (pieces collapse in canonical order)
subclasses, replacing the 'ordered' flag; builders (from_pieces,
from_typed_value, from_reference) are the construction API and all call
sites, tests, docstrings and ADR 0023 adopt the pieces vocabulary.
Fingerprint digests are unchanged.

- reduce_object takes a single 'collapser' (renamed param 'deconstructor'):
  composites are collapsed by re-wrapping the already-collapsed piece
  results via dataclasses.replace, and the driver canonicalizes the order
  of OrderInsensitiveDeconstruction results itself.
- The fingerprint digest scheme is unified accordingly
  (xxh64('node' + state + 'pieces' + digests) for empty and composite
  alike), intentionally changing all fingerprint values (no migration
  needed, stale cache keys are simply never hit again).
- make_fingerprinter is replaced by functools.partial composition;
  make_deconstructor gains a 'fallback' parameter and fingerprinters use
  fingerprint_fallback, which layers the gt4py_metadata(fingerprint=False)
  opt-out over the default dataclass/datamodel field deconstruction.
  Dispatching dataclasses/datamodels through virtual ABC registry keys
  (xtyping.DataclassABC / new datamodels.DataModelABC) was attempted but
  breaks stdlib singledispatch MRO composition for eve's generic
  datamodel classes (RuntimeError: Inconsistent hierarchy), so they are
  handled in the fallbacks instead.
- New eve datamodels.DataModelABC (virtual ABC via subclass hook) with
  unit test.
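
A virtual ABC driven by a subclass hook can be sketched as below. `DataclassLike` is a hypothetical example keyed on standard dataclasses, not the `DataModelABC` added in the PR; it returns `NotImplemented` whenever the hook has no positive answer, so checks on derived classes and on non-matching types fall through to the regular ABC machinery.

```python
import abc
import dataclasses


class DataclassLike(abc.ABC):
    """Hypothetical virtual ABC: any dataclass counts as a subclass."""

    @classmethod
    def __subclasshook__(cls, candidate):
        # Only answer for the base itself; defer everything else.
        if cls is DataclassLike and dataclasses.is_dataclass(candidate):
            return True
        return NotImplemented


class Special(DataclassLike):
    """A real subclass; the hook must not claim dataclasses for it."""


@dataclasses.dataclass
class Point:
    x: int
```

With this hook, `issubclass(Point, DataclassLike)` holds without any registration, while `issubclass(Point, Special)` does not, because the hook defers for every class other than the base.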
Also fix the import-order NameError reintroduced by moving the public
deconstructor builders above their implementation dependencies.

Address review findings on the structural-fingerprinting refactor:

- Restore dedicated cache key functions (compilation_hash,
  fingerprint_compilable_program) for the gtfn/dace executor and
  persistent translation caches, reimplemented on the new machinery:
  location/type-agnostic program fingerprint, order-sensitive by-id
  offset providers for the in-memory executor (fixes silent wrong
  results from reordered offset_provider dicts and avoids content-hashing
  connectivity tables on every lookup), and location-stable persistent
  keys.
- Fingerprint DSL definition functions by source code only (drop
  filename/line/column) so textually identical operators match.
- Fix DataModelABC.__subclasshook__ to defer non-base checks via
  NotImplemented.
- Rename env var to GT4PY_BUILD_CACHE_VERSION_ID.
- Cache CachedStep self-fingerprint instead of re-walking the step graph
  per lookup.
- Unify the three drifted fields-deconstruction sites.
- Fix WorkflowPatterns.md cached-step example.

…/subclass fingerprinting, perf

Round 2, addressing the automated PR review findings:

- [2/10] Restore location-SENSITIVE frontend-stage cache keys. The PR made
  FOAST/PAST stage keys and DSL-definition keys location-insensitive, so two
  textually identical operators in different files aliased to one cached
  lowering, baking the first's SourceLocations into the second (wrong error
  locations). FOAST/PAST node fingerprints now include 'location'; the DSL
  definition function is fingerprinted by its full SourceDefinition again.
  Location-agnostic fingerprinting remains for the ITIR persistent cache and
  var-name generation. Adds a location-sensitivity regression test.
- [3/10] Fix hard crash at @field_operator decoration when closure vars hold a
  locally-defined FrozenNamespace subclass (icon4py constants idiom): register
  a content-based deconstructor for eve.utils.Namespace instead of falling
  through to __reduce_ex__, which referenced the non-importable local class.
- [7/10] Capture builtin-container *subclass* instance __dict__ state so a dict/
  list/set subclass with extra attributes no longer collides with a plain
  instance (false cache hit). Exact builtins are unaffected.
- [10/10] Cheap fingerprinting-hot-path wins: cache _dataclass_fields per type;
  construct the recombined Deconstruction directly instead of dataclasses.replace.

[1/10],[4/10],[5/10],[6/10] were already addressed in the previous commit.

…fn cache-key crash)

The frontend-stage cache keys fingerprint program-likes (FieldOperator,
Program, ...) when they appear in another program's closure variables. The
PR removed the dedicated FieldOperator/Program fingerprint registrations,
so the whole 'backend' object graph is now traversed. That graph can hold
non-importable objects — e.g. the 'unittest.mock.Mock' backend that
test_compiled_program swaps in, whose dynamically-created 'Mock' subclass
is rejected by the by-qualified-name reference deconstructor — crashing
fingerprinting with TypeError (test_compiled_program gtfn variants, internal
nomesh CI jobs).

Exclude 'backend' from the fingerprint via gt4py_metadata(fingerprint=False):
it does not affect the lowering (which is what these stage caches key), the
backend-specific compilation is keyed separately in the backend's own caches,
and traversing the whole backend graph per cache lookup was wasteful and
fragile. Adds a regression test.
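
The field-level opt-out can be illustrated with plain dataclass metadata. This sketch assumes a metadata key named `"fingerprint"` and a flat, `repr`-based hash; `ProgramLike` and `fingerprint_fields` are hypothetical names, and the actual `gt4py_metadata` helper may store the flag differently.

```python
import dataclasses
import hashlib


@dataclasses.dataclass(frozen=True)
class ProgramLike:
    definition: str
    # Excluded from the fingerprint: it does not affect the lowering.
    backend: object = dataclasses.field(
        default=None, metadata={"fingerprint": False}
    )


def fingerprint_fields(obj):
    """Hash only the fields that have not opted out via metadata."""
    hasher = hashlib.blake2b(digest_size=8)
    hasher.update(type(obj).__qualname__.encode())
    for field in dataclasses.fields(obj):
        if field.metadata.get("fingerprint", True):
            hasher.update(field.name.encode())
            hasher.update(repr(getattr(obj, field.name)).encode())
    return hasher.hexdigest()


first = ProgramLike("a + b", backend=object())
second = ProgramLike("a + b", backend=object())
```

`first` and `second` carry different backend objects, yet they fingerprint identically because the `backend` field is never visited; changing `definition` still changes the result.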
edopao commented Jun 12, 2026

Contributor Author

cscs-ci run default


3 participants