Don't merge: Test dace gpu cached #2652
Draft
edopao wants to merge 69 commits into
…g in fingerprinters

- Add `fingerprint` (alias for `sorting_sets_fingerprinter`) and `versioned_fingerprint` (includes BUILD_CACHE_VERSION_ID) to `utils.py`
- Fix `skipping_fields_node_fingerprinter` to pass the reducer dict as a positional arg (not keyword) to `CustomPicklingFingerprinter.from_reducers`
- Fix `custom_overriden_pickler` in `eve/utils.py` to use `pickle._Pickler` (pure Python) instead of the C-extension `pickle.Pickler` so that `reducer_override` is called for built-in types like `dict` and `set` (the C-extension fast path bypasses `reducer_override` for built-in types)
- Update `test_cached_with_hashing` to use a module-level function instead of a lambda (lambdas can't be pickled, now that `cache_key` fingerprints `self`)
- Add tests for `fingerprint`, `versioned_fingerprint`, and `cache_key` behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
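The `reducer_override` point above can be checked with the standard library alone. The sketch below is not the gt4py code; `types_seen_by_reducer_override` is a hypothetical helper that records which object types reach the hook under a given pickler class. It relies on `pickle._Pickler`, which is a private CPython name.

```python
import io
import pickle


def types_seen_by_reducer_override(pickler_cls, obj):
    """Pickle `obj` and return the types that reached `reducer_override`."""
    seen = []

    class RecordingPickler(pickler_cls):
        def reducer_override(self, o):
            seen.append(type(o))
            return NotImplemented  # fall back to the regular pickling path

    RecordingPickler(io.BytesIO(), protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return seen


payload = {"a": {1, 2}}
# Pure-Python pickler: the hook is consulted for every object, including the dict.
python_types = types_seen_by_reducer_override(pickle._Pickler, payload)
# C-accelerated pickler: exact built-in containers may be handled before the hook.
c_types = types_seen_by_reducer_override(pickle.Pickler, payload)
```

Comparing `python_types` with `c_types` on a given interpreter shows which built-in types the accelerated path handles without consulting the hook.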
…on stage classes

Re-add the `fingerprinter` module-level alias (= `semantic_fingerprinter`) and the `fingerprint` computed property on all four stage dataclasses (`DSLFieldOperatorDef`, `FOASTOperatorDef`, `DSLProgramDef`, `PASTProgramDef`). These were removed when the old `FingerprintedABC`/`FingerprintedMixin` system was dropped in the 'More refactoring' commit but are still needed by existing tests and callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Re-add the `fingerprint()` method to the `ir.Node` base class that was removed along with the `FingerprintedABC`/`FingerprintedMixin` system. The method is needed by:

- `ffront.lowering_utils` (uses `itir.Expr.fingerprint()` to generate unique variable names)
- `ffront.foast_to_gtir` (uses `itir.Expr.fingerprint()` to generate unique SymRef names for conditionals)
- `iterator_tests/test_ir.py` (tests that fingerprinting ignores location/type)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ashing

Replace the pickle-stream content hash (custom picklers + reducer_override + MetadataBasedPicklingMixin) with a structural Merkle-style fingerprinter: an iterative post-order walk that decomposes objects via per-type handlers and combines child digests bottom-up. Public API (stable_fingerprinter, semantic_fingerprinter, skipping_fields_node_fingerprinter, cache_key formula, BUILD_CACHE_VERSION_ID salt) is unchanged.

Fixes:
- RecursionError on deep inputs: the pure-Python pickler exhausted the recursion limit at ITIR depth ~74; the iterative walk has no depth limit.
- Local enum classes (e.g. DSL constants in test_constant_closure_vars_with_enums) raised TypeError under the <locals> guard; enum classes are now fingerprinted by member content.
- Dicts with non-orderable keys (e.g. Dimension) raised TypeError; unordered containers now sort the child *digests*, not the values.
- OrderedDicts with different orders collided (false cache hit).
- Object-graph sharing leaked into the hash via the pickle memo (value-equal inputs could fingerprint differently).

Cleanup:
- MetadataBasedPicklingMixin and the metadata-based __getstate__ factory are removed; field opt-out (gt4py_metadata(fingerprint=False)) is read directly from dataclass/datamodel field metadata, so fingerprinting no longer alters how classes really pickle.
- eve helpers merge_dispatchers, PurePickler (private pickle._Pickler) and pickle_reducer_factory are removed (now unused).
- ADR 0023 rewritten; the pickle-based design is recorded under alternatives considered.
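A minimal sketch of the idea in this commit message, not the gt4py implementation: an explicit-stack post-order walk that hashes children first and sorts the child digests of unordered containers. The names `_decompose`, `_digest` and `fingerprint` are hypothetical, `hashlib.blake2b` stands in for the project's digest function, atoms are hashed via `repr`, and the input is assumed to be acyclic.

```python
import hashlib
from collections import OrderedDict

_DONE = object()  # sentinel: a frame has no more children to visit


def _digest(tag: str, payload: bytes, children: list) -> bytes:
    h = hashlib.blake2b(digest_size=8)
    for part in (tag.encode(), payload, *children):
        h.update(len(part).to_bytes(8, "little"))  # length prefix avoids ambiguity
        h.update(part)
    return h.digest()


def _decompose(obj):
    """Return (tag, payload, children, order_sensitive) for one level of `obj`."""
    if isinstance(obj, OrderedDict):
        return "odict", b"", list(obj.items()), True
    if isinstance(obj, dict):
        return "dict", b"", list(obj.items()), False
    if isinstance(obj, (set, frozenset)):
        return "set", b"", list(obj), False
    if isinstance(obj, (list, tuple)):
        return type(obj).__name__, b"", list(obj), True
    return type(obj).__name__, repr(obj).encode(), [], True  # atom


def fingerprint(root) -> str:
    tag, payload, children, ordered = _decompose(root)
    stack = [(tag, payload, iter(children), ordered, [])]
    while True:
        tag, payload, pending, ordered, done = stack[-1]
        child = next(pending, _DONE)
        if child is not _DONE:  # descend before finishing the parent
            t, p, c, o = _decompose(child)
            stack.append((t, p, iter(c), o, []))
            continue
        stack.pop()
        # Unordered containers sort the child digests, never the values.
        digest = _digest(tag, payload, done if ordered else sorted(done))
        if not stack:
            return digest.hex()
        stack[-1][4].append(digest)
```

Because only digests (bytes) are sorted, keys that define no ordering, such as complex numbers here, do not raise, and the walk depth is bounded by memory rather than the recursion limit.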
…gt4py into fix/add-step-state-to-caches

# Conflicts:
#   src/gt4py/eve/utils.py
#   tests/eve_tests/unit_tests/test_utils.py
…bras

Separate the traversal scheme from the reduction logic:

- TreeLeaf/TreeNode: carrier-agnostic one-level decomposition vocabulary
- tree_cata: generic iterative post-order fold (explicit stack, id-based memoization, cycle back references), reusable with any result type
- fingerprinting becomes tree_cata instantiated with xxhash64 digest algebras over the per-type decomposition handler registry

Fingerprint values are unchanged (verified byte-identical on samples).
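The separation of traversal from reduction can be illustrated with a small generic fold. This is a sketch under assumptions, not `tree_cata` itself: `fold_tree`, `children_of` and `combine` are hypothetical names, nested lists stand in for the tree, and cycle back references are left out (the input is assumed acyclic).

```python
from typing import Any, Callable, Iterable, List


def fold_tree(
    root: Any,
    children_of: Callable[[Any], Iterable[Any]],
    combine: Callable[[Any, List[Any]], Any],
) -> Any:
    """Iterative post-order fold; shared nodes are reduced once (memo keyed by id)."""
    results = {}
    stack = [(root, None)]  # (node, children) where None means "not expanded yet"
    while stack:
        node, kids = stack.pop()
        if kids is None:
            if id(node) in results:
                continue  # already reduced through another parent
            kids = list(children_of(node))
            stack.append((node, kids))  # revisit once the children are done
            stack.extend((k, None) for k in reversed(kids))
        else:
            results[id(node)] = combine(node, [results[id(k)] for k in kids])
    return results[id(root)]


def children(node):
    return node if isinstance(node, list) else ()


def count_leaves(node, child_results):
    return sum(child_results) if isinstance(node, list) else 1


def depth(node, child_results):
    return 1 + max(child_results, default=0) if isinstance(node, list) else 0


tree = [1, [2, 3], [[4]]]
leaf_count = fold_tree(tree, children, count_leaves)
tree_depth = fold_tree(tree, children, depth)
```

The same driver serves both reductions; a fingerprinter would be one more `combine` that returns digests.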
- Resolve the qualified name through sys.modules and require identity with the object itself, rejecting shadowed/reassigned/deleted globals that the string-based locals guard cannot catch. - Fingerprint defaultdicts including their default_factory.
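The identity check described above might look roughly like the following. `importable_reference` is a hypothetical name and the exact error handling is an assumption; only the `sys.modules` lookup and the `is` comparison come from the commit message.

```python
import collections
import sys


def importable_reference(obj):
    """Return (module, qualname) only if that path resolves back to `obj` itself."""
    module_name = getattr(obj, "__module__", None)
    qualname = getattr(obj, "__qualname__", None)
    if not module_name or not qualname or "<locals>" in qualname:
        raise TypeError(f"{obj!r} has no importable qualified name")
    target = sys.modules.get(module_name)
    if target is None:
        raise TypeError(f"module {module_name!r} is not loaded")
    for part in qualname.split("."):
        try:
            target = getattr(target, part)
        except AttributeError:
            raise TypeError(f"{module_name}.{qualname} no longer exists") from None
    if target is not obj:
        # Shadowed or reassigned global: the name now points at something else.
        raise TypeError(f"{module_name}.{qualname} does not refer to {obj!r}")
    return module_name, qualname


reference = importable_reference(collections.OrderedDict)
# A class that merely claims to be collections.OrderedDict is rejected.
impostor = type("OrderedDict", (), {"__module__": "collections"})
```

A string test for `<locals>` alone would accept `impostor`; the identity comparison is what rejects it.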
NumPy ufuncs (e.g. np.cbrt in DSL closure variables) only pickle through their copyreg.dispatch_table reducer; calling __reduce_ex__ directly raises. Consult the dispatch table first, like pickle.Pickler does.
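A sketch of the lookup order described above, using only the standard library. `reduce_like_pickle` and `OnlyViaDispatchTable` are hypothetical; the class mimics an object whose direct `__reduce_ex__` call fails, and a private table is passed in so the global `copyreg.dispatch_table` is left untouched.

```python
import copyreg


def reduce_like_pickle(obj, protocol=4, dispatch_table=None):
    """Consult the dispatch table for type(obj) first, then fall back to __reduce_ex__."""
    table = copyreg.dispatch_table if dispatch_table is None else dispatch_table
    reducer = table.get(type(obj))
    if reducer is not None:
        return reducer(obj)
    return obj.__reduce_ex__(protocol)


class OnlyViaDispatchTable:
    """Stand-in for an object that only reduces through a registered reducer."""

    def __init__(self, name):
        self.name = name

    def __reduce_ex__(self, protocol):
        raise TypeError("cannot be reduced directly")


def _reduce_stand_in(obj):
    return OnlyViaDispatchTable, (obj.name,)


table = {OnlyViaDispatchTable: _reduce_stand_in}
reduced = reduce_like_pickle(OnlyViaDispatchTable("cbrt"), dispatch_table=table)
```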
Propagate the rename through docstrings, the tree_cata doctest, tests and ADR 0023; cache _class_tag with functools.cache.
atom_reducer/composite_reducer/cycle_reducer (formerly leaf_alg/node_alg/cycle_alg), with matching parameter and prose updates in tests and ADR.
Extractor (formerly ObjectDecomposer), CompositeContent (formerly ObjectDecomposition), AtomicContent (formerly DecompositionAtom), AtomicReducer (formerly AtomReducer), with matching extract/atomic_reducer parameter names and prose updates in docstrings, tests and ADR 0023.
make_fingerprinter() now takes an Extractor instead of a mapping of per-type overrides; the new make_extractor(overrides) builds extractors from the default rules, and the module-level 'extract' is the default Extractor. skipping_fields_node_fingerprinter returns extractor overrides (return_extractors=) for composition via make_extractor.
Deconstructor (formerly Extractor), Deconstruction (formerly CompositeContent), EmptyDeconstruction (formerly AtomicContent), Collapser (formerly AtomicReducer), CompositeCollapser (formerly CompositeReducer), with matching deconstruct/collapser parameter and factory names (make_deconstructor, return_deconstructors, ...). Also reconcile with the reshaped driver API: cycles are now collapsed by the regular collapser as back-reference EmptyDeconstructions when allow_cycles=True (the CycleCollapser alias and the fingerprint cycle collapser are gone); fingerprint digests are unchanged.
Deconstruction becomes the base Collection of deconstructed pieces with EmptyDeconstruction (terminal, no pieces) and OrderInsensitiveDeconstruction (pieces collapse in canonical order) subclasses, replacing the 'ordered' flag; builders (from_pieces, from_typed_value, from_reference) are the construction API and all call sites, tests, docstrings and ADR 0023 adopt the pieces vocabulary. Fingerprint digests are unchanged.
- reduce_object takes a single 'collapser' (renamed param 'deconstructor'):
composites are collapsed by re-wrapping the already-collapsed piece
results via dataclasses.replace, and the driver canonicalizes the order
of OrderInsensitiveDeconstruction results itself.
- The fingerprint digest scheme is unified accordingly
(xxh64('node' + state + 'pieces' + digests) for empty and composite
alike), intentionally changing all fingerprint values (no migration
needed, stale cache keys are simply never hit again).
- make_fingerprinter is replaced by functools.partial composition;
make_deconstructor gains a 'fallback' parameter and fingerprinters use
fingerprint_fallback, which layers the gt4py_metadata(fingerprint=False)
opt-out over the default dataclass/datamodel field deconstruction.
Dispatching dataclasses/datamodels through virtual ABC registry keys
(xtyping.DataclassABC / new datamodels.DataModelABC) was attempted but
breaks stdlib singledispatch MRO composition for eve's generic
datamodel classes (RuntimeError: Inconsistent hierarchy), so they are
handled in the fallbacks instead.
- New eve datamodels.DataModelABC (virtual ABC via subclass hook) with
unit test.
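For readers unfamiliar with virtual ABCs, the pattern mentioned in the last bullet can be sketched as follows. `DataclassLikeABC` is a hypothetical stand-in, not `DataModelABC`: it recognizes any dataclass through `__subclasshook__` and returns `NotImplemented` otherwise, so subclasses of the ABC and the regular registry keep their normal behavior.

```python
import abc
import dataclasses


class DataclassLikeABC(abc.ABC):
    """Virtual base: any dataclass counts as a subclass without inheriting from it."""

    @classmethod
    def __subclasshook__(cls, candidate):
        if cls is not DataclassLikeABC:
            return NotImplemented  # subclasses of the ABC use the default checks
        if dataclasses.is_dataclass(candidate):
            return True
        return NotImplemented  # defer to the registry and real inheritance


class SpecialModel(DataclassLikeABC):
    """A concrete subclass; the hook must not claim arbitrary dataclasses for it."""


@dataclasses.dataclass
class Point:
    x: int
    y: int
```

With this hook, `isinstance(Point(1, 2), DataclassLikeABC)` holds while `issubclass(Point, SpecialModel)` does not.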
Also fix the import-order NameError reintroduced by moving the public deconstructor builders above their implementation dependencies.
Address review findings on the structural-fingerprinting refactor:

- Restore dedicated cache key functions (compilation_hash, fingerprint_compilable_program) for the gtfn/dace executor and persistent translation caches, reimplemented on the new machinery: location/type-agnostic program fingerprint, order-sensitive by-id offset providers for the in-memory executor (fixes silent wrong results from reordered offset_provider dicts and avoids content-hashing connectivity tables on every lookup), and location-stable persistent keys.
- Fingerprint DSL definition functions by source code only (drop filename/line/column) so textually identical operators match.
- Fix DataModelABC.__subclasshook__ to defer non-base checks via NotImplemented.
- Rename env var to GT4PY_BUILD_CACHE_VERSION_ID.
- Cache CachedStep self-fingerprint instead of re-walking the step graph per lookup.
- Unify the three drifted fields-deconstruction sites.
- Fix WorkflowPatterns.md cached-step example.
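One way to read "order-sensitive by-id" is sketched below; `offset_provider_key` is a hypothetical helper, not the restored gt4py function. The key records each name with the identity of its table, in dict order, and never inspects the table contents. Such keys are only meaningful while the tables are alive, which suits an in-memory cache.

```python
def offset_provider_key(offset_provider: dict) -> tuple:
    """Cheap in-memory cache key: names plus object identities, in insertion order."""
    return tuple((name, id(table)) for name, table in offset_provider.items())


# Plain lists stand in for connectivity tables in this sketch.
edge_to_vertex = [[0, 1], [1, 2]]
vertex_to_edge = [[0], [0, 1], [1]]

key_a = offset_provider_key({"E2V": edge_to_vertex, "V2E": vertex_to_edge})
key_b = offset_provider_key({"V2E": vertex_to_edge, "E2V": edge_to_vertex})
key_c = offset_provider_key({"E2V": edge_to_vertex, "V2E": vertex_to_edge})
```

Here `key_a` and `key_c` match, while the reordered dict behind `key_b` gets its own entry.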
…/subclass fingerprinting, perf

Round 2, addressing the automated PR review findings:

- [2/10] Restore location-SENSITIVE frontend-stage cache keys. The PR made FOAST/PAST stage keys and DSL-definition keys location-insensitive, so two textually identical operators in different files aliased to one cached lowering, baking the first's SourceLocations into the second (wrong error locations). FOAST/PAST node fingerprints now include 'location'; the DSL definition function is fingerprinted by its full SourceDefinition again. Location-agnostic fingerprinting remains for the ITIR persistent cache and var-name generation. Adds a location-sensitivity regression test.
- [3/10] Fix hard crash at @field_operator decoration when closure vars hold a locally-defined FrozenNamespace subclass (icon4py constants idiom): register a content-based deconstructor for eve.utils.Namespace instead of falling through to __reduce_ex__, which referenced the non-importable local class.
- [7/10] Capture builtin-container *subclass* instance __dict__ state so a dict/list/set subclass with extra attributes no longer collides with a plain instance (false cache hit). Exact builtins are unaffected.
- [10/10] Cheap fingerprinting-hot-path wins: cache _dataclass_fields per type; construct the recombined Deconstruction directly instead of dataclasses.replace.

[1/10],[4/10],[5/10],[6/10] were already addressed in the previous commit.
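The [7/10] collision can be reproduced in miniature. The helper below is a hypothetical sketch, limited to dicts and lists: it builds a comparison key from the type name, the items, and, for subclasses only, the instance `__dict__`.

```python
_EXACT_BUILTINS = (dict, list)


def container_key(obj):
    """Key a dict/list (or subclass instance) by type, items and extra attributes."""
    items = tuple(obj.items()) if isinstance(obj, dict) else tuple(obj)
    if type(obj) in _EXACT_BUILTINS:
        extra = ()  # exact builtins carry no instance __dict__
    else:
        extra = tuple(sorted(vars(obj).items()))  # attributes added by the subclass
    return (type(obj).__qualname__, items, extra)


class TaggedList(list):
    """List subclass with one extra attribute besides its items."""


first = TaggedList([1, 2])
first.tag = "cell"
second = TaggedList([1, 2])
second.tag = "edge"
```

Without the `extra` component, `first` and `second` would produce the same key even though they differ in `tag`.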
…fn cache-key crash)

The frontend-stage cache keys fingerprint program-likes (FieldOperator, Program, ...) when they appear in another program's closure variables. The PR removed the dedicated FieldOperator/Program fingerprint registrations, so the whole 'backend' object graph is now traversed. That graph can hold non-importable objects (e.g. the 'unittest.mock.Mock' backend that test_compiled_program swaps in, whose dynamically-created 'Mock' subclass is rejected by the by-qualified-name reference deconstructor), crashing fingerprinting with TypeError (test_compiled_program gtfn variants, internal nomesh CI jobs).

Exclude 'backend' from the fingerprint via gt4py_metadata(fingerprint=False): it does not affect the lowering (which is what these stage caches key), the backend-specific compilation is keyed separately in the backend's own caches, and traversing the whole backend graph per cache lookup was wasteful and fragile.

Adds a regression test.
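A field-level opt-out of this kind can be sketched with plain dataclass metadata. The metadata key, `fingerprinted_fields` and `ProgramLike` are all hypothetical here; the real `gt4py_metadata` helper may store the flag differently.

```python
import dataclasses

FINGERPRINT_KEY = "fingerprint"  # hypothetical metadata key, assumed for this sketch


def fingerprinted_fields(obj) -> dict:
    """Return the fields that take part in fingerprinting (opt-out via metadata)."""
    return {
        f.name: getattr(obj, f.name)
        for f in dataclasses.fields(obj)
        if f.metadata.get(FINGERPRINT_KEY, True)
    }


@dataclasses.dataclass(frozen=True)
class ProgramLike:
    definition: str
    # Excluded: the backend does not influence the lowering being cached.
    backend: object = dataclasses.field(default=None, metadata={FINGERPRINT_KEY: False})


with_mock = fingerprinted_fields(ProgramLike("a + b", backend=object()))
without = fingerprinted_fields(ProgramLike("a + b"))
```

Since the walk never reaches `backend`, an arbitrary or non-importable object in that field cannot break or change the key.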
…cata' into test-dace_gpu_cached
cscs-ci run default
No description provided.