LayoutId mirrors the sealed EncodingId shape — WellKnown constants (FLAT, CHUNKED, STRUCT, ZONED, STATS, DICT) plus Custom — because layouts are runtime-pluggable in the Rust reference (two separate footer spec namespaces sharing the string wire form; vortex.flat is layout-only). Layout's misnamed String encodingId component becomes LayoutId layoutId; unknown layouts still fail loudly (Rust default, no allowUnknown for layouts), now with a typed id in the error. Compat fix uncovered by the reference check: Rust renamed the zone-map layout id to vortex.zoned, keeping vortex.stats as legacy alias — the reader now routes BOTH through the zoned path, so files from current Rust writers scan and prune correctly. The writer keeps emitting vortex.stats, which old and new Rust readers accept; integration oracle confirms byte-identical output. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
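A minimal sketch of the WellKnown/Custom shape this commit describes, assuming Java 17+ sealed types. Only `vortex.flat`, `vortex.zoned` and `vortex.stats` are wire ids named in this PR; the other three strings, the `wire()` accessor and the `parse` helper are illustrative assumptions, not the project's actual API.

```java
// Hypothetical sketch; the real core.model.LayoutId may differ in every detail.
sealed interface LayoutId permits LayoutId.WellKnown, LayoutId.Custom {

    /** The string form used on the wire (accessor name is an assumption). */
    String wire();

    enum WellKnown implements LayoutId {
        FLAT("vortex.flat"),
        CHUNKED("vortex.chunked"), // assumed wire string
        STRUCT("vortex.struct"),   // assumed wire string
        ZONED("vortex.zoned"),
        STATS("vortex.stats"),     // legacy alias for the zone-map layout
        DICT("vortex.dict");       // assumed wire string

        private final String wire;

        WellKnown(String wire) {
            this.wire = wire;
        }

        @Override
        public String wire() {
            return wire;
        }
    }

    /** Any id that is not a well-known constant, kept verbatim. */
    record Custom(String wire) implements LayoutId {
        public Custom {
            if (wire == null || wire.isEmpty()) {
                throw new IllegalArgumentException("layout id must be non-empty");
            }
        }
    }

    /** Strings at the boundary, types inside: parse once, dispatch on the typed id. */
    static LayoutId parse(String wire) {
        for (WellKnown known : WellKnown.values()) {
            if (known.wire().equals(wire)) {
                return known;
            }
        }
        return new Custom(wire);
    }
}
```

Parsing never fails for a non-empty id in this sketch; whether an unknown layout is an error is decided later, at dispatch, which is where the "fail loudly" behavior would live.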
Layout and ZonedStatsSchema get their own package, mirroring reader.decode on the encoding side — and giving the future LayoutDecoder SPI a landing zone. FlatSegmentDecoder stays in the reader root: its only callers live there and moving it would force it back to public. Pitest targetClasses updated for the new Layout FQN. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
String encodingId predated the sealed EncodingId — a closed enum could not represent an unknown id, so the raw string was the only option. Now the component is typed: a Custom, or a WellKnown whose decoder is not registered. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
LayoutDecoder + LayoutRegistry in reader.layout mirror the ReadRegistry idiom (builder final-freeze, string-keyed dispatch, duplicate registration throws, no service file — programmatic registration like ExtensionDecoder). The four built-ins move out of ScanIterator verbatim: Flat, Chunked, Zoned (claims both the canonical vortex.zoned and legacy vortex.stats ids via the layoutIds() set), Dict. ScanIterator.decodeLayout is now one registry call; zone-map pruning and chunk planning keep inspecting built-ins only — the SPI covers full-column subtree decode. Wired end-to-end per the no-decorative-flags rule: VortexHandle gains layoutRegistry(), both readers take open(..., LayoutRegistry) overloads, and a scan through a custom registry is proven by test. Unknown layouts still fail loudly (Rust default). Reverses the "Layout is a fixed set, no SPI" design decision — the reference implementation treats layouts as runtime-pluggable. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
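The registry idiom described above (builder that freezes on build, string-keyed dispatch, duplicate registration throws, one decoder claiming several ids, unknown ids failing loudly) could look roughly like this. It is a sketch under assumptions: `StubDecoder`, `lookup`, the exception types and the reduced `LayoutDecoder` interface are hypothetical, and "final-freeze" is read here as "no registration after `build()`".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, reduced SPI: the real LayoutDecoder also decodes; only id claiming is shown.
interface LayoutDecoder {
    Set<String> layoutIds();
}

// Illustrative decoder stand-in; a record accessor satisfies layoutIds().
record StubDecoder(String name, Set<String> layoutIds) implements LayoutDecoder {}

final class LayoutRegistry {
    private final Map<String, LayoutDecoder> byId;

    private LayoutRegistry(Map<String, LayoutDecoder> byId) {
        this.byId = Map.copyOf(byId); // immutable snapshot
    }

    static Builder builder() {
        return new Builder();
    }

    /** Unknown layouts fail loudly instead of being skipped. */
    LayoutDecoder lookup(String layoutId) {
        LayoutDecoder decoder = byId.get(layoutId);
        if (decoder == null) {
            throw new IllegalStateException("unknown layout: " + layoutId);
        }
        return decoder;
    }

    static final class Builder {
        private final Map<String, LayoutDecoder> byId = new HashMap<>();
        private boolean frozen;

        Builder register(LayoutDecoder decoder) {
            if (frozen) {
                throw new IllegalStateException("registry already built");
            }
            // Check every claimed id first so a failed registration leaves no partial state.
            for (String id : decoder.layoutIds()) {
                if (byId.containsKey(id)) {
                    throw new IllegalArgumentException("duplicate layout id: " + id);
                }
            }
            for (String id : decoder.layoutIds()) {
                byId.put(id, decoder);
            }
            return this;
        }

        LayoutRegistry build() {
            frozen = true;
            return new LayoutRegistry(byId);
        }
    }
}
```

Registering a single zoned decoder with `Set.of("vortex.zoned", "vortex.stats")` makes both ids resolve to the same instance, which is how the canonical id and the legacy alias can share one code path.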
Review findings on the LayoutDecoder SPI:
- ChunkedLayoutDecoder decoded its leaves via a direct static FlatLayoutDecoder call, silently bypassing the registry: a custom decoder registered for a leaf id was not honored under a chunked parent, making the SPI partially decorative. Leaves now route through ctx.decodeChild; the end-to-end test asserts the flat delegator itself fires during a real scan. Integration oracle confirms identical behavior for built-ins (dict leaves under chunked included).
- ScanLayoutContext.segmentSpec and DictLayoutDecoder child access now guard malformed indexes/arity with VortexException instead of leaking IndexOutOfBoundsException from untrusted input.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
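The second finding, validating indexes and arity that come from untrusted file metadata, can be sketched as below. `VortexException` here is a local stand-in (the project's real class and constructor are not shown in this PR), and `LayoutGuards`, `childAt` and `requireArity` are hypothetical helper names.

```java
import java.util.List;

// Stand-in for the project's exception type; the real signature is an assumption.
final class VortexException extends RuntimeException {
    VortexException(String message) {
        super(message);
    }
}

final class LayoutGuards {
    private LayoutGuards() {}

    /** Returns children[index], reporting a malformed index as a domain error. */
    static <T> T childAt(List<T> children, int index, String layout) {
        if (index < 0 || index >= children.size()) {
            throw new VortexException(
                layout + ": child index " + index + " out of range [0, " + children.size() + ")");
        }
        return children.get(index);
    }

    /** Rejects a layout node whose child count does not match what the decoder expects. */
    static void requireArity(List<?> children, int expected, String layout) {
        if (children.size() != expected) {
            throw new VortexException(
                layout + ": expected " + expected + " children, found " + children.size());
        }
    }
}
```

The point of the explicit check is that callers see one exception type for corrupt input, with the layout id in the message, rather than a bare `IndexOutOfBoundsException` from deep inside the decoder.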
Extended the PR with the LayoutDecoder SPI (fc488d0 + dd196f1): layout decode is now pluggable via |
Renamed after its exact Rust counterpart (SerializedArray in vortex-array/src/serde.rs: "a parsed but not-yet-decoded deserialized array" whose decode() resolves the encoding id against the spec table and consults the registry). "Flat" is a layout concept and "segment" a byte-range concept — the unit this class decodes is one serialized array message. VortexHandle's decodeFlatSegment follows as decodeSegment, next to rawSegment. Pitest target FQN and living docs updated; released changelog entries and ADR 0001 stay as written. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ReadRegistry and LayoutRegistry map keys become EncodingId/LayoutId: since ArrayNode and Layout carry parsed typed ids, string-keyed dispatch just round-tripped through the wire form. Strings at the boundary, types inside. TreeMap orders by wire string via comparator — the sealed ids are not Comparable, and a Custom key must not throw. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
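Why the comparator matters: a `TreeMap` without one casts its keys to `Comparable`, so a record-typed `Custom` key would throw `ClassCastException`. A sketch of ordering typed ids by their wire string follows; the trimmed `LayoutId` and the `RegistryKeys.newMap` helper are assumptions for illustration only.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Trimmed, hypothetical id type; the interface itself is deliberately not Comparable.
sealed interface LayoutId permits LayoutId.WellKnown, LayoutId.Custom {
    String wire();

    enum WellKnown implements LayoutId {
        FLAT("vortex.flat"),
        ZONED("vortex.zoned");

        private final String wire;

        WellKnown(String wire) {
            this.wire = wire;
        }

        @Override
        public String wire() {
            return wire;
        }
    }

    record Custom(String wire) implements LayoutId {}
}

final class RegistryKeys {
    private RegistryKeys() {}

    /** A sorted map keyed by typed id, ordered by the id's wire string. */
    static <V> TreeMap<LayoutId, V> newMap() {
        Comparator<LayoutId> byWire = Comparator.comparing(LayoutId::wire);
        return new TreeMap<>(byWire);
    }
}
```

Because the comparator defines key equality for the map, two ids with the same wire string collapse to one entry, so iteration order stays stable regardless of registration order.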
Relative links rewritten: docs pages point at ../adr/, ADR upward references drop one level, ADR links into docs/ gain the prefix. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
What
Follow-up to #193, completing the typed-id arc on the layout side:
- `7df3a0db`: `core.model.LayoutId`, a sealed interface with the same `WellKnown`/`Custom` shape as `EncodingId`. Layouts stay open: the Rust team's guidance ("New third-party implementation: vortex-java", vortex-data/vortex#8250) and the reference implementation both treat layouts as runtime-pluggable. `Layout`'s misnamed `String encodingId` component becomes `LayoutId layoutId`; unknown layouts keep failing loudly (Rust's default; there is no `allowUnknown` for layouts).
- `b08ace79`: `Layout` + `ZonedStatsSchema` move to the new `reader.layout` package, mirroring `reader.decode` and giving the eventual `LayoutDecoder` SPI a landing zone. `FlatSegmentDecoder` deliberately stays in the reader root (its only callers are there; moving it would force it back to `public`). Pitest `targetClasses` FQN updated.
- `7588aa31`: `UnknownArray.encodingId` becomes a typed `EncodingId` (the `String` was a fossil from the closed-enum era).

Compat fix (found by checking the Rust reference)
Rust renamed the zone-map layout id to `vortex.zoned`, keeping `vortex.stats` as a legacy alias. Our reader only knew `vortex.stats`, so files from current Rust writers would fail layout dispatch. Now both ids route through the zoned path (parse and zone-map pruning, pinned by a parameterized test over both aliases). The writer keeps emitting `vortex.stats`, which old and new Rust readers both accept; the integration oracle confirms byte-identical output.

Verification
- `./mvnw verify` green after every commit: all 15 modules, including the failsafe Rust-interop suite.
- `./mvnw javadoc:javadoc -pl core`: zero output.

🤖 Generated with Claude Code