perf: VLE terminal-qual rewrite#2420
Conversation
There was a problem hiding this comment.
Pull request overview
This PR rewrites how Cypher variable-length edge (VLE) terminal qualifications are expressed in generated SQL plans, shifting from per-row C qual functions to planner-friendly equality join clauses by exposing VLE endpoints as scalar SRF output columns.
Changes:
age_vle(...)now returns a composite row(edges, start_id, end_id)to expose path endpoints without per-row qual function calls.- Cypher transformer now emits terminal-edge and two-VLE-edge join conditions as plain
graphidequality expressions instead of callingage_match_vle_terminal_edge/age_match_two_vle_edges. - Removes fresh-install SQL declarations for the old qual functions, adds upgrade-time drops, and keeps C symbols as error-raising stubs for upgrade-test symbol resolution.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/include/parser/cypher_transform_entity.h |
Adds vle_alias to track SRF alias for emitting endpoint ColumnRefs. |
src/backend/parser/cypher_transform_entity.c |
Initializes new vle_alias field. |
src/backend/parser/cypher_clause.c |
Replaces VLE qual function calls with equality A_Exprs referencing SRF start_id/end_id. |
src/backend/utils/adt/age_vle.c |
Adds endpoint cache to VLE container header; changes SRF to emit composite tuples; stubs removed qual functions. |
sql/agtype_typecast.sql |
Updates age_vle SQL signature to SETOF record with (edges,start_id,end_id) OUT params; removes old qual function declarations. |
age--1.7.0--y.y.y.sql |
Upgrade script drops old qual functions and recreates updated age_vle signatures. |
regress/expected/*.out |
Updates regression expected output ordering impacted by plan/row ordering changes. |
Comments suppressed due to low confidence (1)
src/backend/utils/adt/age_vle.c:1456
container_size_bytesis computed from anint64 path_sizebut stored in anintand used forpalloc0()/SET_VARSIZE(). For large paths this can overflow/truncate and lead to under-allocation + memory corruption when fillinggraphid_array. Please useSize(orint64) and Postgres’ checked size arithmetic helpers (e.g.,mul_size/add_size) before allocating.
VLE_path_container *vpc = NULL;
int container_size_bytes = 0;
/*
* For the total container size (in graphids int64s) we need to add the
* following space (in graphids) to hold each of the following fields -
*
* One for the VARHDRSZ which is a int32 and a pad of 32.
* One for both the header and graph oid (they are both 32 bits).
* One for the size of the graphid_array_size.
* One for the container_size_bytes.
* One for start_vid (Stage 1: inline endpoint cache).
* One for end_vid (Stage 1: inline endpoint cache).
*
*/
container_size_bytes = sizeof(graphid) * (path_size + 6);
/* allocate the container */
vpc = palloc0(container_size_bytes);
/* initialize the PG headers */
SET_VARSIZE(vpc, container_size_bytes);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Assert(entity->vle_alias != NULL); | ||
|
|
||
| cr_start = makeNode(ColumnRef); | ||
| cr_start->fields = list_make2(makeString(entity->vle_alias), | ||
| makeString("start_id")); | ||
| cr_start->location = -1; | ||
|
|
||
| cr_end = makeNode(ColumnRef); | ||
| cr_end->fields = list_make2(makeString(entity->vle_alias), | ||
| makeString("end_id")); | ||
| cr_end->location = -1; |
There was a problem hiding this comment.
entity->vle_alias is only protected by Assert(). In production builds where asserts are disabled, a missing alias would lead to NULL being passed into makeString() and likely a crash while building the ColumnRef. Please replace the Assert with a runtime check (ereport(ERROR, ...) with a useful message) before constructing cr_start/cr_end.
e819654 to
7636873
Compare
7636873 to
7a457d2
Compare
perf: VLE terminal-qual rewrite — emit endpoint equalities instead of SRF
qual functions.
Removes the per-row age_match_vle_terminal_edge and age_match_two_vle_edges
qual functions from VLE query plans. The cypher transformer now emits the
endpoint match as a plain graphid/int8 equality on new SRF output columns,
evaluated by the planner like any other join clause — no detoasting, no
per-row C function dispatch. Stages land as one commit:
S1 Inline start_vid/end_vid in VLE_path_container header
S2 Read VLE qual endpoints from header-only TOAST slice
S4 Emit start_id/end_id as scalar SRF output columns
(age_vle now RETURNS SETOF record with edges/start_id/end_id)
S5 Cypher transformer rewrites terminal-edge match quals as
integer equalities (drops age_match_vle_terminal_edge call)
S6 Cypher transformer emits graphid equality for two-VLE-edge
joins (drops age_match_two_vle_edges call)
Performance (SF3 LDBC SNB, 5 runs/3 warmup, vs clean master baseline_v2):
IC sum 198,958 → 109,322 ms −45.05 % (1.82× end-to-end speedup)
IC1 8,625 → 4,600 ms −46.67 %
IC3 21,239 → 9,784 ms −53.93 %
IC5 21,051 → 5,696 ms −72.94 %
IC6 15,916 → 4,447 ms −72.06 %
IC9 44,839 → 21,161 ms −52.81 %
IC10 13,104 → 2,432 ms −81.44 %
IC11 11,676 → 241 ms −97.93 % (48× speedup)
IC2/4/7/8/12: parity (within ±3.3 %; IC4 is −2.47 %, no regression)
IS sum: 1,009 → 1,004 ms −0.51 % (no VLE traffic)
IU sum: 77 → 71 ms −8.38 % (IU1 −16.09 %; incidental)
Memory: header-only TOAST slice for VLE qual evaluation avoids
detoasting full path containers on every row; reduces per-call
palloc/pfree churn in long DFS paths. No measured RSS change.
Dead-code removal:
- Bodies of age_match_vle_terminal_edge and age_match_two_vle_edges
are gone from age_vle.c (~225 lines). C entry points remain as
error-raising stubs solely so the upgrade-test snapshot loader
(which sources an older 1.7.0_initial SQL against the current
age.so) can resolve the symbols before the immediate ALTER
EXTENSION UPDATE drops them. No regress test references either
function.
- SQL CREATE FUNCTION declarations removed from fresh install
(sql/agtype_typecast.sql).
- DROP FUNCTION IF EXISTS for both qual functions added to the
upgrade script (age--1.7.0--y.y.y.sql).
API change: ag_catalog.age_vle(...) now RETURNS SETOF record with
output columns (edges agtype, start_id graphid, end_id graphid)
instead of RETURNS SETOF agtype. Both 7-arg and 8-arg overloads
are updated in fresh-install (sql/agtype_typecast.sql) and upgrade
(age--1.7.0--y.y.y.sql) paths. age_match_vle_terminal_edge and
age_match_two_vle_edges are dropped on upgrade and absent from
fresh installs. Internal AGE callers are unaffected; external SQL
that called any of these directly must adapt.
Hardening (in response to PR apache#2420 review feedback):
- cypher_clause.c: terminal-edge and two-VLE-edge join-qual
emission paths now bracket the existing Assert(vle_alias != NULL)
with a runtime ereport(ERROR, ...) so a missing alias produces a
clean error in production builds (where Asserts compile out)
instead of a NULL-deref crash inside makeString().
- age_vle.c: VLE_path_container struct comment now documents that
the layout is transient (consumed within the producing query and
never persisted), so the new start_vid/end_vid fields do not
require an AGT_FBINARY_TYPE_VLE_PATH version bump or backward-
compatible reader. The note also flags the constraint a future
change would need to honor if the container ever became
persistable.
Tested on PostgreSQL 18.3 (REL_18_STABLE): all 34 regression tests
pass (installcheck), warning-free build.
modified: age--1.7.0--y.y.y.sql
modified: regress/expected/cypher_match.out
modified: regress/expected/cypher_vle.out
modified: regress/expected/expr.out
modified: sql/agtype_typecast.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_transform_entity.c
modified: src/backend/utils/adt/age_vle.c
modified: src/include/parser/cypher_transform_entity.h
7a457d2 to
c371793
Compare
|
@MuhammadTahaNaveed Rebased and addressed copilots concerns. |
perf: VLE terminal-qual rewrite — emit endpoint equalities instead of SRF qual functions.
Removes the per-row age_match_vle_terminal_edge and age_match_two_vle_edges qual functions from VLE query plans. The cypher transformer now emits the endpoint match as a plain graphid/int8 equality on new SRF output columns, evaluated by the planner like any other join clause — no detoasting, no per-row C function dispatch. Stages land as one commit:
S1 Inline start_vid/end_vid in VLE_path_container header
S2 Read VLE qual endpoints from header-only TOAST slice
S4 Emit start_id/end_id as scalar SRF output columns
(age_vle now RETURNS SETOF record with edges/start_id/end_id)
S5 Cypher transformer rewrites terminal-edge match quals as
integer equalities (drops age_match_vle_terminal_edge call)
S6 Cypher transformer emits graphid equality for two-VLE-edge
joins (drops age_match_two_vle_edges call)
Performance (SF3 LDBC SNB, 5 runs/3 warmup, vs clean master baseline_v2):
IC sum 198,958 → 109,322 ms −45.05 % (1.82× end-to-end speedup)
IC1 8,625 → 4,600 ms −46.67 %
IC3 21,239 → 9,784 ms −53.93 %
IC5 21,051 → 5,696 ms −72.94 %
IC6 15,916 → 4,447 ms −72.06 %
IC9 44,839 → 21,161 ms −52.81 %
IC10 13,104 → 2,432 ms −81.44 %
IC11 11,676 → 241 ms −97.93 % (48× speedup)
IC2/4/7/8/12: parity (within ±3.3 %; IC4 is −2.47 %, no regression)
IS sum: 1,009 → 1,004 ms −0.51 % (no VLE traffic)
IU sum: 77 → 71 ms −8.38 % (IU1 −16.09 %; incidental)
Memory: header-only TOAST slice for VLE qual evaluation avoids detoasting full path containers on every row; reduces per-call palloc/pfree churn in long DFS paths. No measured RSS change.
Dead-code removal:
API change: ag_catalog.age_vle(...) now RETURNS SETOF record with output columns (edges agtype, start_id graphid, end_id graphid) instead of RETURNS SETOF agtype. Both 7-arg and 8-arg overloads are updated in fresh-install (sql/agtype_typecast.sql) and upgrade (age--1.7.0--y.y.y.sql) paths. age_match_vle_terminal_edge and age_match_two_vle_edges are dropped on upgrade and absent from fresh installs. Internal AGE callers are unaffected; external SQL that called any of these directly must adapt.
Tested on PostgreSQL 18.3 (REL_18_STABLE): all 34 regression tests pass (installcheck), warning-free build.
Co-authored-by: Claude noreply@anthropic.com
modified: age--1.7.0--y.y.y.sql
modified: regress/expected/cypher_match.out
modified: regress/expected/cypher_vle.out
modified: regress/expected/expr.out
modified: sql/agtype_typecast.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_transform_entity.c
modified: src/backend/utils/adt/age_vle.c
modified: src/include/parser/cypher_transform_entity.h