[YSQL] Planner: recognize ROW(yb_hash_code(...), ...) as Index Cond on hash indexes#31035
[YSQL] Planner: recognize ROW(yb_hash_code(...), ...) as Index Cond on hash indexes#31035kirs wants to merge 6 commits intoyugabyte:masterfrom
Conversation
…n hash indexes Teach the planner to recognize ROW comparisons with a leading yb_hash_code(hash_cols) element as valid index conditions on hash-sharded indexes. For example: (yb_hash_code(h), h, r) > (10, 'a', 1) Previously, match_rowcompare_to_indexcol rejected all ROW comparisons on hash indexes because hash columns don't support range comparisons across buckets. However, when yb_hash_code is the leading element, the ROW comparison maps directly to the DocDB key encoding [hash_code, hash_cols, range_cols] and can serve as a row bound. Changes: 1. indxpath.c: Add yb_match_clause_to_index() which handles yb_hash_code OpExpr and RowCompareExpr matching before the per-column loop in match_clause_to_index(). Add yb_match_rowcompare_to_index() which validates that ROW elements after yb_hash_code form a prefix of the index key. Add yb_hash_code_match_index() as a reusable helper. 2. yb_scan.c: Fix YbBindRowComparisonKeys to not crash when a yb_hash_code subkey (sk_attno = InvalidAttrNumber) is present. For now, disable pushdown for such ROW comparisons (recheck at Postgres level) until the pggate EncodeRowKeyForBound API supports pre-computed hash codes. The planner change causes ROW(yb_hash_code, ...) to appear as Index Cond in EXPLAIN rather than Filter, enabling better plan selection. Actual DocDB bound pushdown will be enabled in a follow-up once the pggate layer is ready. Refs: yugabyte#30524
Three issues when a yb_hash_code() element appears as a subkey in a ROW comparison scan key (sk_attno = InvalidAttrNumber): 1. YbIsHashCodeSearch asserted sk_flags == YB_SK_SEARCHHASHCODE exactly, but ROW member subkeys also have SK_ROW_MEMBER. Relax the assert. 2. The attnum setup loop accessed indkey.values[sk_attno - 1] which is indkey.values[-1] for InvalidAttrNumber. Skip the mapping and set both target and bind attnums to InvalidAttrNumber. 3. ybcSetupScanKeys broke out of the loop on InvalidAttrNumber, which would skip processing any subsequent scan keys. Change to continue.
Require all ROW elements to match index key columns -- no partial prefix truncation. The lossy case caused executor crashes that need more investigation. Non-matching ROW comparisons fall back to Filter.
New test yb.orig.yb_hash_code_row_compare covering: - Full key ROW with leading yb_hash_code (single and multi-column hash) - Without leading yb_hash_code (negative) - Combined with yb_hash_code upper bound - Arity mismatch (negative) - Skipped column (negative) - Less-than direction - Cross-bucket ordering Expected output placeholder -- needs to be generated from a test run.
Generated from pg_regress run on a single-node debug cluster.
|
I've run the tests via: |
There was a problem hiding this comment.
Code Review
This pull request enhances YugabyteDB's query planner to better utilize hash indexes when yb_hash_code functions are used, particularly within ROW comparisons. It introduces logic to recognize yb_hash_code as an index condition and correctly handles its attributes during scan plan setup and key binding, preventing out-of-bounds access. New regression tests are added to validate this behavior. The review comments highlight a critical bug where the opfamilies list is missing in a constructed RowCompareExpr, which needs to be addressed to prevent execution failures. Additionally, it's suggested to populate indexcols for yb_hash_code clauses for improved consistency and metadata.
| List *new_ops = NIL; | ||
| List *new_largs = NIL; | ||
| List *new_rargs = NIL; | ||
| List *new_collids = NIL; | ||
| RowCompareExpr *rc; | ||
| RestrictInfo *new_rinfo; | ||
|
|
||
| for (i = 0; i < matching_cols; i++) | ||
| { | ||
| Oid new_opno; | ||
|
|
||
| new_opno = get_commutator(list_nth_oid(clause->opnos, i)); | ||
| if (!OidIsValid(new_opno)) | ||
| return NULL; | ||
|
|
||
| new_ops = lappend_oid(new_ops, new_opno); | ||
| /* Swap largs and rargs to put index keys on left. */ | ||
| new_largs = lappend(new_largs, | ||
| list_nth(castNode(List, clause->rargs), i)); | ||
| new_rargs = lappend(new_rargs, | ||
| list_nth(clause->largs, i)); | ||
| new_collids = lappend_oid(new_collids, | ||
| list_nth_oid(clause->inputcollids, i)); | ||
| } | ||
|
|
||
| rc = makeNode(RowCompareExpr); | ||
| rc->rctype = (RowCompareType) hc_strategy; | ||
| rc->opnos = new_ops; | ||
| rc->opfamilies = NIL; | ||
| rc->inputcollids = new_collids; |
There was a problem hiding this comment.
The newly constructed RowCompareExpr is missing the opfamilies list. Since this ROW comparison is intended to be rechecked at the Postgres level (as pushdown is currently disabled for these cases), the executor will need the opfamilies to identify the correct comparison functions. Without them, ExecInitRowCompare will fail to initialize properly during execution.
List *new_ops = NIL;
List *new_largs = NIL;
List *new_rargs = NIL;
List *new_collids = NIL;
List *new_opfamilies = NIL;
RowCompareExpr *rc;
RestrictInfo *new_rinfo;
for (i = 0; i < matching_cols; i++)
{
Oid new_opno;
Oid opfamily;
new_opno = get_commutator(list_nth_oid(clause->opnos, i));
if (!OidIsValid(new_opno))
return NULL;
new_ops = lappend_oid(new_ops, new_opno);
/* Swap largs and rargs to put index keys on left. */
new_largs = lappend(new_largs,
list_nth(castNode(List, clause->rargs), i));
new_rargs = lappend(new_rargs,
list_nth(clause->largs, i));
new_collids = lappend_oid(new_collids,
list_nth_oid(clause->inputcollids, i));
opfamily = (i == 0) ? INTEGER_LSM_FAM_OID : index->opfamily[i - 1];
new_opfamilies = lappend_oid(new_opfamilies, opfamily);
}
rc = makeNode(RowCompareExpr);
rc->rctype = (RowCompareType) hc_strategy;
rc->opnos = new_ops;
rc->opfamilies = new_opfamilies;
rc->inputcollids = new_collids;| iclause->indexquals = list_make1(rinfo); | ||
| iclause->lossy = false; | ||
| iclause->indexcol = 0; | ||
| iclause->indexcols = NIL; |
| iclause->lossy = false; | ||
| iclause->indexcol = 0; | ||
| iclause->indexcols = NIL; | ||
| return iclause; |
|
Thanks for submitting this. I've taken a brief look. |
andrei-mart
left a comment
There was a problem hiding this comment.
I do have couple pending changes so far on both sides, both with some limitations we need to discuss to integrate.
On execution side we expect efficient bounds set by YBCPgDmlAddRowLowerBound/ YBCPgDmlAddRowUpperBound while the Row condition matches the index prefix. Current implementation allows missing columns, and the pending change maintains that. Though the columns after the missing one won't participate in the bound, so it is recommended not to include them. When missing values are not allowed, it will be possible to use is_inclusive to implement non-lossy strict and non-strict inequalities. For now, the bounds are lossy, unless all the key values are provided.
On the planning side I've discovered that backward scan on the hash indexes does not work properly. So the pending change does not allow ORDER BY yb_hash_code(h) DESC, as well as WHERE yb_hash_code(h) = const ORDER BY h DESC. Not sure if that limitation affects the Row conditions case.
| * sk_attno = InvalidAttrNumber. Skip the normal attnum | ||
| * mapping to avoid an out-of-bounds access on indkey. | ||
| */ | ||
| if (key->sk_attno == InvalidAttrNumber) |
There was a problem hiding this comment.
If I remember correctly, the condition key->sk_attno == InvalidAttrNumber indicates that the scan key is on an expression, not on an attribute. Am I mistaken? The comment implies the case is only relevant to yb_hash_code subkey inside a ROW comparison, is it true, no other possibilities of InvalidAttrNumber in key->sk_attno?
Also, there's a special place for the yb_hash_code expressions: ybScan->hash_code_keys. Any pros and cons of ROW's yb_hash_codes not being there?
| * except SK_ROW_MEMBER which is set on yb_hash_code subkeys inside | ||
| * a RowCompareExpr. | ||
| */ | ||
| Assert(!is_hash_search || |
There was a problem hiding this comment.
I'd suggest to delete this assertion.
I'm not sure if there is any problem if YB_SK_SEARCHHASHCODE is combined with other flags.
| @@ -1601,8 +1619,15 @@ YbBindRowComparisonKeys(YbScanDesc ybScan, YbScanPlan scan_plan, | |||
| * We can only push down right now if the key columns are specified in the | |||
| * correct order and the key has no hashed columns. We also need to ensure | |||
There was a problem hiding this comment.
My understanding is that with yb_hash_code support you just delete and the key has no hashed columns. The rest of the statement stands, assuming the correct order means (hash_code, hash_keys, range_keys) for hash indexes and (range_keys) for range indexes.
| * Skip the leading yb_hash_code subkey. It has sk_attno = | ||
| * InvalidAttrNumber and cannot be validated against rd_indoption. | ||
| */ | ||
| if (j == 0 && YbIsHashCodeSearch(key)) |
There was a problem hiding this comment.
If the index is hash, the first scan key has to be a hash code, or can't pushdown and break.
We probably can't have a hash code scan key if the index isn't hash.
| } | ||
|
|
||
| /* Make sure that the specified keys are in the right order. */ | ||
| if (key->sk_attno <= last_att_no) |
There was a problem hiding this comment.
That does not look right. if (key->sk_attno != last_att_no +1) then can't pushdown and break.
Btw, what "can_pushdown" is supposed to mean here?
We can make a bound if we have one or more scan keys. It is slightly more complicated if the bound is "lossy".
| bool var_on_left; | ||
|
|
||
| /* Only btree/lsm indexes support row comparisons. */ | ||
| if (index->relam != BTREE_AM_OID && index->relam != LSM_AM_OID) |
There was a problem hiding this comment.
BTREE is a Postgres index format, still possible when dealing with temp tables.
If we want to handle only the hash case here, the condition is redundant.
The next one index->nhashcolumns == 0 is false iff the index is an LSM and a hash.
| return NULL; | ||
|
|
||
| /* Need at least yb_hash_code + one column. */ | ||
| if (list_length(clause->largs) < 2) |
There was a problem hiding this comment.
Always true? RowCompareExpr would always have at least two.
| case BTGreaterStrategyNumber: | ||
| break; | ||
| default: | ||
| return NULL; |
There was a problem hiding this comment.
Isn't it the same as
if (hc_strategy == BTEqualStrategyNumber)
return NULL;
?
| * Walk the remaining ROW elements. Each must reference the next index | ||
| * key column in order (forming a prefix), with a compatible operator. | ||
| */ | ||
| matching_cols = 1; |
There was a problem hiding this comment.
It looks like the logic below is similar to match_rowcompare_to_indexcol and expand_indexqual_rowcompare.
It may make sense to refactor and make some subroutines to reuse between here and there.
I'd take some time to think about this.
Refs: #30524, #30651, @andrei-mart to review.
This teaches the planner to recognize
ROW(yb_hash_code(hash_cols), ...) > (...)as a valid Index Cond on hash-sharded indexes.Previously, all ROW comparisons involving hash columns were rejected by
match_rowcompare_to_indexcol. This meant queries like:would place the ROW comparison as a post-scan Filter, scanning from the start of the index and discarding rows until the cursor position.
With this change, the ROW comparison appears as Index Cond, enabling better plan selection. Actual DocDB bound pushdown (seeking directly to the cursor position) will be enabled in a follow-up once the pggate EncodeRowKeyForBound changes land (WIP patch by @andrei-mart
(https://gist.github.com/andrei-mart/2755d1d02002040b5ba99f599f2f303f)).
Tests
New regression test yb.orig.yb_hash_code_row_compare added to yb_index_schedule covering:
What this does NOT include
Once the pggate patch lands, a follow-up will enable actual DocDB bound pushdown (seek instead of scan + recheck).