Skip to content

[YSQL] Planner: recognize ROW(yb_hash_code(...), ...) as Index Cond on hash indexes#31035

Open
kirs wants to merge 6 commits intoyugabyte:masterfrom
kirs:kirs/row-compare-hash-code-pushdown
Open

[YSQL] Planner: recognize ROW(yb_hash_code(...), ...) as Index Cond on hash indexes#31035
kirs wants to merge 6 commits intoyugabyte:masterfrom
kirs:kirs/row-compare-hash-code-pushdown

Conversation

@kirs
Copy link
Copy Markdown

@kirs kirs commented Apr 9, 2026

Refs: #30524, #30651, @andrei-mart to review.

This teaches the planner to recognize ROW(yb_hash_code(hash_cols), ...) > (...) as a valid Index Cond on hash-sharded indexes.

Previously, all ROW comparisons involving hash columns were rejected by match_rowcompare_to_indexcol. This meant queries like:

SELECT * FROM orders
WHERE (yb_hash_code(user_uuid), user_uuid, uuid) > (10, '01961f1f-...', '01968344-...')
AND yb_hash_code(user_uuid) < 11
LIMIT 5000;

would place the ROW comparison as a post-scan Filter, scanning from the start of the index and discarding rows until the cursor position.

With this change, the ROW comparison appears as Index Cond, enabling better plan selection. Actual DocDB bound pushdown (seeking directly to the cursor position) will be enabled in a follow-up once the pggate EncodeRowKeyForBound changes land (WIP patch by @andrei-mart
(https://gist.github.com/andrei-mart/2755d1d02002040b5ba99f599f2f303f)).

Tests

New regression test yb.orig.yb_hash_code_row_compare added to yb_index_schedule covering:

  • Full key ROW with leading yb_hash_code (single and multi-column hash) -> Index Cond
  • Without leading yb_hash_code -> Seq Scan (negative)
  • Combined with yb_hash_code upper bound -> both as Index Cond
  • Arity mismatch -> Seq Scan (negative)
  • Less-than direction -> Index Cond
  • Execution correctness with pinned hash bucket

What this does NOT include

  • pggate EncodeRowKeyForBound changes -- @andrei-mart is handling this separately. His patch makes col_values[0] accept a pre-computed hash code.
  • ORDER BY yb_hash_code(h), h, r pathkeys support -- also being handled by @andrei-mart.
  • Lossy/partial prefix ROW truncation -- disabled for now to avoid executor complexity; the ROW must cover all its elements as a valid index key prefix.

Once the pggate patch lands, a follow-up will enable actual DocDB bound pushdown (seek instead of scan + recheck).

kirs added 6 commits April 8, 2026 14:40
…n hash indexes

Teach the planner to recognize ROW comparisons with a leading
yb_hash_code(hash_cols) element as valid index conditions on
hash-sharded indexes.  For example:

  (yb_hash_code(h), h, r) > (10, 'a', 1)

Previously, match_rowcompare_to_indexcol rejected all ROW comparisons
on hash indexes because hash columns don't support range comparisons
across buckets.  However, when yb_hash_code is the leading element,
the ROW comparison maps directly to the DocDB key encoding
[hash_code, hash_cols, range_cols] and can serve as a row bound.

Changes:

1. indxpath.c: Add yb_match_clause_to_index() which handles
   yb_hash_code OpExpr and RowCompareExpr matching before the
   per-column loop in match_clause_to_index().  Add
   yb_match_rowcompare_to_index() which validates that ROW elements
   after yb_hash_code form a prefix of the index key.  Add
   yb_hash_code_match_index() as a reusable helper.

2. yb_scan.c: Fix YbBindRowComparisonKeys to not crash when a
   yb_hash_code subkey (sk_attno = InvalidAttrNumber) is present.
   For now, disable pushdown for such ROW comparisons (recheck at
   Postgres level) until the pggate EncodeRowKeyForBound API supports
   pre-computed hash codes.

The planner change causes ROW(yb_hash_code, ...) to appear as
Index Cond in EXPLAIN rather than Filter, enabling better plan
selection.  Actual DocDB bound pushdown will be enabled in a
follow-up once the pggate layer is ready.

Refs: yugabyte#30524
Three issues when a yb_hash_code() element appears as a subkey in a
ROW comparison scan key (sk_attno = InvalidAttrNumber):

1. YbIsHashCodeSearch asserted sk_flags == YB_SK_SEARCHHASHCODE exactly,
   but ROW member subkeys also have SK_ROW_MEMBER. Relax the assert.

2. The attnum setup loop accessed indkey.values[sk_attno - 1] which is
   indkey.values[-1] for InvalidAttrNumber. Skip the mapping and set
   both target and bind attnums to InvalidAttrNumber.

3. ybcSetupScanKeys broke out of the loop on InvalidAttrNumber, which
   would skip processing any subsequent scan keys. Change to continue.
Require all ROW elements to match index key columns -- no partial
prefix truncation.  The lossy case caused executor crashes that need
more investigation.  Non-matching ROW comparisons fall back to Filter.
New test yb.orig.yb_hash_code_row_compare covering:
- Full key ROW with leading yb_hash_code (single and multi-column hash)
- Without leading yb_hash_code (negative)
- Combined with yb_hash_code upper bound
- Arity mismatch (negative)
- Skipped column (negative)
- Less-than direction
- Cross-bucket ordering

Expected output placeholder -- needs to be generated from a test run.
Generated from pg_regress run on a single-node debug cluster.
@kirs kirs marked this pull request as ready for review April 9, 2026 14:27
@kirs
Copy link
Copy Markdown
Author

kirs commented Apr 9, 2026

I've run the tests via:

build/debug-clang21-dynamic-ninja/postgres_build/src/test/regress/pg_regress --inputdir=src/postgres/src/test/regress --outputdir=/work/regress_out --host=127.0.0.1 --port=5433 --user=yugabyte --dbname=yugabyte yb.orig.yb_hash_code_row_compare 2>&1

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances YugabyteDB's query planner to better utilize hash indexes when yb_hash_code functions are used, particularly within ROW comparisons. It introduces logic to recognize yb_hash_code as an index condition and correctly handles its attributes during scan plan setup and key binding, preventing out-of-bounds access. New regression tests are added to validate this behavior. The review comments highlight a critical bug where the opfamilies list is missing in a constructed RowCompareExpr, which needs to be addressed to prevent execution failures. Additionally, it's suggested to populate indexcols for yb_hash_code clauses for improved consistency and metadata.

Comment on lines +5405 to +5434
List *new_ops = NIL;
List *new_largs = NIL;
List *new_rargs = NIL;
List *new_collids = NIL;
RowCompareExpr *rc;
RestrictInfo *new_rinfo;

for (i = 0; i < matching_cols; i++)
{
Oid new_opno;

new_opno = get_commutator(list_nth_oid(clause->opnos, i));
if (!OidIsValid(new_opno))
return NULL;

new_ops = lappend_oid(new_ops, new_opno);
/* Swap largs and rargs to put index keys on left. */
new_largs = lappend(new_largs,
list_nth(castNode(List, clause->rargs), i));
new_rargs = lappend(new_rargs,
list_nth(clause->largs, i));
new_collids = lappend_oid(new_collids,
list_nth_oid(clause->inputcollids, i));
}

rc = makeNode(RowCompareExpr);
rc->rctype = (RowCompareType) hc_strategy;
rc->opnos = new_ops;
rc->opfamilies = NIL;
rc->inputcollids = new_collids;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The newly constructed RowCompareExpr is missing the opfamilies list. Since this ROW comparison is intended to be rechecked at the Postgres level (as pushdown is currently disabled for these cases), the executor will need the opfamilies to identify the correct comparison functions. Without them, ExecInitRowCompare will fail to initialize properly during execution.

			List	   *new_ops = NIL;
			List	   *new_largs = NIL;
			List	   *new_rargs = NIL;
			List	   *new_collids = NIL;
			List	   *new_opfamilies = NIL;
			RowCompareExpr *rc;
			RestrictInfo *new_rinfo;

			for (i = 0; i < matching_cols; i++)
			{
				Oid			new_opno;
				Oid			opfamily;

				new_opno = get_commutator(list_nth_oid(clause->opnos, i));
				if (!OidIsValid(new_opno))
					return NULL;

				new_ops = lappend_oid(new_ops, new_opno);
				/* Swap largs and rargs to put index keys on left. */
				new_largs = lappend(new_largs,
									list_nth(castNode(List, clause->rargs), i));
				new_rargs = lappend(new_rargs,
									list_nth(clause->largs, i));
				new_collids = lappend_oid(new_collids,
										  list_nth_oid(clause->inputcollids, i));

				opfamily = (i == 0) ? INTEGER_LSM_FAM_OID : index->opfamily[i - 1];
				new_opfamilies = lappend_oid(new_opfamilies, opfamily);
			}

			rc = makeNode(RowCompareExpr);
			rc->rctype = (RowCompareType) hc_strategy;
			rc->opnos = new_ops;
			rc->opfamilies = new_opfamilies;
			rc->inputcollids = new_collids;

iclause->indexquals = list_make1(rinfo);
iclause->lossy = false;
iclause->indexcol = 0;
iclause->indexcols = NIL;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with the RowCompareExpr implementation below and to provide better metadata to the planner, indexcols should be populated with the virtual index column index (0) for the yb_hash_code clause.

			iclause->indexcols = list_make1_int(0);

iclause->lossy = false;
iclause->indexcol = 0;
iclause->indexcols = NIL;
return iclause;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For consistency with the RowCompareExpr implementation below and to provide better metadata to the planner, indexcols should be populated with the virtual index column index (0) for the yb_hash_code clause.

			iclause->indexcols = list_make1_int(0);

@andrei-mart andrei-mart self-requested a review April 9, 2026 17:02
@andrei-mart
Copy link
Copy Markdown
Contributor

Thanks for submitting this. I've taken a brief look.
I'm off until Monday, I'll give you detailed feedback when I'll be back.

Copy link
Copy Markdown
Contributor

@andrei-mart andrei-mart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have couple pending changes so far on both sides, both with some limitations we need to discuss to integrate.

On execution side we expect efficient bounds set by YBCPgDmlAddRowLowerBound/ YBCPgDmlAddRowUpperBound while the Row condition matches the index prefix. Current implementation allows missing columns, and the pending change maintains that. Though the columns after the missing one won't participate in the bound, so it is recommended not to include them. When missing values are not allowed, it will be possible to use is_inclusive to implement non-lossy strict and non-strict inequalities. For now, the bounds are lossy, unless all the key values are provided.

On the planning side I've discovered that backward scan on the hash indexes does not work properly. So the pending change does not allow ORDER BY yb_hash_code(h) DESC, as well as WHERE yb_hash_code(h) = const ORDER BY h DESC. Not sure if that limitation affects the Row conditions case.

* sk_attno = InvalidAttrNumber. Skip the normal attnum
* mapping to avoid an out-of-bounds access on indkey.
*/
if (key->sk_attno == InvalidAttrNumber)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, the condition key->sk_attno == InvalidAttrNumber indicates that the scan key is on an expression, not on an attribute. Am I mistaken? The comment implies the case is only relevant to yb_hash_code subkey inside a ROW comparison, is it true, no other possibilities of InvalidAttrNumber in key->sk_attno?

Also, there's a special place for the yb_hash_code expressions: ybScan->hash_code_keys. Any pros and cons of ROW's yb_hash_codes not being there?

* except SK_ROW_MEMBER which is set on yb_hash_code subkeys inside
* a RowCompareExpr.
*/
Assert(!is_hash_search ||
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to delete this assertion.
I'm not sure if there is any problem if YB_SK_SEARCHHASHCODE is combined with other flags.

@@ -1601,8 +1619,15 @@ YbBindRowComparisonKeys(YbScanDesc ybScan, YbScanPlan scan_plan,
* We can only push down right now if the key columns are specified in the
* correct order and the key has no hashed columns. We also need to ensure
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that with yb_hash_code support you just delete and the key has no hashed columns. The rest of the statement stands, assuming the correct order means (hash_code, hash_keys, range_keys) for hash indexes and (range_keys) for range indexes.

* Skip the leading yb_hash_code subkey. It has sk_attno =
* InvalidAttrNumber and cannot be validated against rd_indoption.
*/
if (j == 0 && YbIsHashCodeSearch(key))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the index is hash, the first scan key has to be a hash code, or can't pushdown and break.
We probably can't have a hash code scan key if the index isn't hash.

}

/* Make sure that the specified keys are in the right order. */
if (key->sk_attno <= last_att_no)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not look right. if (key->sk_attno != last_att_no +1) then can't pushdown and break.
Btw, what "can_pushdown" is supposed to mean here?
We can make a bound if we have one or more scan keys. It is slightly more complicated if the bound is "lossy".

bool var_on_left;

/* Only btree/lsm indexes support row comparisons. */
if (index->relam != BTREE_AM_OID && index->relam != LSM_AM_OID)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTREE is a Postgres index format, still possible when dealing with temp tables.
If we want to handle only the hash case here, the condition is redundant.
The next one index->nhashcolumns == 0 is false iff the index is an LSM and a hash.

return NULL;

/* Need at least yb_hash_code + one column. */
if (list_length(clause->largs) < 2)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always true? RowCompareExpr would always have at least two.

case BTGreaterStrategyNumber:
break;
default:
return NULL;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it the same as

if (hc_strategy == BTEqualStrategyNumber)
    return NULL;

?

* Walk the remaining ROW elements. Each must reference the next index
* key column in order (forming a prefix), with a compatible operator.
*/
matching_cols = 1;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the logic below is similar to match_rowcompare_to_indexcol and expand_indexqual_rowcompare.
It may make sense to refactor and make some subroutines to reuse between here and there.
I'd take some time to think about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants