Skip to content

fix: array traversal drops streaming indices from variable binding#2655

Open
lawrence3699 wants to merge 1 commit intomikefarah:masterfrom
lawrence3699:fix/streaming-index-traversal
Open

fix: array traversal drops streaming indices from variable binding#2655
lawrence3699 wants to merge 1 commit intomikefarah:masterfrom
lawrence3699:fix/streaming-index-traversal

Conversation

@lawrence3699
Copy link
Copy Markdown

Fixes #2593

Problem

When streaming values pipe into an array traversal with a bound variable, only the first index is used. For example:

$ echo '["a","b"]' | yq '. as $o | keys[] | $o[.]'
a
# expected: a, b

The workaround of adding an intermediate variable binding (. as $k |) works, but shouldn't be necessary:

$ echo '["a","b"]' | yq '. as $o | keys[] | . as $k | $o[.]'
a
b

Root cause

In traverseArrayOperator, the RHS collect expression is evaluated with the full streaming context. When EvaluateTogether is false (the default for streaming values), the collect operator creates a separate SequenceNode per streaming value. Line 126 then only takes the first SequenceNode's content, discarding the indices from all other streaming values.

Fix

When the RHS produces more SequenceNodes than the LHS has matching nodes (i.e., the LHS is a single bound variable but the index varies per streaming value), traverse each set of indices separately against the full LHS.

Before/After

# Before
$ echo '["a","b"]' | yq '. as $o | keys[] | $o[.]'
a

# After
$ echo '["a","b"]' | yq '. as $o | keys[] | $o[.]'
a
b

Works for both arrays and maps:

$ echo '{"x":"a","y":"b"}' | yq '. as $o | keys[] | $o[.]'
a
b

Testing

Added three regression tests covering arrays, maps, and longer sequences. Full test suite passes:

ok  github.com/mikefarah/yq/v4          0.353s
ok  github.com/mikefarah/yq/v4/cmd      0.447s
ok  github.com/mikefarah/yq/v4/pkg/yqlib 0.491s
ok  github.com/mikefarah/yq/v4/test     0.468s

When streaming values (e.g. from keys[]) pipe into an array
traversal like $o[.], only the first streaming index was used.

The RHS collect expression produces a separate SequenceNode per
streaming value when EvaluateTogether is false. The existing code
at line 126 only took the first SequenceNode, discarding the rest.

When the LHS is a single bound variable and the RHS has multiple
index sets, traverse each set separately.

Fixes mikefarah#2593
Copilot AI review requested due to automatic review settings April 9, 2026 09:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a streaming/traversal mismatch in traverseArrayOperator where array/map indexing via a bound variable (e.g. $o[.]) would only use the first streamed index set, dropping subsequent indices from the stream.

Changes:

  • Update traverseArrayOperator to traverse the LHS against each per-stream RHS index set when the RHS yields multiple collected sequences.
  • Add regression tests covering arrays, maps, and longer sequences for the $o[.] streaming-keys case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
pkg/yqlib/operator_traverse_path.go Adjusts traversal logic to handle multiple RHS “collected index” sequences without dropping streamed indices.
pkg/yqlib/operator_traverse_path_test.go Adds regression scenarios ensuring streaming keys[] indices correctly traverse bound variables for arrays/maps.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +127 to +133
// When streaming values produce per-value index sets in the RHS and the LHS
// is a single node (e.g. a bound variable), traverse the LHS with each
// index set separately. Without this, only the first set of indices is used
// and the rest are silently dropped.
if rhs.MatchingNodes.Len() > 1 && lhs.MatchingNodes.Len() < rhs.MatchingNodes.Len() {
var allResults = list.New()
for el := rhs.MatchingNodes.Front(); el != nil; el = el.Next() {
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new branch is intended to handle the case where the LHS is a single node (e.g. a bound variable), but the condition lhs.MatchingNodes.Len() < rhs.MatchingNodes.Len() can also be true when the LHS has multiple matches (e.g. 2 LHS nodes, 3 RHS sequences). In that case this code traverses each RHS index set against all LHS nodes, which can introduce an unintended cross-product of results. Consider tightening the condition to explicitly guard on lhs.MatchingNodes.Len() == 1 (or <= 1) so the multi-RHS traversal only applies to the intended bound-variable scenario.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extracting object values by iterating via keys causes missing items

2 participants