Skip to content

Tighten canonical-repo match in queryTargetsDependingOnModules#338

Open
rdark wants to merge 2 commits into
Tinder:masterfrom
rdark:fix/query-modules-canonical-match
Open

Tighten canonical-repo match in queryTargetsDependingOnModules#338
rdark wants to merge 2 commits into
Tinder:masterfrom
rdark:fix/query-modules-canonical-match

Conversation

@rdark
Copy link
Copy Markdown
Contributor

@rdark rdark commented Apr 28, 2026

Partial fix for #335.

Two commits:

1. Restrict canonical-repo match in queryTargetsDependingOnModules

On MODULE.bazel-touching commits the impacted-targets path used contains(moduleName) against full target labels to find which canonical repos to feed into rdeps. A module named "cpp" pulled in every abseil-cpp+//... label; in workspaces with many bzlmod module extensions the unioned rdeps(//..., @@a//... + @@b//... + ...) query swept far more of the workspace than it should have.

Replace the substring scan with a name-prefix predicate against the parsed canonical repo (label.substring(2).substringBefore("//")): R belongs to M if R starts with "{M.name}+" or "{M.name}~". Bazel's bzlmod canonical names are deterministic from the module name (per the module() spec), and extension-created name++ext+repo / name~~ext~repo forms are subsumed by these prefixes.

Also fixes a pre-existing bug in the same function: computeSimpleImpactedTargets(emptyMap(), allTargets) returned allTargets.keys, silently unioning every target back into the rdeps result. Threads from through the call chain so the union uses the real hash diff

Rejects empty module.name in ModuleGraphParser.extractModules. Bazel's MODULE.bazel spec disallows empty names, but bazel mod graph --output=json emits name="" for the synthetic root entry of a MODULE.bazel that omits module(name=…). Filtering at the parser stops that case from reaching the canonical-repo predicate.

2. Make rdeps-failure fallback shape-aware for bzlmod-only workspaces

The rdeps-failure catch block filtered out every @@ label on the assumption that the hash set was mostly workspace-local //... targets. That assumption breaks on bzlmod-only workspaces where generate-hashes emits almost entirely @@canonical labels plus a small number of //external:<apparent> synthetic targets. A single failed rdeps query would leave the filter with only the synthetic targets, which excludeExternalTargets=true (#334) then strips entirely. Replaces the unconditional filter with a shape-aware fallback: emit the buildable workspace subset (//... minus //external:) when it exists, otherwise fall through to every hashed label.

Tests in CalculateImpactedTargetsInteractorModuleQueryTest cover: canonical + form, canonical ~ form, extension-created-repo subsume, the "cpp must not match abseil-cpp" regression, transitive-by-name-prefix, ghost-module skip, multiple-disjoint-modules, version-bump-as-add+remove dedup, both branches of the bzlmod-only/mixed-workspace fallback, and the executeWithDistances module-query path. Output assertions on the empty-rdeps cases guard against regression to the emptyMap() union bug. New ModuleGraphParserTest case covers the empty-name rejection.


Notable changes vs. the original description:

  • Drops Tier A/B framing and related test - removed Tier A entirely after concluding it was redundant with Tier B (Bazel canonicals are deterministic from module.name, so the Tier A repo-mapping lookup added nothing the name-prefix match didn't already cover).
  • Add parser-level filter

@rdark rdark force-pushed the fix/query-modules-canonical-match branch 2 times, most recently from e031d63 to 1f737d4 Compare April 28, 2026 10:03
@tinder-maxwellelliott
Copy link
Copy Markdown
Collaborator

@rdark There is a conflict now. Thanks again for this

@rdark rdark force-pushed the fix/query-modules-canonical-match branch from 1f737d4 to 0d4922c Compare May 14, 2026 08:48
rdark added 2 commits May 14, 2026 12:59
On MODULE.bazel-touching commits the impacted-targets path did
`contains(moduleName)` against full target labels to find canonical repos
to query. A module named "cpp" pulled in every `abseil-cpp+//...` label;
in workspaces with many bzlmod module extensions the resulting unioned
rdeps query swept far more of the workspace than it should have.

Replace the substring scan with a name-prefix predicate against the parsed
canonical repo (`label.substring(2).substringBefore("//")`): R belongs to
M if R starts with `"{M.name}+"` or `"{M.name}~"`. Bazel's bzlmod canonical
names are deterministic from the module name, and extension-created
`name++ext+repo` / `name~~ext~repo` forms are subsumed by these prefixes.

Also fix a pre-existing bug in the same function: the trailing
`computeSimpleImpactedTargets(emptyMap(), allTargets)` call returned
`allTargets.keys`, silently unioning every target back into the rdeps
result. Thread `from` through the call chain so the union uses the real
hash diff. Without this, the rdeps filtering would be cost-only — the
impacted set would still equal the full workspace.

Reject empty `module.name` in `ModuleGraphParser.extractModules`: Bazel's
MODULE.bazel spec disallows empty names, but `bazel mod graph
--output=json` emits `name=""` for the synthetic root entry of a
MODULE.bazel that omits `module(name=…)`. Filtering at the parser stops
that case from reaching the canonical-repo predicate.

Tests: new CalculateImpactedTargetsInteractorModuleQueryTest covers
canonical-plus, canonical-tilde, extension-created subsume, the
"cpp must not match abseil-cpp" regression, transitive-by-name-prefix,
ghost-module skip, multiple-disjoint-modules, and the version-bump-as-
add+remove dedup case. New parser test covers empty-name rejection.
The per-repo rdeps query-failure fallback filtered out every `@@` label
on the assumption that the hash set was mostly workspace-local `//...`
targets. That assumption does not hold on bzlmod-only workspaces where
`generate-hashes` emits almost entirely `@@canonical` labels plus a small
number of `//external:<apparent>` bzlmod-synthetic bridges, with zero
workspace-local targets.

On such workspaces, a single failed `rdeps(//..., @@<repo>//...)` query
left the filter with only the `//external:*` bridges, which the
downstream `excludeExternalTargets=true` default (Tinder#334) then strips
entirely — producing silently-empty impacted output on what should be a
"conservatively rebuild everything" signal.

Replace the unconditional `!startsWith("@@")` filter with a shape-aware
fallback: if the buildable workspace subset (`//...` minus `//external:`)
is non-empty, emit it; otherwise fall through to every hashed label so
the downstream filter has something to keep. The "buildable workspace"
predicate is extracted as `isBuildableWorkspaceTarget` and explicitly
distinguished from the user-controlled `excludeExternalTargets` filter.

Adds bzlmod-only and mixed-workspace regression tests covering both
branches of the fallback.
@rdark rdark force-pushed the fix/query-modules-canonical-match branch from 0d4922c to 9c3b424 Compare May 14, 2026 12:01
@rdark
Copy link
Copy Markdown
Contributor Author

rdark commented May 14, 2026

@tinder-maxwellelliott fixed up the conflicts and simplified this one a bit - thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants