Use variant_combinations for query filtering and drop GIN index#3698
Use variant_combinations for query filtering and drop GIN index#3698mstaeble wants to merge 2 commits into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mstaeble The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
bd6fdd6 to
5b49d5a
Compare
Replace array containment checks on large tables with subqueries against the small variant_combinations table (2K rows), then filter by integer variant_combination_id. Benchmarked on staging: - TestReportExcludeVariants: 1,167ms → 55ms (21x faster) - TestsByNURPAndStandardDeviation: 505ms → 223ms (2.3x faster) - PlatformInfraSuccess: 508ms → 346ms (1.5x faster) - Collapsed matview filter: 1,158ms → 500ms (2.3x faster) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All variant filtering now resolves through the variant_combinations table via variant_combination_id, making the GIN index on the variants TEXT[] column unused. Dropping it eliminates write overhead on prow_jobs inserts and updates. TestOutputs and TestDurations queries updated to use variant_combination_id IN (subquery) instead of array containment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5b49d5a to
5d890c9
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
WalkthroughVariant filtering now resolves through ChangesVariant combination filtering
Sequence Diagram(s)sequenceDiagram
participant GetTestAnalysisOverallFromDB
participant variant_combinations
participant prow_jobs
GetTestAnalysisOverallFromDB->>variant_combinations: resolve requested variants to ids
variant_combinations-->>GetTestAnalysisOverallFromDB: matching variant_combination_id values
GetTestAnalysisOverallFromDB->>prow_jobs: filter rows by variant_combination_id IN / NOT IN
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (18 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/db/query/misc_queries.go (1)
36-50: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd inline comments for the new CTE query sections.
target_variantschanges how platform matching works by resolving platform names tovariant_combination_id; document that intent in the SQL. As per path instructions, “BigQuery and SQL query-building code should have inline comments explaining the purpose of each major query section (CTEs, JOINs, window functions, WHERE clauses).”Suggested comment
WITH target_variants AS ( + -- Resolve requested platform variants to combination IDs used by the materialized view. SELECT vc.id, v.variant FROM variant_combinations vc, unnest(vc.variants) AS v(variant) WHERE v.variant IN `@platforms` )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/db/query/misc_queries.go` around lines 36 - 50, The new CTE-based SQL in the query-building code needs inline comments explaining each major section, especially the target_variants CTE, the JOIN to variant_combinations, and the filtering/grouping logic. Update the Raw query in the misc query function that builds the pass-percentage report so the intent of resolving platform names to variant_combination_id is documented directly in the SQL, with brief comments for the CTE and join sections.Source: Path instructions
pkg/db/query/test_queries.go (1)
116-134: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDocument the new exclusion CTEs.
The new
excluded_vcCTEs drive the variant exclusion behavior, but the raw SQL does not explain that section. Add a short inline SQL comment so future query changes preserve the intendedvariant_combination_idsemantics. As per path instructions, “BigQuery and SQL query-building code should have inline comments explaining the purpose of each major query section (CTEs, JOINs, window functions, WHERE clauses).”Suggested comment
WITH excluded_vc AS ( + -- Resolve excluded variant names to combination IDs so rows can be filtered by the lookup key. SELECT id FROM variant_combinations WHERE `@excluded` && variants ),Also applies to: 175-193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/db/query/test_queries.go` around lines 116 - 134, The raw SQL in the query-building block lacks an explanation for the new excluded-vc CTE and the variant_combination_id filter. Add a short inline SQL comment in the query construction around excluded_vc and the NOT IN clause to document that this CTE defines excluded variant combinations and preserves the intended exclusion semantics; apply the same comment pattern in the matching query section referenced by the test helper functions.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/db/query/misc_queries.go`:
- Around line 36-50: The new CTE-based SQL in the query-building code needs
inline comments explaining each major section, especially the target_variants
CTE, the JOIN to variant_combinations, and the filtering/grouping logic. Update
the Raw query in the misc query function that builds the pass-percentage report
so the intent of resolving platform names to variant_combination_id is
documented directly in the SQL, with brief comments for the CTE and join
sections.
In `@pkg/db/query/test_queries.go`:
- Around line 116-134: The raw SQL in the query-building block lacks an
explanation for the new excluded-vc CTE and the variant_combination_id filter.
Add a short inline SQL comment in the query construction around excluded_vc and
the NOT IN clause to document that this CTE defines excluded variant
combinations and preserves the intended exclusion semantics; apply the same
comment pattern in the matching query section referenced by the test helper
functions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: c1df9af7-64f2-427b-bd2a-99842745a7eb
📒 Files selected for processing (8)
pkg/api/test_analysis.gopkg/db/db.gopkg/db/migrations/000004_drop_variants_gin_index.down.sqlpkg/db/migrations/000004_drop_variants_gin_index.up.sqlpkg/db/models/prow.gopkg/db/query/misc_queries.gopkg/db/query/test_queries.gopkg/db/views.go
|
@coderabbitai resolve |
✅ Action performedComments resolved and changes approved. |
|
Scheduling required tests: |
|
@mstaeble: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
variant_combinationslookup table (2K rows) first, then filter by integervariant_combination_id, instead of running array containment checks on large tablesprow_jobs.variantssince all filtering now goes throughvariant_combination_idBenchmark results (staging)
@excluded && variants)= any(variants))Changed functions
TestsByNURPAndStandardDeviation--variant_combination_id NOT IN (subquery)TestReportsByVariant--excluded_vcCTE +NOT INTestReportExcludeVariants-- same patternPlatformInfraSuccess-- CTE unnests fromvariant_combinations, joins matview by integer IDbuildCollapsedMatViewSQL--variant_combination_id NOT IN (subquery)TestOutputs/TestDurations--variant_combination_id IN (subquery)instead of GIN-indexed array containmentGetTestAnalysisOverallFromDB/GetTestAnalysisByJobFromDB-- same patternNULL handling note
The old variant exclusion pattern
NOT(variants is not null and @excluded && variants)included rows with NULL variants (evaluating toNOT(false)= true). The newvariant_combination_id NOT IN (subquery)pattern excludes NULL rows becauseNULL NOT IN (...)evaluates to NULL (falsy). This is acceptable because:variant_combination_idcomes fromtest_daily_summaries, which gets it fromprow_jobsvia the daily summary upsert JOINvariant_combination_id = NULLonly whenprow_jobs.variants IS NULLvariant_combination_id-- the variant manager always assigns variants to jobsTest plan
go build ./...andgo test ./pkg/db/... ./pkg/api/...pass/api/tests(collapsed)TestsByNURPAndStandardDeviation+ collapsed matview/api/tests(variant filter)/api/tests/detailsTestReportsByVariant/api/tests/outputsTestOutputs/api/tests/durationsTestDurations/api/variantsVariantReports/api/installPlatformInfraSuccess/api/tests/analysis/overallGetTestAnalysisOverallFromDB/api/tests/analysis/jobsGetTestAnalysisByJobFromDB🤖 Generated with Claude Code
Summary by CodeRabbit