Move release metadata from BigQuery to PostgreSQL#3679
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mstaeble The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
eccf470 to
c3dca6d
Compare
Create a release_definitions table to store release metadata (GA dates, development start dates, previous release, capabilities, product, status) that was previously only available in BigQuery. This eliminates the BQ dependency for the /api/releases endpoint and removes the hardcoded releaseMetadata map from the PostgreSQL data provider, which required manual updates for each new release. Key changes: - Add ReleaseDefinition model with capability constants and HasCapability method - Add release-definitions loader (--loader release-definitions) that fetches from BQ and syncs to PG via upsert - getReleases() in the server prefers PG, falls back to BQ - PG data provider QueryReleases() reads from release_definitions instead of deriving from prow_jobs + hardcoded map - Seed data populates release_definitions for local development - Fix stale "from big query" error messages in server.go Ref: TRT-2734 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c3dca6d to
23f283f
Compare
|
Scheduling required tests: |
|
@mstaeble: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| func (s *Server) getReleases(ctx context.Context, forceRefresh ...bool) ([]sippyv1.Release, error) { | ||
| if s.bigQueryClient != nil { | ||
| refresh := len(forceRefresh) > 0 && forceRefresh[0] | ||
| return api.GetReleases(ctx, s.bigQueryClient, refresh) | ||
| // getReleases returns release data from PostgreSQL. | ||
| func (s *Server) getReleases(ctx context.Context) ([]sippyv1.Release, error) { | ||
| if s.db != nil { | ||
| releases, err := api.GetReleasesFromDB(ctx, s.db) |
There was a problem hiding this comment.
this removes the cache. reading from a small postgres table is fast, but not nearly as fast as reading from redis, and this page gets called.... a lot. with React calling it several times per UI transition it may even be noticeable at a human timescale. consider whether we don't want caching here?
There was a problem hiding this comment.
Good call raising this. I traced all the callers to understand the full impact.
getReleases() is called by 8 HTTP handlers:
- /api/releases (page load, ~15 calls/hour)
- /api/component_readiness (every CR request)
- /api/component_readiness/test_details
- /api/component_readiness/views
- 4 triage/regression endpoints
The old Redis cache (8h TTL, key "Releases~") was valuable because it avoided a BigQuery round-trip on every one of these requests. That saved both latency and BQ query cost.
With PostgreSQL, the query is ~7ms for ~20 rows. The CR endpoints themselves take seconds, so 7ms is noise. There's no per-query cost like BQ.
More importantly, these callers are transitional. The CR handlers use release data for two things: resolving relative time strings like "ga-30d" into absolute dates, and generating HATEOAS links with correct time parameters. As we move CR fully to PostgreSQL:
- The CR test status queries will use pre-aggregated matviews per release (e.g., cr_base_agg_4.21), where the GA-based time window is baked into the matview at build time. No release metadata lookup needed at request time.
- For any queries that still need release dates, the lookup can fold into the main query as a JOIN against release_definitions rather than a separate round-trip.
That leaves /api/releases as the only endpoint that needs a standalone query, at ~15 calls/hour. Adding a Redis cache layer for that adds complexity with no user-visible benefit. The HTTP round-trip overhead alone (~50-200ms) dwarfs the 7ms PG query.
If we see this become a bottleneck in practice, we can add caching for PQ later, but I don't think it's warranted now.
Summary
release_definitionstable in PostgreSQL to store release metadata (GA dates, development start dates, previous release, capabilities, product, status) previously only available in BigQueryrelease-definitionsloader (--loader release-definitions) that fetches release rows from BQ and syncs them to PostgreSQL during the data load cycle/api/releasesand the PG data provider now read from therelease_definitionstable instead of BigQuery or the hardcodedreleaseMetadatamapgetReleases()in the server prefers PG, falls back to BQA follow-up PR will replace
v1.Releasewithmodels.ReleaseDefinitionacross all internal consumers, removeQueryReleases/QueryReleaseDatesfrom theDataProviderinterface, and eliminate the remaining BQ release functions.Test plan
go build ./...passesgo vet ./...passesmake lintpassesgo test ./pkg/... ./cmd/...passessippy servewith seed data serves/api/releasesfrom PostgreSQL with correct GA dates, capabilities, and previous release chainStaging verification
Deployed the branch to
sippy-stagingand ran the release-definitions loader as a one-off job:Synced 36 release definitions from BigQuery to staging PostgreSQL (all OCP releases 3.11 through 5.0, OKD, ROSA, ARO, HyperShift, and CAPI entries).
Verified the following endpoints on
sippy-staging.dptools.openshift.org:GET /api/releasesGET /api/releases/health?release=4.22GET /api/component_readiness(4.21→4.22 with full params)GET /api/component_readiness/regressions?release=4.22All release data served from PostgreSQL with no BigQuery calls.
Ref: TRT-2734
@coderabbitai ignore
🤖 Generated with Claude Code