Skip to content

feat(BA-5936): add Pruner repository abstraction#11460

Draft
fregataa wants to merge 4 commits intomainfrom
feat/BA-5936-pruner-abstraction
Draft

feat(BA-5936): add Pruner repository abstraction#11460
fregataa wants to merge 4 commits intomainfrom
feat/BA-5936-pruner-abstraction

Conversation

@fregataa
Copy link
Copy Markdown
Member

@fregataa fregataa commented May 1, 2026

Summary

  • Introduce PrunerSpec[TRow] abstraction parallel to existing Creator/Updater/Purger patterns. Subclasses declaratively encode the entity's prune contract via row_class(), returning_id(), prune_condition(), and optional entity_type() classmethods; per-call params (conditions, cascade, limit, cascade_rbac) live on the instance.
  • CascadeChild abstraction handles FK-dependent child tables (e.g., kernels of pruned sessions); entity_type() opts into RBAC association cleanup baked into execute_pruner for cross-cutting association_scopes_entities semantics.
  • execute_pruner runs SELECT pk FOR UPDATE LIMIT once, materializes IDs, then issues cascade DELETEs and the parent DELETE...RETURNING from the same locked set — race-safe and avoids re-evaluating the parent SELECT per cascade. Hard-capped at DEFAULT_PRUNE_LIMIT = 100_000 to bound lock set, memory, and transaction duration.

Test plan

  • pants test tests/unit/manager/repositories/base/test_pruner.py — 11 tests covering basic prune, runtime conditions, limit, FK cascade, RBAC cleanup with entity_type filtering, and the default no-op (entity_type=None) path.
  • pants fmt, pants fix, pants lint, pants check all pass on changed files.
  • CI: full type check + test suite.

Notes

This adds the abstraction only. Concrete per-entity Pruner specs (session, vfolder, etc.) and the v2 REST prune endpoint follow as separate PRs under epic BA-5935.

Resolves BA-5936

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 1, 2026 18:17
@github-actions github-actions Bot added size:XL 500~ LoC comp:manager Related to Manager component labels May 1, 2026
fregataa added a commit that referenced this pull request May 1, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new repository-layer “Pruner” abstraction for race-safe bulk deletions, aligned with existing Creator/Updater/Purger patterns, plus integration tests exercising cascading deletes and optional RBAC association cleanup.

Changes:

  • Introduce PrunerSpec, CascadeChild, PrunerResult, and execute_pruner() to perform SELECT ... FOR UPDATE-based pruning with optional cascades.
  • Add RBAC association cleanup support via PrunerSpec.entity_type() + cascade_rbac.
  • Add real-DB integration tests covering basic pruning, runtime conditions, limit behavior, FK cascades, and RBAC cleanup.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/ai/backend/manager/repositories/base/pruner.py New pruner abstraction + executor implementing lock-then-delete flow with cascade support.
src/ai/backend/manager/repositories/base/__init__.py Re-export new pruner APIs from the base repository package.
tests/unit/manager/repositories/base/test_pruner.py Integration tests validating pruning, cascade deletes, and RBAC cleanup behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/ai/backend/manager/repositories/base/pruner.py
Comment thread src/ai/backend/manager/repositories/base/pruner.py Outdated
Comment thread src/ai/backend/manager/repositories/base/pruner.py Outdated
Comment thread src/ai/backend/manager/repositories/base/pruner.py Outdated
Comment thread tests/unit/manager/repositories/base/test_pruner.py Outdated
@fregataa fregataa marked this pull request as draft May 2, 2026 05:13
fregataa and others added 4 commits May 2, 2026 14:14
Introduces a declarative spec for bulk-delete (prune) operations parallel
to the existing Creator/Updater/Purger patterns:

- PrunerSpec[TRow] abstract base with row_class/returning_id/prune_condition
  classmethods plus per-call conditions/cascade/limit/cascade_rbac fields.
- CascadeChild abstraction for FK-dependent child tables (e.g., kernels of
  pruned sessions) deleted within the same transaction.
- entity_type() classmethod opts the spec into RBAC association cleanup
  via association_scopes_entities (cross-cutting, built into the executor).
- execute_pruner runs SELECT pk FOR UPDATE LIMIT once, materializes IDs,
  then issues cascade DELETEs and the parent DELETE...RETURNING from the
  same locked set — race-safe and avoids re-evaluating the parent SELECT
  per cascade.
- DEFAULT_PRUNE_LIMIT (100k) caps lock set, memory, and transaction
  duration; operators run multiple calls to drain larger backlogs.

Includes integration tests covering basic prune, runtime conditions,
limit, FK cascade, RBAC cleanup with entity_type filtering, and the
default no-op (entity_type=None) path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Wrap all DELETEs (cascade, RBAC, parent) in a single try/except for
  IntegrityError so cascade failures also surface as
  RepositoryIntegrityError instead of raw SQLAlchemy errors.
- Drop PrunerSpec.returning_id() — derive the parent PK column directly
  from row_class().__table__.primary_key, mirroring the Purger pattern.
  Reject composite-PK tables with UnsupportedCompositePrimaryKeyError.
- Update CascadeChild docstring to reflect the materialized-id-list
  approach (the previous SQL-subquery wording was stale).
- Tighten test_no_cascade_with_fk_violation_raises to expect
  ForeignKeyViolationError, also covering the parse_integrity_error path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parameterize CascadeChild on the cascade table's Row class
(``CascadeChild[TRow: Base]``) for consistency with PrunerSpec[TRow]
and to give subclass authors precise typing on ``row_class()`` —
e.g., ``CascadeChild[KernelRow]`` makes ``row_class()`` return
``type[KernelRow]`` instead of the over-broad ``type[Base]``.

PrunerSpec.cascade is typed as ``list[CascadeChild[Any]]`` since a
spec composes cascades over heterogeneous tables.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fregataa fregataa force-pushed the feat/BA-5936-pruner-abstraction branch from 8fb7909 to 2324c77 Compare May 2, 2026 05:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:manager Related to Manager component size:XL 500~ LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants