[release] v0.103.3 by github-actions[bot] · Pull Request #4646 · Agenta-AI/agenta

github-actions · 2026-06-11T13:16:54Z

New version v0.103.3 in:

- web
  - web/oss
  - web/ee
- services
- api
- sdks
  - sdks/python
- clients
  - clients/python
  - clients/typescript
- kubernetes
  - kubernetes/helm

- Shared 'Edit evaluation' drawer (name/description + evaluators) opened from a run-header actions dropdown (all tabs), the config General 'Edit' button, and the evaluations-table row action; the config General section is now display-only.
- Jotai mutation flow (editSimpleEvaluation + process slice) with a terminal-gated background refresh so the evaluations list and the run scenarios table converge reliably (columns, metric cells, status) after an edit.
- Resolve evaluator output metrics for staged (pending) evaluators in the drawer.
- Dark mode fixes: drawer edge shadow, entity-picker hover/selected highlight, and the cascader child-panel loading/loaded width jump.

dispatch_run_slice re-activates the run (status=RUNNING, is_active=True) before dispatching the worker, so the status indicator reflects the reprocess; _finalize_run_after_slice floors it back to terminal when scoring completes. Adds an acceptance probe for the edit+process path.

Link ids recovered from stored result cells on the re-run/process path arrive as dashed UUIDs (live spans send bare hex); both encode the same integer. Strip dashes before base-16 parsing so add_link no longer raises ValueError on the hyphens.
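The dash-stripping fix described above can be illustrated with a minimal Python sketch. The helper name `link_id_to_int` and the sample id are assumptions made for illustration; only `add_link` and the `ValueError` come from the description, and the real parsing code may be structured differently.

```python
import uuid


def link_id_to_int(link_id: str) -> int:
    """Parse a link id given either as bare hex or as a dashed UUID string."""
    # Hypothetical helper: removing the hyphens makes both spellings
    # valid base-16 literals that encode the same integer.
    return int(link_id.replace("-", ""), 16)


bare = "0123456789abcdef0123456789abcdef"
dashed = str(uuid.UUID(bare))  # same value, written with hyphens

# Both spellings parse to the same integer once dashes are stripped.
assert link_id_to_int(bare) == link_id_to_int(dashed) == uuid.UUID(bare).int
```

Calling `int(dashed, 16)` directly raises `ValueError`, because a hyphen is not a valid base-16 digit.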

Drops the exploratory acceptance probe added alongside the run-status change; it was a proof-of-contract probe, not a maintained test.

… slice

Mirrors the run-level re-activation at the scenario level so per-scenario status indicators also reflect the reprocess; dispatch_run_slice now bulk-sets the addressed scenarios to RUNNING/is_active before dispatch (full-PUT edit preserves flags/interval/timestamp/meta), and the engine writes each scenario's terminal status back on completion.

…ding an evaluator

The post-edit background refresh now (1) matches any query scoped to the run id (reload-equivalent: this covers the scenario rows+status query the old allowlist missed), (2) detects run completion authoritatively via the run batcher instead of getQueryData, and (3) invalidates twice (now + a short settle) so cell results that persist just after the run status flips terminal aren't left frozen by the per-scenario poller.

Fixes #4591
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add a self-serve "Delete account" flow so users can remove their own account instead of asking support. EE-only (cloud and self-hosted EE); OSS keeps its shared singleton org and does not expose the route.

DELETE /profile deletes the caller, the organizations they solely own, their SuperTokens login, their Stripe subscription, and their Loops contact, in that order. SuperTokens is deleted before the DB cascade so the idempotent signup override cannot recreate the account on next login. If the user owns an org with other members, the request is blocked (409) rather than deleting the team's data.

Reuses the existing admin cascade (admin_delete_user_with_cascade + membership cleanup). New pieces: emailing.delete_contact, an HTTP-free SubscriptionsService.cancel_stripe_subscription, count_organization_members, and PlatformAdminAccountsService.delete_own_account.

Frontend adds an EE-only Account tab in Settings with a type-your-email confirm modal that signs the user out on success. Acceptance tests (happy path + shared-org block) pass against the EE stack.
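The shared-org guard (block with 409 when an owned organization has other members) could look roughly like the sketch below. Every name here (`OwnedOrg`, `SharedOrganizationError`, `orgs_safe_to_delete`) is hypothetical, and the sketch covers only the blocking decision, not the actual cascade or the third-party cleanup.

```python
from dataclasses import dataclass


class SharedOrganizationError(Exception):
    """Hypothetical error that an API layer would translate into HTTP 409."""

    status_code = 409


@dataclass(frozen=True)
class OwnedOrg:
    org_id: str
    member_count: int  # includes the owner


def orgs_safe_to_delete(owned: list[OwnedOrg]) -> list[str]:
    """Return ids of solely-owned orgs, or refuse if any org has other members."""
    shared = [org.org_id for org in owned if org.member_count > 1]
    if shared:
        # Refuse the whole request instead of deleting a team's data.
        raise SharedOrganizationError(f"organizations with other members: {shared}")
    return [org.org_id for org in owned]
```

Running the check before any destructive step means a blocked request leaves the account untouched.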

The testcase row store is a global singleton shared across loadables, while playground loadables are keyed per anchor revision. Navigating app to app only cleared the node selection, so the previous app's draft rows (and a connected test set's server rows) stayed in the global store and rendered verbatim in the next app's playground. linkToRunnable also skipped seeding the fresh empty row because the store was non-empty. The leak has existed since the playground moved to the loadable architecture in February 2026.

App changes in bindRevisionsReady now dispatch a thin global reset (loadableController.actions.resetRowsForAppSwitch) before clearing the selection: delete all rows, reset the server/new/deleted id atoms, and null currentRevisionId so the paginated testcase query cannot re-append the previous app's connected rows. The next linkToRunnable then seeds the canonical single empty row. Same-app re-entry does not hit the appChanged guard, so drafts still survive playground -> registry -> playground navigation within one app.

Verified in Chrome against the dev stack (completion to chat, chat to completion, same-app re-entry, no console errors) plus five new unit tests; full entities suite green.

vercel · 2026-06-11T13:16:59Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 12, 2026 6:54pm

…h Run

The pre-run strict row clean (#4525) deleted synced test set columns the prompt does not reference, emptying the unused-columns footer the moment the user clicked Run. Protect the columns present in the testcase's server snapshot (minus chat transport keys) in both persisting cleans: the run-time reconcile and the app-swap prune. Locally accumulated stale keys are absent from the server snapshot, so the #4525 cleanup still applies.

Fixes #4647
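The protection rule above reduces to a small piece of set logic. The sketch below is in Python purely for illustration (the change itself is in the frontend); `clean_row`, its parameters, and the `messages` transport key are assumptions, not names from the codebase.

```python
def clean_row(row, referenced, server_snapshot_keys, transport_keys=frozenset()):
    """Drop stale keys from a testcase row while keeping synced columns.

    Hypothetical sketch: a key survives if the prompt references it, or if it
    is present in the server snapshot and is not a chat transport key.
    """
    protected = set(server_snapshot_keys) - set(transport_keys)
    keep = set(referenced) | protected
    return {key: value for key, value in row.items() if key in keep}


row = {"question": "q", "country": "FR", "stale_local": "x", "messages": []}
cleaned = clean_row(
    row,
    referenced={"question"},
    server_snapshot_keys={"question", "country", "messages"},
    transport_keys={"messages"},  # assumed transport key, for illustration only
)
# "country" is unused by the prompt but synced, so it is kept;
# "stale_local" is absent from the server snapshot, so it is removed.
assert cleaned == {"question": "q", "country": "FR"}
```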

Committing a draft replaces the anchor revision id, which renames the derived loadable id (testset:workflow:<revisionId>). The session relink (AGE-3785) already moved chat history and execution results to the new key but left the test set connection behind, so every commit silently unsynced the test set: the dropdown showed unsynced, the unused-columns footer (connection-gated) vanished, and the URL snapshot downgraded the rows to a local test set, which also disabled the server-snapshot protection added for #4647. Migrate connectedSourceId/Name/Type, hiddenTestcaseIds and activeRowId to the new loadable id in the same pass that moves executionResults.

Resolved conflict in playgroundController.ts: kept the release branch's more complete loadable state migration, which spreads all state fields (including disabledOutputMappingRowIds) and evicts the old family key via loadableStateAtomFamily.remove; this supersedes the surgical HEAD approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…rchival-flows [Feat]: Add bulk-archive actions for apps, testsets and evaluators

…r-columns [fix] Keep unused test set columns on playground rows through Run

Account deletion 500'd for users who had accepted an invitation into another organization. Accepting an invite stamps the host org's project_invitations.user_id, and that FK (like the modified_by_id audit columns and webhook_subscriptions.created_by_id) has no ON DELETE rule. The host org survives the cascade, so DELETE FROM users hit a foreign key violation after SuperTokens and Stripe were already processed, and the frontend never ran the logout flow.

Clear those references inside the same transaction before the user row is deleted: drop the user's invitation rows and webhook subscriptions (created_by_id is NOT NULL), and null the modified_by_id audit columns. This also fixes the same latent bug in the admin delete path.

Regression test drives the real invite, accept, delete flow.
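The failure mode and the fix can be reproduced in miniature with SQLite. This is a sketch under stated assumptions, not the project's code: the schema is simplified, the `projects` table stands in for whichever tables carry `modified_by_id`, and `make_db` and `delete_user` are hypothetical helpers.

```python
import sqlite3

# Simplified, assumed schema: none of the foreign keys has an ON DELETE rule.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE project_invitations (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id)
);
CREATE TABLE webhook_subscriptions (
    id INTEGER PRIMARY KEY,
    created_by_id INTEGER NOT NULL REFERENCES users(id)
);
CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    modified_by_id INTEGER REFERENCES users(id)
);
INSERT INTO users (id) VALUES (1);
INSERT INTO project_invitations (id, user_id) VALUES (10, 1);
INSERT INTO webhook_subscriptions (id, created_by_id) VALUES (20, 1);
INSERT INTO projects (id, modified_by_id) VALUES (30, 1);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


def delete_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Clear every reference to the user, then delete the row, in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM project_invitations WHERE user_id = ?", (user_id,))
        # created_by_id is NOT NULL, so these rows are dropped rather than nulled.
        conn.execute("DELETE FROM webhook_subscriptions WHERE created_by_id = ?", (user_id,))
        conn.execute("UPDATE projects SET modified_by_id = NULL WHERE modified_by_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

Without the three cleanup statements, `DELETE FROM users` raises `sqlite3.IntegrityError` in this schema, which mirrors the foreign key violation described above.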

…maless-evaluators [fix] Resolve broken filtering for schemaless `evaluators`

feat(api): Self-serve account deletion from Settings

…-labels fix(frontend): label evaluator nodes in the playground by their entity name

fix(frontend): resolve evaluation names for SDK-created apps and new evaluators

Refactor/workflow name selectors

…sting-eval [FE / Feat] Add evaluators to existing evals

ardaerzin and others added 20 commits June 7, 2026 18:41

chore(api): remove add-evaluators edit-path acceptance probe

a8b2422

Drops the exploratory acceptance probe added alongside the run-status change; it was a proof-of-contract probe, not a maintained test.

Merge branch 'feat/unified-eval-loops' into fe-feat/add-evaluators-to…

ecafe67

…-existing-eval

fix(frontend): preserve connected testset when opening Manage testcases

d8cc7ce

Fixes #4591
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge branch 'main' into fe-fix/4591-manage-testcases-crash

67f43ef

fix(playground): ensure connected testset remains linked after prompt…

fd20e95

… revision commit

Merge branch 'feat/unified-eval-loops' into fe-feat/add-evaluators-to…

2961145

…-existing-eval

feat(frontend): add slug column to workflow tables

36c8cf7

[#4620] fix(api): assign revision versions pre-insert under a variant…

88bd5d4

… lock to prevent duplicates

fix(frontend): wire evaluator bulk archive selection

0549305

fix(frontend): shorten bulk archive labels

16c7307

Merge branch 'release/v0.103.2' into fe-fix/4591-manage-testcases-crash

2aceb95

[fix] Resolve broken scenario status in annotation queues

09fe32f

v0.103.3

ba5913a

dosubot Bot added the size:XS label ("This PR changes 0-9 lines, ignoring generated files") Jun 11, 2026

vercel Bot deployed to Preview June 11, 2026 13:18 View deployment

jp-agenta and others added 7 commits June 11, 2026 15:24

fix CR

4ddd94e

[fix] Resolve broken filtering for schemaless evaluators

1110bce

fix CR

9d8b0aa

[fix] Resolve broken references in playground

be9224a

fix CR

9219dec

bekossy and others added 2 commits June 12, 2026 15:49

Merge pull request #4629 from Agenta-AI/code/update-app-and-testset-a…

85b69f6

…rchival-flows [Feat]: Add bulk-archive actions for apps, testsets and evaluators

dosubot Bot added the size:XL label ("This PR changes 500-999 lines, ignoring generated files") and removed the size:L label ("This PR changes 100-499 lines, ignoring generated files") Jun 12, 2026

vercel Bot deployed to Preview June 12, 2026 14:09 View deployment

Merge pull request #4649 from Agenta-AI/wip/playground-testcase-serve…

903e3e1

…r-columns [fix] Keep unused test set columns on playground rows through Run

vercel Bot deployed to Preview June 12, 2026 14:37 View deployment

mmabrouk and others added 7 commits June 12, 2026 18:13

Merge branch 'release/v0.103.3' into fix/evaluation-name-fallbacks

d796de5

Merge branch 'fix/evaluation-name-fallbacks' into fix/playground-eval…

feb7ab7

…uator-node-labels

Merge branch 'fix/playground-evaluator-node-labels' into refactor/wor…

aac6fa9

…kflow-name-selectors

Merge branch 'release/v0.103.3' into fix/broken-filtering-for-schemal…

dba2d8b

…ess-evaluators

Merge branch 'release/v0.103.3' into feat/self-serve-account-deletion

6ca23a6

Merge pull request #4650 from Agenta-AI/fix/broken-filtering-for-sche…

a4770cf

…maless-evaluators [fix] Resolve broken filtering for schemaless `evaluators`

vercel Bot deployed to Preview June 12, 2026 18:13 View deployment

bekossy and others added 3 commits June 12, 2026 20:13

Merge branch 'release/v0.103.3' into fe-feat/add-evaluators-to-existi…

106c826

…ng-eval

Fix account deletion review feedback

a38ed9f

Merge pull request #4600 from Agenta-AI/feat/self-serve-account-deletion

fa31f06

feat(api): Self-serve account deletion from Settings

dosubot Bot added the size:XXL label ("This PR changes 1000+ lines, ignoring generated files") and removed the size:XL label ("This PR changes 500-999 lines, ignoring generated files") Jun 12, 2026

vercel Bot deployed to Preview June 12, 2026 18:24 View deployment

bekossy added 2 commits June 12, 2026 20:38

Merge pull request #4668 from Agenta-AI/fix/playground-evaluator-node…

f7baff7

…-labels fix(frontend): label evaluator nodes in the playground by their entity name

Merge pull request #4661 from Agenta-AI/fix/evaluation-name-fallbacks

de32877

fix(frontend): resolve evaluation names for SDK-created apps and new evaluators

vercel Bot deployed to Preview June 12, 2026 18:45 View deployment

bekossy added 2 commits June 12, 2026 20:51

Merge pull request #4685 from Agenta-AI/refactor/workflow-name-selectors

ea0e6cd

Refactor/workflow name selectors

Merge pull request #4577 from Agenta-AI/fe-feat/add-evaluators-to-exi…

6e8b589

…sting-eval [FE / Feat] Add evaluators to existing evals

vercel Bot deployed to Preview June 12, 2026 18:54 View deployment

bekossy approved these changes Jun 12, 2026

bekossy merged commit 28f5007 into main Jun 12, 2026
27 of 31 checks passed

dosubot Bot added the lgtm label ("This PR has been approved by a maintainer") Jun 12, 2026

bekossy merged 73 commits into main from release/v0.103.3

6 participants
