
[release] v0.103.3 #4646

Merged: bekossy merged 73 commits into main from release/v0.103.3 on Jun 12, 2026

@github-actions

New version v0.103.3 in

  • web
    • web/oss
    • web/ee
  • services
  • api
  • sdks
    • sdks/python
  • clients
    • clients/python
    • clients/typescript
  • kubernetes
    • kubernetes/helm

ardaerzin and others added 20 commits June 7, 2026 18:41
- Shared 'Edit evaluation' drawer (name/description + evaluators) opened from a run-header
  actions dropdown (all tabs), the config General 'Edit' button, and the evaluations-table
  row action; the config General section is now display-only.
- Jotai mutation flow (editSimpleEvaluation + process slice) with a terminal-gated
  background refresh so the evaluations list and the run scenarios table converge reliably
  (columns, metric cells, status) after an edit.
- Resolve evaluator output metrics for staged (pending) evaluators in the drawer.
- Dark mode fixes: drawer edge shadow, entity-picker hover/selected highlight, and the
  cascader child-panel loading/loaded width jump.
dispatch_run_slice re-activates the run (status=RUNNING, is_active=True) before dispatching
the worker, so the status indicator reflects the reprocess; _finalize_run_after_slice floors
it back to terminal when scoring completes. Adds an acceptance probe for the edit+process path.
Link ids recovered from stored result cells on the re-run/process path arrive as dashed
UUIDs (live spans send bare hex); both encode the same integer. Strip dashes before base-16
parsing so add_link no longer raises ValueError on the hyphens.
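A minimal sketch of the normalization described above, assuming the link id is a 128-bit value that arrives either as bare hex or as a dashed UUID string. The helper name `parse_link_id` is hypothetical and is not taken from the codebase.

```python
import uuid


def parse_link_id(raw: str) -> int:
    """Parse a 128-bit id written as bare hex or as a dashed UUID string."""
    # int(..., 16) rejects hyphens, so strip them before base-16 parsing.
    return int(raw.replace("-", ""), 16)


dashed = "0af76519-16cd-43dd-8448-eb211c80319c"  # shape recovered from a stored cell
bare = "0af7651916cd43dd8448eb211c80319c"        # shape sent by a live span

# Both spellings encode the same integer.
assert parse_link_id(dashed) == parse_link_id(bare) == uuid.UUID(dashed).int
```

Stripping the hyphens keeps a single parsing path for both sources, so the caller does not need to know where the id came from.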
Drops the exploratory acceptance probe added alongside the run-status change; it was a
proof-of-contract probe, not a maintained test.
… slice

Mirrors the run-level re-activation at the scenario level so per-scenario status
indicators also reflect the reprocess; dispatch_run_slice now bulk-sets the addressed
scenarios to RUNNING/is_active before dispatch (full-PUT edit preserves flags/interval/
timestamp/meta), and the engine writes each scenario's terminal status back on completion.
…ding an evaluator

The post-edit background refresh now (1) matches any query scoped to the run id
(reload-equivalent — covers the scenario rows+status query the old allowlist missed),
(2) detects run completion authoritatively via the run batcher instead of getQueryData,
and (3) invalidates twice (now + a short settle) so cell results that persist just after
the run status flips terminal aren't left frozen by the per-scenario poller.
Fixes #4591

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a self-serve "Delete account" flow so users can remove their own
account instead of asking support. EE-only (cloud and self-hosted EE);
OSS keeps its shared singleton org and does not expose the route.

DELETE /profile deletes the caller, the organizations they solely own,
their SuperTokens login, their Stripe subscription, and their Loops
contact, in that order. SuperTokens is deleted before the DB cascade so
the idempotent signup override cannot recreate the account on next login.
If the user owns an org with other members, the request is blocked (409)
rather than deleting the team's data.
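The shared-organization guard can be sketched as follows. This is an in-memory model only: `Organization`, `SharedOrganizationError`, and `organizations_to_delete` are hypothetical names, and the real route works against the database and external services rather than plain objects.

```python
from dataclasses import dataclass, field


class SharedOrganizationError(Exception):
    """Raised when deletion would remove data that other members rely on."""

    status_code = 409  # the status the commit message says the route returns


@dataclass
class Organization:
    id: str
    owner_id: str
    member_ids: set = field(default_factory=set)


def organizations_to_delete(user_id, organizations):
    """Return ids of orgs the user solely owns; raise if an owned org is shared."""
    owned = [org for org in organizations if org.owner_id == user_id]
    # An owned org is "shared" when anyone other than the owner is a member.
    shared = [org.id for org in owned if org.member_ids - {user_id}]
    if shared:
        raise SharedOrganizationError(f"owned organizations with other members: {shared}")
    return [org.id for org in owned]


orgs = [
    Organization("solo", "alice", {"alice"}),
    Organization("team", "bob", {"bob", "alice"}),
]
assert organizations_to_delete("alice", orgs) == ["solo"]
```

In this model, calling `organizations_to_delete("bob", orgs)` raises `SharedOrganizationError`, because `bob` owns an organization that `alice` also belongs to.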

Reuses the existing admin cascade (admin_delete_user_with_cascade +
membership cleanup). New pieces: emailing.delete_contact, an HTTP-free
SubscriptionsService.cancel_stripe_subscription, count_organization_members,
and PlatformAdminAccountsService.delete_own_account. Frontend adds an
EE-only Account tab in Settings with a type-your-email confirm modal that
signs the user out on success.

Acceptance tests (happy path + shared-org block) pass against the EE stack.
The testcase row store is a global singleton shared across loadables,
while playground loadables are keyed per anchor revision. Navigating
app to app only cleared the node selection, so the previous app's draft
rows (and a connected test set's server rows) stayed in the global
store and rendered verbatim in the next app's playground. linkToRunnable
also skipped seeding the fresh empty row because the store was
non-empty. The leak has existed since the playground moved to the
loadable architecture in February 2026.

App changes in bindRevisionsReady now dispatch a thin global reset
(loadableController.actions.resetRowsForAppSwitch) before clearing the
selection: delete all rows, reset the server/new/deleted id atoms, and
null currentRevisionId so the paginated testcase query cannot re-append
the previous app's connected rows. The next linkToRunnable then seeds
the canonical single empty row. Same-app re-entry does not hit the
appChanged guard, so drafts still survive playground - registry -
playground navigation within one app.

Verified in Chrome against the dev stack (completion to chat, chat to
completion, same-app re-entry, no console errors) plus five new unit
tests; full entities suite green.
@vercel

vercel Bot commented Jun 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jun 12, 2026 6:54pm |


@dosubot (bot) added the size:XS label (this PR changes 0-9 lines, ignoring generated files) on Jun 11, 2026
jp-agenta and others added 7 commits June 11, 2026 15:24
…h Run

The pre-run strict row clean (#4525) deleted synced test set columns the
prompt does not reference, emptying the unused-columns footer the moment
the user clicked Run. Protect the columns present in the testcase's
server snapshot (minus chat transport keys) in both persisting cleans:
the run-time reconcile and the app-swap prune. Locally accumulated stale
keys are absent from the server snapshot, so the #4525 cleanup still
applies.
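The protection rule can be modelled as simple set arithmetic. The sketch below is in Python for illustration only, since the actual change lives in the frontend. `clean_row` is a hypothetical name, and the contents of `CHAT_TRANSPORT_KEYS` are an assumption, because the commit message does not list the transport keys.

```python
# Assumed placeholder: the commit message does not name the chat transport keys.
CHAT_TRANSPORT_KEYS = frozenset({"messages"})


def clean_row(row: dict, referenced: set, server_snapshot: dict) -> dict:
    """Drop keys the prompt does not reference, except server-synced columns."""
    # Columns that came from the server snapshot are protected from the clean.
    protected = set(server_snapshot) - CHAT_TRANSPORT_KEYS
    keep = set(referenced) | protected
    return {key: value for key, value in row.items() if key in keep}


row = {"country": "FR", "notes": "keep me", "old_var": "stale"}
snapshot = {"country": "FR", "notes": "keep me"}

# "notes" is unused by the prompt but synced, so it survives; "old_var" is
# local-only and is still removed by the strict clean.
assert clean_row(row, {"country"}, snapshot) == {"country": "FR", "notes": "keep me"}
```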

Fixes #4647
Committing a draft replaces the anchor revision id, which renames the
derived loadable id (testset:workflow:<revisionId>). The session relink
(AGE-3785) already moved chat history and execution results to the new
key but left the test set connection behind, so every commit silently
unsynced the test set: the dropdown showed unsynced, the unused-columns
footer (connection-gated) vanished, and the URL snapshot downgraded the
rows to a local test set, which also disabled the server-snapshot
protection added for #4647.

Migrate connectedSourceId/Name/Type, hiddenTestcaseIds and activeRowId
to the new loadable id in the same pass that moves executionResults.
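A rough model of the relink, written in Python for illustration even though the real code is in the frontend. The field names `connectedSourceName` and `connectedSourceType` are assumed expansions of the shorthand "connectedSourceId/Name/Type", and `relink_loadable_state` is a hypothetical helper name.

```python
MIGRATED_FIELDS = (
    "executionResults",
    "connectedSourceId",
    "connectedSourceName",  # assumed expansion of "connectedSourceId/Name/Type"
    "connectedSourceType",  # assumed expansion of "connectedSourceId/Name/Type"
    "hiddenTestcaseIds",
    "activeRowId",
)


def loadable_id(revision_id: str) -> str:
    """Derive the loadable id from the anchor revision id."""
    return f"testset:workflow:{revision_id}"


def relink_loadable_state(states: dict, old_revision_id: str, new_revision_id: str) -> None:
    """Copy the migrated fields from the old loadable key to the new one."""
    old_state = states.get(loadable_id(old_revision_id))
    if old_state is None:
        return  # nothing to carry over
    new_state = states.setdefault(loadable_id(new_revision_id), {})
    for name in MIGRATED_FIELDS:
        if name in old_state:
            new_state[name] = old_state[name]


states = {loadable_id("rev-1"): {"connectedSourceId": "ts-42", "activeRowId": "row-3"}}
relink_loadable_state(states, "rev-1", "rev-2")
assert states[loadable_id("rev-2")]["connectedSourceId"] == "ts-42"
```

Moving the connection fields in the same pass as the execution results means the new key never exists in a half-migrated state.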
bekossy and others added 2 commits June 12, 2026 15:49
Resolved conflict in playgroundController.ts: kept the release branch's
more complete loadable state migration — spreads all state fields (including
disabledOutputMappingRowIds) and evicts the old family key via
loadableStateAtomFamily.remove, which supersedes the surgical HEAD approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rchival-flows

[Feat]: Add bulk-archive actions for apps, testsets and evaluators
@dosubot (bot) added size:XL (this PR changes 500-999 lines, ignoring generated files) and removed size:L (this PR changes 100-499 lines, ignoring generated files) labels on Jun 12, 2026
…r-columns

[fix] Keep unused test set columns on playground rows through Run
mmabrouk and others added 7 commits June 12, 2026 18:13
Account deletion 500'd for users who had accepted an invitation into
another organization. Accepting an invite stamps the host org's
project_invitations.user_id, and that FK (like the modified_by_id audit
columns and webhook_subscriptions.created_by_id) has no ON DELETE rule.
The host org survives the cascade, so DELETE FROM users hit a foreign
key violation after SuperTokens and Stripe were already processed, and
the frontend never ran the logout flow.

Clear those references inside the same transaction before the user row
is deleted: drop the user's invitation rows and webhook subscriptions
(created_by_id is NOT NULL), and null the modified_by_id audit columns.
This also fixes the same latent bug in the admin delete path.
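The failure and the fix can be reproduced in miniature with SQLite. This is only a sketch: the production database and the real schema are not shown in this PR, `audited_rows` is a hypothetical stand-in for any table with a `modified_by_id` audit column, and `delete_user` is a hypothetical helper name.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY);
CREATE TABLE project_invitations (
    id INTEGER PRIMARY KEY, user_id TEXT REFERENCES users(id));
CREATE TABLE webhook_subscriptions (
    id INTEGER PRIMARY KEY, created_by_id TEXT NOT NULL REFERENCES users(id));
CREATE TABLE audited_rows (
    id INTEGER PRIMARY KEY, modified_by_id TEXT REFERENCES users(id));
"""


def make_db() -> sqlite3.Connection:
    """Build an in-memory database where one user is referenced three ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO users VALUES ('u1')")
    conn.execute("INSERT INTO project_invitations (user_id) VALUES ('u1')")
    conn.execute("INSERT INTO webhook_subscriptions (created_by_id) VALUES ('u1')")
    conn.execute("INSERT INTO audited_rows (modified_by_id) VALUES ('u1')")
    conn.commit()
    return conn


def delete_user(conn: sqlite3.Connection, user_id: str) -> None:
    """Clear references that have no ON DELETE rule, then delete the user."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM project_invitations WHERE user_id = ?", (user_id,))
        # created_by_id is NOT NULL, so these rows are dropped rather than nulled.
        conn.execute("DELETE FROM webhook_subscriptions WHERE created_by_id = ?", (user_id,))
        conn.execute(
            "UPDATE audited_rows SET modified_by_id = NULL WHERE modified_by_id = ?",
            (user_id,),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

Running a bare `DELETE FROM users` against `make_db()` raises `sqlite3.IntegrityError`, which mirrors the foreign key violation described above. `delete_user` succeeds and leaves the audited row in place with a null `modified_by_id`.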

Regression test drives the real invite, accept, delete flow.
…maless-evaluators

[fix] Resolve broken filtering for schemaless `evaluators`
@dosubot (bot) added size:XXL (this PR changes 1000+ lines, ignoring generated files) and removed size:XL (this PR changes 500-999 lines, ignoring generated files) labels on Jun 12, 2026
bekossy added 2 commits June 12, 2026 20:38
…-labels

fix(frontend): label evaluator nodes in the playground by their entity name
fix(frontend): resolve evaluation names for SDK-created apps and new evaluators
bekossy added 2 commits June 12, 2026 20:51
…sting-eval

[FE / Feat] Add evaluators to existing evals
@bekossy merged commit 28f5007 into main on Jun 12, 2026
27 of 31 checks passed
@dosubot (bot) added the lgtm label (this PR has been approved by a maintainer) on Jun 12, 2026
Labels

  • lgtm: This PR has been approved by a maintainer
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.

6 participants