[release] v0.103.3 #4646
Merged
- Shared 'Edit evaluation' drawer (name/description + evaluators) opened from a run-header actions dropdown (all tabs), the config General 'Edit' button, and the evaluations-table row action; the config General section is now display-only.
- Jotai mutation flow (editSimpleEvaluation + process slice) with a terminal-gated background refresh so the evaluations list and the run scenarios table converge reliably (columns, metric cells, status) after an edit.
- Resolve evaluator output metrics for staged (pending) evaluators in the drawer.
- Dark mode fixes: drawer edge shadow, entity-picker hover/selected highlight, and the cascader child-panel loading/loaded width jump.
dispatch_run_slice re-activates the run (status=RUNNING, is_active=True) before dispatching the worker, so the status indicator reflects the reprocess; _finalize_run_after_slice floors it back to terminal when scoring completes. Adds an acceptance probe for the edit+process path.
Link ids recovered from stored result cells on the re-run/process path arrive as dashed UUIDs (live spans send bare hex); both encode the same integer. Strip dashes before base-16 parsing so add_link no longer raises ValueError on the hyphens.
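A minimal sketch of the dash-stripping described above, assuming the id arrives as a string. The helper name `parse_link_id` is hypothetical and is not the project's `add_link`; it only illustrates why the dashed and bare forms parse to the same integer once the hyphens are removed.

```python
import uuid


def parse_link_id(raw: str) -> int:
    """Parse a link id given as bare hex or as a dashed UUID string.

    Hypothetical helper for illustration only.
    """
    # Hyphens are separators, not digits; int(..., 16) rejects them,
    # so drop them before base-16 parsing.
    return int(raw.replace("-", ""), 16)


dashed = "123e4567-e89b-12d3-a456-426614174000"  # form seen on the re-run path
bare = "123e4567e89b12d3a456426614174000"        # form sent by live spans

# Both spellings encode the same integer.
assert parse_link_id(dashed) == parse_link_id(bare) == uuid.UUID(dashed).int

# Without stripping, base-16 parsing fails on the hyphens.
try:
    int(dashed, 16)
except ValueError as exc:
    print("unstripped parse fails:", exc)
```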
Drops the exploratory acceptance probe added alongside the run-status change; it was a proof-of-contract probe, not a maintained test.
… slice
Mirrors the run-level re-activation at the scenario level so per-scenario status indicators also reflect the reprocess; dispatch_run_slice now bulk-sets the addressed scenarios to RUNNING/is_active before dispatch (full-PUT edit preserves flags/interval/timestamp/meta), and the engine writes each scenario's terminal status back on completion.
…ding an evaluator
The post-edit background refresh now (1) matches any query scoped to the run id (reload-equivalent; covers the scenario rows+status query the old allowlist missed), (2) detects run completion authoritatively via the run batcher instead of getQueryData, and (3) invalidates twice (now + a short settle) so cell results that persist just after the run status flips terminal aren't left frozen by the per-scenario poller.
Fixes #4591
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… lock to prevent duplicates
Add a self-serve "Delete account" flow so users can remove their own account instead of asking support. EE-only (cloud and self-hosted EE); OSS keeps its shared singleton org and does not expose the route.

DELETE /profile deletes the caller, the organizations they solely own, their SuperTokens login, their Stripe subscription, and their Loops contact, in that order. SuperTokens is deleted before the DB cascade so the idempotent signup override cannot recreate the account on next login. If the user owns an org with other members, the request is blocked (409) rather than deleting the team's data.

Reuses the existing admin cascade (admin_delete_user_with_cascade + membership cleanup). New pieces: emailing.delete_contact, an HTTP-free SubscriptionsService.cancel_stripe_subscription, count_organization_members, and PlatformAdminAccountsService.delete_own_account.

Frontend adds an EE-only Account tab in Settings with a type-your-email confirm modal that signs the user out on success. Acceptance tests (happy path + shared-org block) pass against the EE stack.
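The shared-org guard described above can be sketched roughly as follows. This is an illustration under assumptions, not the service code: the `Org` dataclass and `check_can_delete_account` are hypothetical names, and the sketch covers only the 409 decision, not the SuperTokens, Stripe, Loops, or database steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Org:
    """Hypothetical, simplified view of an organization."""
    id: str
    owner_id: str
    member_ids: frozenset


def check_can_delete_account(user_id: str, orgs: list) -> tuple:
    """Return (status, org_ids).

    409 plus the blocking org ids if the user owns any org that has
    other members; otherwise 200 plus the solely-owned org ids that
    would be deleted together with the account.
    """
    owned = [o for o in orgs if o.owner_id == user_id]
    blocking = [o.id for o in owned if o.member_ids - {user_id}]
    if blocking:
        return 409, blocking
    return 200, [o.id for o in owned]


solo = Org("solo", "u1", frozenset({"u1"}))
team = Org("team", "u1", frozenset({"u1", "u2"}))

print(check_can_delete_account("u1", [solo]))        # allowed: only a solely owned org
print(check_can_delete_account("u1", [solo, team]))  # blocked: "team" has another member
```

Checking the guard before any external system is touched keeps a blocked request free of side effects.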
The testcase row store is a global singleton shared across loadables, while playground loadables are keyed per anchor revision. Navigating app to app only cleared the node selection, so the previous app's draft rows (and a connected test set's server rows) stayed in the global store and rendered verbatim in the next app's playground. linkToRunnable also skipped seeding the fresh empty row because the store was non-empty. The leak has existed since the playground moved to the loadable architecture in February 2026.

App changes in bindRevisionsReady now dispatch a thin global reset (loadableController.actions.resetRowsForAppSwitch) before clearing the selection: delete all rows, reset the server/new/deleted id atoms, and null currentRevisionId so the paginated testcase query cannot re-append the previous app's connected rows. The next linkToRunnable then seeds the canonical single empty row. Same-app re-entry does not hit the appChanged guard, so drafts still survive playground to registry to playground navigation within one app.

Verified in Chrome against the dev stack (completion to chat, chat to completion, same-app re-entry, no console errors) plus five new unit tests; full entities suite green.
…h Run
The pre-run strict row clean (#4525) deleted synced test set columns the prompt does not reference, emptying the unused-columns footer the moment the user clicked Run. Protect the columns present in the testcase's server snapshot (minus chat transport keys) in both persisting cleans: the run-time reconcile and the app-swap prune. Locally accumulated stale keys are absent from the server snapshot, so the #4525 cleanup still applies.
Fixes #4647
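The protection rule can be read as plain set logic. The sketch below is written in Python purely for illustration (the actual change lives in the frontend); `clean_row`, its parameters, and the example transport key `"messages"` are all assumptions and not names from the codebase.

```python
# Hypothetical set of chat transport keys; the real list is not given in the source.
CHAT_TRANSPORT_KEYS = frozenset({"messages"})


def clean_row(row: dict, referenced: set, server_snapshot: dict,
              transport_keys: frozenset = CHAT_TRANSPORT_KEYS) -> dict:
    """Strict clean that keeps prompt-referenced keys plus protected keys.

    Protected keys are the columns present in the server snapshot,
    minus chat transport keys. Anything else is dropped.
    """
    protected = set(server_snapshot) - set(transport_keys)
    keep = set(referenced) | protected
    return {k: v for k, v in row.items() if k in keep}


row = {"country": "FR", "notes": "unused by prompt", "stale": "local leftover"}
snapshot = {"country": "FR", "notes": "unused by prompt"}

cleaned = clean_row(row, referenced={"country"}, server_snapshot=snapshot)
print(sorted(cleaned))  # "notes" survives via the snapshot; "stale" is removed
```

Because the stale key never appears in the server snapshot, it gets no protection and the earlier cleanup behaviour is preserved for it.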
Committing a draft replaces the anchor revision id, which renames the derived loadable id (testset:workflow:<revisionId>). The session relink (AGE-3785) already moved chat history and execution results to the new key but left the test set connection behind, so every commit silently unsynced the test set: the dropdown showed unsynced, the unused-columns footer (connection-gated) vanished, and the URL snapshot downgraded the rows to a local test set, which also disabled the server-snapshot protection added for #4647. Migrate connectedSourceId/Name/Type, hiddenTestcaseIds and activeRowId to the new loadable id in the same pass that moves executionResults.
Resolved conflict in playgroundController.ts: kept the release branch's more complete loadable state migration, which spreads all state fields (including disabledOutputMappingRowIds) and evicts the old family key via loadableStateAtomFamily.remove, and which supersedes the surgical HEAD approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rchival-flows [Feat]: Add bulk-archive actions for apps, testsets and evaluators
…r-columns [fix] Keep unused test set columns on playground rows through Run
Account deletion 500'd for users who had accepted an invitation into another organization. Accepting an invite stamps the host org's project_invitations.user_id, and that FK (like the modified_by_id audit columns and webhook_subscriptions.created_by_id) has no ON DELETE rule. The host org survives the cascade, so DELETE FROM users hit a foreign key violation after SuperTokens and Stripe were already processed, and the frontend never ran the logout flow. Clear those references inside the same transaction before the user row is deleted: drop the user's invitation rows and webhook subscriptions (created_by_id is NOT NULL), and null the modified_by_id audit columns. This also fixes the same latent bug in the admin delete path. Regression test drives the real invite, accept, delete flow.
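The failure and the fix can be reproduced in miniature. The sketch below uses an in-memory SQLite database as a stand-in for the real database, with a simplified schema: `project_invitations.user_id`, `webhook_subscriptions.created_by_id`, and `modified_by_id` come from the description above, while the `apps` table and the `delete_user` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
-- No ON DELETE rule on any of these references, as in the description.
CREATE TABLE project_invitations (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id)
);
CREATE TABLE webhook_subscriptions (
    id INTEGER PRIMARY KEY,
    created_by_id INTEGER NOT NULL REFERENCES users(id)
);
CREATE TABLE apps (  -- hypothetical table carrying an audit column
    id INTEGER PRIMARY KEY,
    modified_by_id INTEGER REFERENCES users(id)
);
INSERT INTO users VALUES (1), (2);
INSERT INTO project_invitations VALUES (10, 1);
INSERT INTO webhook_subscriptions VALUES (20, 1);
INSERT INTO apps VALUES (30, 1);
""")

# Deleting the user row directly violates the foreign keys.
try:
    with conn:
        conn.execute("DELETE FROM users WHERE id = ?", (1,))
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True


def delete_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Clear dangling references, then delete the user, in one transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM project_invitations WHERE user_id = ?", (user_id,))
        # created_by_id is NOT NULL, so these rows are dropped rather than nulled.
        conn.execute("DELETE FROM webhook_subscriptions WHERE created_by_id = ?", (user_id,))
        conn.execute("UPDATE apps SET modified_by_id = NULL WHERE modified_by_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


delete_user(conn, 1)
remaining_users = [r[0] for r in conn.execute("SELECT id FROM users")]
print(naive_failed, remaining_users)
```

Keeping the cleanup and the user delete in one transaction means a failure in any step leaves the database unchanged.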
…uator-node-labels
…kflow-name-selectors
…maless-evaluators [fix] Resolve broken filtering for schemaless `evaluators`
feat(api): Self-serve account deletion from Settings
…-labels fix(frontend): label evaluator nodes in the playground by their entity name
fix(frontend): resolve evaluation names for SDK-created apps and new evaluators
Refactor/workflow name selectors
…sting-eval [FE / Feat] Add evaluators to existing evals
bekossy approved these changes on Jun 12, 2026
New version v0.103.3 in