Skip to content

Better Auth + dynamic scoped RBAC (single cohesive change)#848

Open
lane711 wants to merge 22 commits into
mainfrom
feature/better-auth-poc
Open

Better Auth + dynamic scoped RBAC (single cohesive change)#848
lane711 wants to merge 22 commits into
mainfrom
feature/better-auth-poc

Conversation

@lane711

@lane711 lane711 commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Replaces SonicJS's homegrown JWT/role auth with Better Auth (via the better-auth-cloudflare shim) and a dynamic, UI-editable, scoped RBAC system. These pieces are interdependent and must ship together — this PR bundles the whole set.

What's included (6 commits)

  1. feat: add better auth and dynamic RBAC — Better Auth mounted at /auth/* via the CF shim, reusing the existing users table (no FK rewrite). DB-backed sessions + KV secondary storage. Legacy PBKDF2 hashes verified and transparently upgraded to scrypt on first login. Dynamic RBAC data model (rbac_roles, rbac_verbs, rbac_role_grants, rbac_user_roles) + matrix admin UI at /admin/rbac.
  2. fix: give custom RBAC roles unique colors — per-role generated HSL colors in the matrix.
  3. fix: allow empty RBAC role comparison — deselecting all roles is preserved instead of re-selecting all.
  4. feat: gate backend access with RBAC/admin/* now requires portal:access (migration 039) instead of the legacy users.role; old role dropdown/filter UI removed from user create/edit/list.
  5. feat: add scoped RBAC permissions — grants gain none|own|any scope (migration 040). own enforces author_id ownership on content list/edit/update/delete/bulk and the JSON content API.
  6. feat: enforce RBAC across admin/plugin routes + add RBAC test coverage — migrate the remaining 41 user.role === 'admin' checks (15 files) to requireRbac/rbac.can; fix seed-admin to create the Better Auth credential account; add RBAC unit/integration/e2e tests; remove stale legacy-role tests.

Auth model

  • Login is email-based through Better Auth; sessions are DB-backed (KV-cached, ~7–19ms p99 with cache vs ~166ms without).
  • Portal entry = any assigned RBAC role granting portal:access (admin has *:manage).
  • Permissions = role × resource × verb × {none,own,any}; resources are the system set + one collection:<name> per collection, with *, collection:*, and manage wildcards.

Tests

  • Unit: RbacService wildcard/manage/scope semantics, role/verb mutation safety, getResources.
  • Middleware: requireRbac + portal:access (401/403/redirect, correct can() wiring).
  • Integration: content-API own-scope enforcement (non-owner blocked, any ignores ownership).
  • E2E: /admin/rbac matrix renders + save round-trip + admin column locked.
  • Full core suite green except 2 pre-existing failing files unrelated to this work (auth.test.ts requireAuth/optionalAuth mocks; one api-public-content-status test).

Known caveats (POC scope)

  • type-check still fails only on the pre-existing email-templates Zod validator types (untouched here).
  • CSRF middleware was made Better-Auth-session-aware; full CSRF enforcement hardening is a flagged follow-up.
  • Legacy compatibility kept intentionally: users.role column and the requireRole export (downstream API). Removal tracked separately (username cleanup filed as Remove vestigial username field from auth/users (collision footgun, no functional purpose) #847).
  • Social/2FA/magic-link not wired in this PR (shim path proven; deferred).

Breaking changes

  • JWT_SECRETBETTER_AUTH_SECRET + BETTER_AUTH_URL.
  • Backend access is now RBAC-gated (portal:access), not users.role.

🤖 Generated with Claude Code

lane711 and others added 6 commits June 2, 2026 17:46
- Add Better Auth integration, schema, migrations, and D1 local compatibility support
- Add dynamic RBAC roles, permissions matrix, user role assignment, and admin UI tabs
- Add side-by-side role comparison with per-role color coding and wildcard cascade behavior
- Update generated core dist artifacts and migration bundle

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Keep system role colors predictable
- Generate distinct HSL colors for custom roles in the comparison matrix
- Update generated core dist artifacts

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Preserve explicit empty role selection with compare=1
- Keep first RBAC page visit defaulting to all roles
- Preserve empty comparison state after permission save
- Update generated core dist artifacts

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Require portal:access for /admin routes instead of legacy users.role
- Replace remaining production requireRole gates with RBAC resource permissions
- Remove role dropdown/filter UI from user create, edit, and list pages
- Assign Better Auth, legacy auth, seed, invite, and manual users into RBAC roles
- Add portal access migration and generated core dist artifacts

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Add own/any permission scopes to RBAC grants
- Replace boolean permission matrix cells with scoped selectors
- Enforce own-content access using content.author_id on admin and JSON content routes
- Add migration for scoped grants and focused RBAC scope tests
- Update generated core migration bundle and dist artifacts

Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Migrate the legacy `user.role === 'admin'` access checks to dynamic RBAC so
the role/permission matrix actually governs the backend, and add the missing
test coverage for the RBAC engine.

- Replace 41 inline `user.role === 'admin'` checks across 15 route/plugin
  files with RBAC: plugins:manage, settings:manage, content:manage, and
  media owner-or-`media:* any` scope. Admin retains access via *:manage.
- Fix seed-admin: create the Better Auth credential `account` row so the
  seeded admin can sign in under Better Auth (previously rejected — the
  PBKDF2 users row had no matching account credential).
- Add tests: RbacService wildcard/manage/scope semantics, requireRbac +
  portal:access gate, content-API own-scope enforcement, and an e2e for
  the /admin/rbac matrix save round-trip.
- Remove the misnamed legacy `rbac-enforcement` test (tested old requireRole)
  and the vacuous role-pattern block in admin-content-preview; update the
  admin-plugins/admin-settings route tests to drive RBAC.

The users.role column and the requireRole export are kept as compatibility.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Users created outside the Better Auth sign-up flow (legacy /auth/register,
admin-created, or seeded users) have a users.password_hash but no Better Auth
`account` credential row, so /auth/login/form failed with "Credential account
not found". The browser login now detects this case, backfills the credential
account from the stored hash, and retries once; the existing PBKDF2->scrypt
verify hook then upgrades the legacy hash on the successful login.

Mirrors migration 037's one-time backfill and the seed-admin fix, but applies
lazily at login so it also covers users created after migration time.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the per-cell <select> in the /admin/rbac matrix with a 3-position
segmented pill switch (hidden radios + styled segments, white "thumb" on the
selected one). Form submission, the "All resources" cascade lock, and the
POST /admin/rbac/grants parser are unchanged — the radios carry the same
name/value/data-* attributes the <select> did, so only the rendering and the
cascade script (now driving radios) changed. Cells without Own support render
a 2-way None/Any switch; admin cells stay locked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Clear)

/admin/rbac was too wide with every role shown. Now:
- Default view shows only the first 3 roles (was: all). Clear/empty selection
  still respected.
- None/Own/Any segmented switch is stacked vertically (up/down) instead of
  horizontal, shrinking column width.
- Role names in the header are rotated to vertical writing-mode, further
  reducing per-column width.
- Added a "Clear" button to the right of the role selector that deselects all
  roles in one click (links to ?compare=1 with no roles).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The left nav previously rendered every section for every user; RBAC was only
enforced at the route level, so a restricted user saw links that just bounced.

Now an /admin/* middleware injects the signed-in user's effective permission
set (RbacService.permissionsForUser) into the page, the layout tags each nav
anchor with the permission that section needs (data-perm), and a small inline
script hides anchors the user lacks. Mapping: Collections/Settings/Cache ->
settings|collections:manage, Users -> users:manage, Plugins -> plugins:manage,
Content/Forms -> content:read, Media -> media:read; Dashboard always shown.

Presentation only — every route stays gated by requireRbac, and if the perm
set fails to inject the nav simply shows all items (no regression).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the labeled None/Own/Any segments with a compact vertical range: three
colored stops (Any=green, Own=amber, None=gray) stacked top→bottom (most→least
access), with no per-cell text. A color legend is shown once in the matrix
header. Hidden radios still carry none/own/any, so form submission, the cascade
lock, and the POST parser are unchanged; colors are applied via a scoped .scope
<style> block (inline styles can't target :checked).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e plan

Acts on the auth/RBAC deep-dive vs Mark's infowall fork. Front-loads the two
verified credential-correctness blockers and the secret hygiene before any
feature work, and lands the tracked convergence plan.

Phase 1 (credential correctness — BLOCKERS):
- Password reset and invitation-accept now write through to the Better Auth
  credential account (account.password) via ensureCredentialAccount, not just
  users.password_hash. BA verifies account.password, and the login self-heal
  only fires when no account row exists — so previously every reset/invite for
  an existing user was a GUARANTEED lockout. Verified end-to-end: post-reset the
  new password logs in and the old one is rejected.
- Scope the PBKDF2->scrypt upgrade UPDATE to provider_id='credential'. The full
  fix (key on cred-<userId>) is blocked by Better Auth's password.verify hook not
  exposing the user id; tracked as a follow-up. Random 16-byte salts make
  cross-account collision impractical today.

Phase 0 (secret hygiene):
- Remove the committed BETTER_AUTH_SECRET from wrangler.toml ([vars] and
  [env.preview.vars]); local dev reads it from a gitignored .dev.vars, prod/
  preview from `wrangler secret put`. The previously-committed value must be
  ROTATED (it is in git history).
- Hard-fail in createAuth() if BETTER_AUTH_SECRET is missing/short instead of
  silently signing sessions with an undefined secret.
- /auth/seed-admin no longer returns the password hash in JSON and is disabled
  when ENVIRONMENT=production.

Docs:
- docs/ai/plans/auth-rbac-convergence-plan.md — comparison chart + phased path
  forward (BA identity engine + dynamic RBAC, hardened to Mark's standard),
  reordered so credential-write-through and users.role single-source-of-truth
  are Phase-1 blockers ahead of feature work.

Patterns, hardening targets, and the lockout analysis are derived from Mark's
infowall auth/RBAC work; co-authored to credit that.

Co-Authored-By: Mark (Infowall) <security@infowall.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Makes rbac_user_roles the single source of truth for authorization and demotes
the legacy users.role column to a projection that can no longer diverge.

- RbacService.setUserRoles() (the one role-mutation chokepoint, used by the admin
  user create + edit routes) now recomputes users.role in the same D1 batch as a
  precedence-ordered projection of the assigned RBAC roles:
  admin > editor > author > viewer; custom-only roles project to 'viewer'; no
  roles projects to 'viewer'. Authorization never reads this string — it exists
  only for compat (older queries, session shape).
- The admin user-edit route's direct UPDATE users no longer writes `role`, so the
  projection in setUserRoles is the sole writer on that path. Creation paths
  (register/seed/BA-create-hook/invite) already write users.role and
  rbac_user_roles consistently, so no divergence path remains. Role deletion can't
  invalidate the projection (custom roles never become it; system roles can't be
  deleted).
- Tests: 3 unit tests (highest-precedence wins, custom-only -> viewer, empty ->
  viewer with role-clear). 24 rbac.test.ts tests pass; build clean.

Physically dropping users.role (mark's migration-117 model) is deferred until the
remaining read sites (/auth/me, /auth/refresh, session role) are migrated; the
column now stays as a safe read-through projection. Plan doc updated.

Co-Authored-By: Mark (Infowall) <security@infowall.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Locks in the Phase 1 fix so it can't silently regress. Registers a legacy user,
logs in once (asserting the login self-heal creates the Better Auth credential
account), resets the password, then asserts the NEW password logs in and the OLD
one is rejected. Before the fix, step 5 returned "Invalid email or password" —
a guaranteed lockout on every reset for an existing user.

Verified green against the local build (1 passed).

Co-Authored-By: Mark (Infowall) <security@infowall.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 3 (start) + completes Phase 1 credential correctness.

Self-lockout guard:
- RbacService.setUserRoles now refuses a change that would leave zero ACTIVE
  users holding BOTH an effective portal:access and rbac:manage grant (the
  minimum needed to recover from a lockout). New RbacService.countPortalAdmins()
  resolves this via grantMatches, so a renamed/custom Administrator role with
  *:manage still counts. The admin user-edit route surfaces the refusal as an
  inline error instead of 500. Unit-tested (blocks last admin; allows when
  another admin exists; allows when the edited user keeps portal+rbac).

Credential write-through (completes the reset-lockout class of bug):
- ensureCredentialAccount is now a shared AuthManager static (single impl);
  routes/auth.ts delegates to it.
- Admin "set user password" on the edit page and admin user-create now sync the
  Better Auth credential account (account.password). Previously an admin-set
  password locked the user out exactly like the reset bug, because BA verifies
  account.password and the login self-heal only fires when no account row exists.

Verified: 51 core auth/RBAC unit tests pass; admin login + /admin/users + /admin/rbac
+ /admin/dashboard all 200 on a local build.

Co-Authored-By: Mark (Infowall) <security@infowall.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Close the perimeter-only authz gap: every admin section now independently
enforces a requireRbac gate so a user with only portal:access cannot browse
to settings, logs, media, collections, forms, or plugins.

Route changes (all files: requireAuth stays, add requireRbac):
  admin-settings   — GET* settings:read, POST* settings:manage (inline checks removed)
  admin-plugins    — GET* plugins:manage, POST* plugins:manage (inline checks removed)
  admin-logs       — use* logs:read (covers GET + POST search/export/cleanup)
  admin-media      — GET* media:read, POST /upload media:create
  admin-collections — GET* collections:read (POST/PUT/DELETE already had manage)
  admin-forms      — GET* content:read, POST/PUT/DELETE* content:manage

Caching:
  RbacService: optional KVNamespace param, 60s TTL cache on permissionsForUser;
    setUserRoles busts rbac:perms:{userId} on role change.
  app.ts: stores perms in c.set('rbacPerms') before await next() so all
    subsequent requireRbac calls within the same request use the cached array
    (zero extra DB hits after the first permissionsForUser expansion).
  requireRbac: fast-path checks c.get('rbacPerms') before querying DB.
  admin-rbac /grants: busts all rbac:perms:* KV keys after matrix save.

RBAC resources: added 'logs' to SYSTEM_RESOURCES so logs:read appears in
  the permission matrix and in the pre-computed perms cache.

Tests: updated admin-settings and admin-plugins mocks to include requireRbac
  passthrough; updated admin-plugins deny assertions to match middleware
  error format ('Insufficient permissions' vs old inline 'Access denied').

E2E: added tests/e2e/65-rbac-deny-path.spec.ts — unauthenticated requests to
  all gated sections must redirect to /auth/login.

Result: 1496 tests pass (up from 1414 pre-baseline), 12 pre-existing failures
  in auth.test.ts and api-public-content-status.test.ts unchanged.

Co-Authored-By: MarkMac <mmcintosh@users.noreply.github.com>
Remove three dead legacy endpoints that are now fully superseded by
Better Auth's session-based flows:

  POST /auth/register  (JSON + JWT cookie)  → replaced by /auth/register/form
  POST /auth/login     (JSON + JWT cookie)  → replaced by /auth/login/form
  POST /auth/refresh   (JWT sliding window) → superseded by BA session renewal

These endpoints still generated HS256 JWT auth_token cookies, which the
app's session middleware no longer reads (c.get('user') is populated by
the BA session, not the JWT cookie). Removing them eliminates ~400 lines
of dead code and closes an authentication-confusion vector where a client
could hold a valid JWT that the server would never actually trust.

Also: POST /accept-invitation now signs in via BA (signInEmail) after
activating the user account, replacing the JWT auto-login. This is the
last path in auth.ts that was setting the auth_token JWT cookie.

Imports cleaned: removed getJwtRefreshGraceSecondsFromDb (only used by
the removed refresh endpoint).

AuthManager stays — still required for hashPassword (accept-invitation,
admin set-password) and ensureCredentialAccount.

Co-Authored-By: MarkMac <mmcintosh@users.noreply.github.com>
…eat-model tests

## Phase 4 remainder — BA-native login methods
auth/config.ts now enables:
  • magicLink() — BA magic-link passwordless auth; replaces standalone magic-link-auth
    plugin that minted JWT cookies. Email callback is console-logged in dev (TODO: wire
    to Sendgrid / email plugin in prod).
  • emailOTP() — BA email-OTP; replaces otp-login-plugin. Same 6-digit / 10-min TTL.
  • socialProviders — GitHub + Google activated when GITHUB_CLIENT_ID/SECRET and
    GOOGLE_CLIENT_ID/SECRET env vars are set; replaces bespoke oauth-providers plugin.
  app.ts: added GITHUB_CLIENT_ID, GITHUB_CLIENT_SECRET, GOOGLE_CLIENT_ID,
    GOOGLE_CLIENT_SECRET to Bindings so wrangler.toml can expose them.

## Phase 5 — Security hardening (porting Mark's discipline)
  CSRF: middleware/csrf.ts now prefers BETTER_AUTH_SECRET over JWT_SECRET (JWT_SECRET
    retained as legacy fallback while deployments rotate). Same change in setCsrfCookie
    in routes/auth.ts.
  Atomic token consume: POST /auth/reset-password rewrites from two-round-trip
    (SELECT + UPDATE) to a single atomic UPDATE keyed on the token value; checks
    meta.changes === 1. A concurrent second use returns changes=0 → 400.
  Password history: before hashing the new password, verifies it doesn't match any
    of the last 5 entries in password_history (PBKDF2 constant-time compare).
  Account lockout: POST /auth/login/form checks failed_login_count and locked_until
    before attempting BA sign-in. After LOCKOUT_THRESHOLD (5) failures the account is
    locked for 15 minutes. Locked accounts return the same "Invalid email or password"
    message as wrong-password to prevent enumeration. Successful login resets both
    counters. Also resets counters on password-reset (atomic UPDATE).
  Migration 041: adds failed_login_count + locked_until columns to users table with
    a partial index on locked_until.
  Threat-model test suite (12 tests in services/threat-model.test.ts):
    lockout counter/reset, locked=invalid-msg (anti-enum), atomic token consume
    semantics, expiry vs invalid same error, password history check, CSRF signing +
    tampering resistance, password hash/verify invariants.

## Phase 6 — BA plugin scaffolding
  twoFactor() plugin enabled — TOTP enrollment at /auth/two-factor/*; requires 2FA
    table (migration 042).
  organization() plugin enabled — multi-tenant teams at /auth/organization/*;
    requires org tables (migration 042).
  Migration 042: creates twoFactor, organization, member, invitation, team tables
    + adds users.two_factor_enabled column.

Tests: 1508 passing (up from 1496), 12 pre-existing failures unchanged.

Co-Authored-By: MarkMac <mmcintosh@users.noreply.github.com>
…recation

## Phase 4 — Deprecate legacy JWT-minting plugins
oauth-providers, magic-link-auth, and otp-login-plugin all minted JWT auth_token
cookies that SonicJS no longer reads (BA session middleware replaced them).
Added @deprecated JSDoc on each plugin with migration instructions:
  oauth-providers → GITHUB/GOOGLE_CLIENT_ID/SECRET env vars (BA socialProviders)
  magic-link-auth → BA magicLink() plugin at /auth/magic-link/*
  otp-login-plugin → BA emailOTP() plugin at /auth/email-otp/*
Plugins are NOT deleted — teams may have active users; they can be disabled
from /admin/plugins when ready to cut over.

## Phase 5 — Migration governance
Migration CI (9 tests in src/db/migrations-bundle.test.ts):
  • bundle count === .sql file count (catches generate-migrations.ts not being run)
  • every .sql filename appears in the bundle
  • no duplicate migration IDs
  • every entry has non-empty SQL
  • bundle SQL matches source file content byte-for-byte
  • IDs are monotonically increasing
  • no unguarded DROP TABLE (ALTER TABLE ADD COLUMN info-logged)
Removed dead-code stray migrations from src/db/migrations/ (never included
in the bundle; 0009_create_logging_tables.sql was a duplicate of 009_system_logging.sql;
0010_oauth_accounts.sql was for the now-deprecated oauth-providers plugin).

## Schema.ts reconciliation
Added missing columns to the Drizzle `users` table definition:
  passwordResetToken, passwordResetExpires (password-reset flow)
  invitationToken, invitedAt, acceptedInvitationAt (invitation flow)
  failedLoginCount, lockedUntil (migration 041 lockout)
  twoFactorEnabled (migration 042 / BA twoFactor plugin)

Tests: 1517 passing (up from 1508 pre-session), 12 pre-existing failures unchanged.

Co-Authored-By: MarkMac <mmcintosh@users.noreply.github.com>
Item 1 — rotate dev secret (.dev.vars updated locally; gitignored so not committed).
  Prod rotation: run `wrangler secret put BETTER_AUTH_SECRET`.

Item 2 — sendMagicLink / sendVerificationOTP now call sendViaEmailPlugin().
  Helper loads email plugin settings (apiKey/fromEmail/fromName) from the
  plugins table at runtime and sends via Resend API. Falls back to console.log
  when the email plugin is unconfigured (dev / CI).

Item 3 — remove deprecated legacy JWT plugin registrations from app.ts:
  - otpLoginPlugin (replaced by BA emailOTP routes at /auth/email-otp/*)
  - oauthProvidersPlugin (replaced by BA socialProviders)
  - createMagicLinkAuthPlugin (replaced by BA magicLink at /auth/magic-link/*)
  Removed imports + route registration blocks. Plugin source files remain for
  teams mid-cutover; uninstall from /admin/plugins when ready.

Co-Authored-By: MarkMac <mmcintosh@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant