
perf(for-you): cap my_saved_artists to 200 most-recent #805

Merged
dylanjeffers merged 1 commit into main from perf/cap-my-saved-artists on May 13, 2026

Conversation

@dylanjeffers
Contributor

Summary

The similar_artists CTE in /v1/users/{id}/feed/for-you self-joins saves against the my_saved_artists set. For long-tenure power users (e.g. an account with thousands of saved artists) the join blows up and the query times out (observed in prod: the request hangs >60s and returns nothing).

Cap my_saved_artists to the 200 most-recently saved artists. Recency is the right axis to cut on:

  • Old saves are weak signal of current taste anyway
  • 200 artists still gives the collaborative-filter step plenty of surface to find similar-artist candidates
  • Bounds the saves self-join cost to a predictable size regardless of user tenure

Change

```sql
-- Before
my_saved_artists AS (
    SELECT DISTINCT t.owner_id AS artist_id
    FROM my_saved_tracks mst
    JOIN tracks t ON t.track_id = mst.track_id
),

-- After
my_saved_artists AS (
    SELECT t.owner_id AS artist_id, MAX(s.created_at) AS last_saved_at
    FROM saves s
    JOIN tracks t ON t.track_id = s.save_item_id
    WHERE s.user_id = @userId
      AND s.save_type = 'track'
      AND s.is_current = true
      AND s.is_delete = false
    GROUP BY t.owner_id
    ORDER BY last_saved_at DESC
    LIMIT 200
),
```

The artist_id column that downstream consumers depend on is preserved (last_saved_at is only used for ordering), so the IN (SELECT artist_id FROM my_saved_artists) and NOT IN (...) consumers in similar_artists are unchanged.
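
For reference, a minimal sketch of the consuming CTE, since it isn't in this diff. The co-save join and scoring below are assumptions about how similar_artists works; only the IN / NOT IN references to my_saved_artists come from the actual query:

```sql
-- Illustrative only: the real similar_artists CTE isn't shown in this PR.
-- The IN / NOT IN usage of my_saved_artists mirrors the actual query;
-- the co-save join and scoring are assumed.
similar_artists AS (
    SELECT t2.owner_id AS artist_id, COUNT(*) AS co_saves
    FROM saves s1
    JOIN tracks t1 ON t1.track_id = s1.save_item_id
    JOIN saves s2 ON s2.user_id = s1.user_id        -- same saver, other saves
    JOIN tracks t2 ON t2.track_id = s2.save_item_id
    WHERE t1.owner_id IN (SELECT artist_id FROM my_saved_artists)
      AND t2.owner_id NOT IN (SELECT artist_id FROM my_saved_artists)
    GROUP BY t2.owner_id
    ORDER BY co_saves DESC
)
```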

Test plan

  • ✅ All 9 existing TestV1FeedForYou_* tests pass locally against the test DB (fixtures have <200 saved artists, so the cap doesn't kick in and behavior is identical to before for the test cases).
  • `go build ./api/...` / `go vet ./api/...` clean.
  • After deploy: re-curl /v1/users/eYZmn/feed/for-you?user_id=eYZmn&limit=5 (notjulian, deep save history) — should return 200 in <2s instead of hanging.
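
For spot-checking whether the cap actually bites for a given account, a query of roughly this shape works; it assumes the same saves/tracks schema used in the CTE above, and @userId stands in for the account's numeric id (not shown in this PR):

```sql
-- Rough check: how many distinct artists does this user have current track saves for?
-- If the count exceeds 200, the new LIMIT changes which artists seed similar_artists.
SELECT COUNT(DISTINCT t.owner_id) AS saved_artist_count
FROM saves s
JOIN tracks t ON t.track_id = s.save_item_id
WHERE s.user_id = @userId
  AND s.save_type = 'track'
  AND s.is_current = true
  AND s.is_delete = false;
```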

Follow-ups not in this PR

  • Worth adding EXPLAIN ANALYZE instrumentation behind a flag to catch this class of regression earlier next time (a rough sketch of the statement shape is below).
  • The my_artist_affinity CTE also unions saves + reposts + plays for the user with no bound; it's likely the next-slowest piece for power users. Worth a similar cap in a follow-up if this fix doesn't fully resolve the timeout.
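
For the instrumentation idea, here is roughly what running the capped CTE under EXPLAIN could look like. The flag wiring on the Go side isn't shown; this is just the statement shape, reusing the CTE from this PR:

```sql
-- Sketch only: run the feed SQL under EXPLAIN when a debug flag is set,
-- so plan regressions for power users surface in logs instead of timeouts.
-- @userId is the app's named parameter; substitute a literal when running by hand.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
WITH my_saved_artists AS (
    SELECT t.owner_id AS artist_id, MAX(s.created_at) AS last_saved_at
    FROM saves s
    JOIN tracks t ON t.track_id = s.save_item_id
    WHERE s.user_id = @userId
      AND s.save_type = 'track'
      AND s.is_current = true
      AND s.is_delete = false
    GROUP BY t.owner_id
    ORDER BY last_saved_at DESC
    LIMIT 200
)
SELECT * FROM my_saved_artists;
```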

🤖 Generated with Claude Code

The similar_artists CTE in /v1/users/{id}/feed/for-you self-joins
saves against my_saved_artists. For long-tenure users with thousands
of saved artists, my_saved_artists explodes and the query times out
(observed: prod request hangs >60s for users with deep save history).

Replace the unbounded DISTINCT with a GROUP BY on owner_id ordered by
the most-recent save (last_saved_at), capped at 200. Recency is the
right axis: old saves are a weaker signal of current taste anyway, and
200 artists still gives the collaborative-filter step enough surface
to find similar-artist candidates.

The shape of the CTE (column `artist_id`) is preserved, so downstream
consumers `t1.owner_id IN (SELECT artist_id FROM my_saved_artists)`
and the `NOT IN` exclusion in similar_artists are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dylanjeffers dylanjeffers merged commit 5e3eb1e into main May 13, 2026
4 checks passed
@dylanjeffers dylanjeffers deleted the perf/cap-my-saved-artists branch May 13, 2026 03:45
dylanjeffers added a commit that referenced this pull request May 13, 2026
## Summary

Builds on #805 (which capped `my_saved_artists`). After that fix the
endpoint still times out on prod for power users; the remaining
unbounded CTEs scan the user's full history on every request:

1. **`my_artist_affinity`** unions saves + reposts + plays for the
caller. `plays` is by far the biggest table: a heavy listener can
have hundreds of thousands of rows, all scanned on every request. Cap
each source to the most recent N: **200 saves, 200 reposts, 500 plays**.

2. **`follow_set`** is every user the caller follows; for a power user
with thousands of follows this becomes a wide hash join against every
recent-track upload. Cap to the **500 most-recently followed**.

Recency is the right axis on all three: old engagement is a weak signal
of current taste, and the bounds match the magnitude of the hidden cost
(plays >> saves ≈ reposts).

## Diff (CTEs)

```sql
follow_set AS (
    SELECT followee_user_id AS user_id FROM follows
    WHERE follower_user_id = @userid AND is_current AND NOT is_delete
    ORDER BY created_at DESC LIMIT 500    -- new
),
my_artist_affinity AS (
    SELECT t.owner_id, LN(1 + COUNT(*)) AS affinity
    FROM (
        (SELECT save_item_id ... ORDER BY created_at DESC LIMIT 200)   -- new
        UNION ALL
        (SELECT repost_item_id ... ORDER BY created_at DESC LIMIT 200) -- new
        UNION ALL
        (SELECT play_item_id ... ORDER BY created_at DESC LIMIT 500)   -- new
    ) eng JOIN tracks t ON ... GROUP BY t.owner_id
),
```
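
As a sanity check on the caps against the `LN(1 + COUNT(*))` weighting (this is just arithmetic on the limits above, not a measurement): a user contributes at most 200 + 200 + 500 = 900 engagement rows, so a single artist's affinity runs from LN(2) ≈ 0.69 for one interaction up to LN(901) ≈ 6.8 if every capped row points at the same artist. The score stays in a narrow, log-dampened band regardless of user tenure, which is the point of bounding the inputs.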

## Test plan

- ✅ All 9 `TestV1FeedForYou_*` tests pass locally. Fixtures have <200
saves / <200 reposts / <500 plays / <500 follows, so the caps don't kick
in and observable behavior is unchanged.
- ✅ `go build ./api/...` / `go vet ./api/...` clean.
- After deploy: `/v1/users/eYZmn/feed/for-you?user_id=eYZmn&limit=5`
(notjulian, deep history) currently times out at the Cloudflare
upstream (>120s). Target: <2s.

## Follow-ups

Parallel EXPLAIN ANALYZE work is happening to verify the bounds shift the
cost as expected and to flag any missing indexes.
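
If that analysis does flag a missing index, the likely shape is a partial index per engagement table. A hypothetical example for saves follows; the index name, column order, and predicate are assumptions, not anything the prod plans have confirmed:

```sql
-- Hypothetical index for the "most-recent current saves for a user" scan;
-- not validated against the prod EXPLAIN output.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_saves_user_recent
    ON saves (user_id, created_at DESC)
    WHERE is_current AND NOT is_delete;
```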

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dylanjeffers added a commit that referenced this pull request May 13, 2026
## Summary

Retiring the dedicated For You feed endpoint. The clients are being
switched to use `/v1/users/{id}/recommended-tracks` instead, the same
endpoint that already powers the Explore page's For You section and
works fine in production. See companion PR: AudiusProject/apps#14301.

## Why

The custom `/feed/for-you` endpoint had repeated issues since it
shipped:

* **Auth gate bug** (fixed in #804): the global authMiddleware rejected
unsigned `user_id` requests, making the endpoint unreachable from the
web RC.
* **Perf**: even after #805 and #806 capped the `my_saved_artists`,
`my_artist_affinity`, and `follow_set` CTEs, EXPLAIN on prod showed
the `similar_artists` self-join still produced a 301M-row merge for
power users (and a fixed ~12s `track_trending_scores` scan for *every*
user due to a missing partial index). The endpoint never reliably
completed within Cloudflare's 100s upstream limit for power users.
* **Duplication**: the response shape (a ranked track list for the
signed-in user) is already what `/recommended-tracks` returns. Two
endpoints solving the same problem aren't worth maintaining.

Consolidating on the working endpoint is simpler than continuing to
optimize the custom one.

## Removed

| File | What |
|---|---|
| `api/v1_users_feed_for_you.go` | Handler + the 200-row candidate-pool SQL (4 candidate sources, similar_artists CF, diversity pass) |
| `api/v1_users_feed_for_you_test.go` | 9 unit tests |
| `api/server.go` (1 line) | Route registration |
| `api/auth_middleware.go` (~10 lines) | The `/feed/for-you` exemption from #804 (no longer needed) |
| `api/swagger/swagger-v1.yaml` (~70 lines) | The endpoint's swagger entry |

## Test plan

- ✅ `go build ./api/...` clean
- ✅ `go vet ./api/...` clean
- ✅ All remaining `TestV1UsersFeed*` / `TestAuth*` tests pass locally
- After merge + deploy + AudiusProject/apps#14301 deploy: the Feed → For You
tab on the web RC should show the same recommended tracks as Explore's
For You section.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>