From 2a98097a623d05c2f4958149c8664a4d1132ae60 Mon Sep 17 00:00:00 2001 From: JOY Date: Tue, 26 May 2026 12:10:01 +0700 Subject: [PATCH] docs: align NPC voice lane with Ida Faber pipeline --- .claude/CLAUDE.md | 32 ++++--- AGENTS.md | 32 ++++--- .../Scripts/Settings/SecondSpawnConfig.cs | 2 +- docs/ARCHITECTURE.md | 12 ++- docs/adr/0003-llm-safety-architecture.md | 2 +- docs/adr/0005-unity-6-5-beta.md | 4 +- docs/design/00-game-concept.md | 4 +- docs/design/01-pillars.md | 4 +- docs/design/02-vertical-slice-spec.md | 4 +- docs/design/03-systems-index.md | 10 +- docs/design/05-networking-architecture.md | 4 +- docs/design/06-overview-design.md | 4 +- .../design/11-npc-agent-brain-architecture.md | 2 +- docs/design/12-game-design-document.md | 4 +- .../13-human-believable-npc-agent-model.md | 2 +- ...16-npc-society-multi-agent-architecture.md | 95 ++++++++++--------- ...-openclaw-agent-connection-architecture.md | 2 +- docs/design/36-ai-npc-research-anchor-map.md | 6 +- .../37-ai-npc-backend-client-roadmap.md | 8 +- .../38-alpha-design-decision-register.md | 2 +- ...ed-npc-dialogue-portrait-lipsync-design.md | 42 ++++---- docs/setup/agent-handoff.md | 4 +- docs/setup/github-project-tracking.md | 4 +- docs/setup/unity-conventions.md | 2 +- 24 files changed, 147 insertions(+), 140 deletions(-) diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 9b1d9ea9..14e8360b 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -78,7 +78,7 @@ Cultivation Master mechanics without a new approved design update. ### Engine -- Unity 6.5 beta (currently `6000.5.0b9`) + URP. JOY explicitly chose beta over Unity 6.0 LTS for newest features; accept risk of breaking changes between beta builds and that some 3rd-party assets (Opsive UCC, Behavior Designer, Convai) may not be tested against this version yet. Re-evaluate if beta blocks progress. +- Unity 6.5 beta (currently `6000.5.0b9`) + URP. JOY explicitly chose beta over Unity 6.0 LTS for newest features; accept risk of breaking changes between beta builds and that some 3rd-party assets (Opsive UCC, Behavior Designer, Ida Faber character packs) may not be tested against this version yet. Re-evaluate if beta blocks progress. - Force Text serialization (default Unity 6 - DO NOT change) ### Networking @@ -92,7 +92,7 @@ Cultivation Master mechanics without a new approved design update. - Opsive Ultimate Character Controller (ARPG character + combat) - Behavior Designer (NPC combat behavior tree, NOT dialogue AI) -- Convai (NPC dialogue - phase 1 only, migrate to custom LLM phase 2) +- Ida Faber character packs (semi-real character bodies, Unity Humanoid retargeting, and ARKit 52 blendshape candidates) - TextMeshPro (UI) - Unity ML-Agents (deferred, future research) @@ -108,20 +108,23 @@ Cultivation Master mechanics without a new approved design update. - **`api.dos.ai` model service** (shared AI service; provider keys, model routing, prompt safety, voice token minting, AI-specific endpoints only) - **Redis** (session, rate limit, transient cache) -### LLM +### LLM And NPC Voice **Phase 1 (MVP):** -- Convai SDK in Unity for NPC dialogue -- Limit: no full custom LLM control, accept Convai cost +- Custom NPC dialogue through Nakama or the dedicated server to `api.dos.ai`. +- Text-first focused dialogue and ambient NPC speech. +- Unity client never embeds an LLM or voice vendor SDK for authoritative dialogue. +- Voice, TTS, and facial animation are optional presentation layers behind scoped server-minted sessions, not dialogue dependencies. -**Phase 2 (post-MVP):** +**Phase 2 (post-MVP / advanced presentation):** -- Migrate LLM calls to `api.dos.ai`, models: +- Route model calls inside `api.dos.ai`, models: - Haiku 4.5 for NPC chat (fast, cheap) - Sonnet 4.6 for boss / quest-critical dialog - RAG memory: Supabase pgvector or Qdrant -- Voice: OpenAI Realtime API via ephemeral token (NOT API key in client) OR ElevenLabs +- Voice: custom `api.dos.ai` voice session lane using OpenAI Realtime, ElevenLabs, or self-hosted TTS via ephemeral token only (NOT API key in client) +- Facial animation: custom Ida Faber / ARKit-style blendshape profiles after verifying the imported Unity `SkinnedMeshRenderer` blendshape names - Client AI: Unity Sentis for small perception (optional, phase 3) ### OpenClaw-Connected NPCs (CONCEPT) @@ -187,7 +190,7 @@ Cultivation Master mechanics without a new approved design update. - **Game server:** Linux headless Unity build on Hetzner VPS, Dockerized - **Nakama backend:** self-hosted OSS first; Heroic Cloud only if operations become worth paying for - **AI model service:** `api.dos.ai` -- **LLM API:** Convai phase 1, then Anthropic + OpenAI phase 2 +- **LLM API:** `api.dos.ai` model service, with Anthropic / OpenAI / future providers hidden behind server-side routing - **Monitoring:** Sentry (error) + Grafana (metrics) ### Testing @@ -250,7 +253,7 @@ Cultivation Master mechanics without a new approved design update. owner. 6. Only one agent may mutate Unity scenes, prefabs, package imports, or project settings at a time. Read-only inspection by the next agent is OK after the previous agent reports dirty files and current console state. 7. Every agent switch must leave a handoff using `docs/setup/agent-handoff.md`. -8. Unity package imports happen one package per commit, in this order unless JOY changes it: Opsive Ultimate Character Controller, Behavior Designer, Convai. +8. Unity package imports happen one package per commit unless JOY changes it. Current priority order: Opsive Ultimate Character Controller, Behavior Designer, Ida Faber character packs. 9. Before claiming Unity work is complete, check the Unity console and active scene through MCP or the Editor. 10. Significant commits still require independent reviewer pass per Hard Rule #7. @@ -303,7 +306,8 @@ Cultivation Master mechanics without a new approved design update. - Photon Fusion Dedicated Server overview - Opsive Ultimate Character Controller getting started - Behavior Designer manual -- Convai Unity SDK docs +- Ida Faber Unity retargeting docs +- Ida Faber blendshapes and facial animation docs - Unity Multiplayer Play Mode tutorial - Coplay unity-mcp + Claude Code setup guide - `api.dos.ai` model-service contract patterns from JOY's existing DOS.AI stack @@ -383,7 +387,7 @@ Scope: - 1 small Town or Relay Yard with at least one useful facility or decor/care interaction - 1 dungeon instance -- 1 boss with LLM dialogue (Convai) +- 1 boss with custom LLM dialogue through `api.dos.ai` - 1 questline (3-5 quests) - Reinhabitation MVP (die -> spend test SECOND -> inhabit another eligible body with selected reset) @@ -407,7 +411,7 @@ OUT of scope for vertical slice: 1. **NEVER copy MetaDOS gameplay code.** Extract patterns only. Reference path: `D:\Projects\MetaDOS` (read-only). 2. **NEVER let LLM mutate authoritative game state.** Server validates all intent. -3. **NEVER put API keys (Anthropic, OpenAI, Convai, ElevenLabs) in Unity client.** All LLM calls go through Nakama or the dedicated server to `api.dos.ai`. +3. **NEVER put API keys (Anthropic, OpenAI, ElevenLabs, or any voice/model provider) in Unity client.** All LLM calls go through Nakama or the dedicated server to `api.dos.ai`. 4. **NEVER use Host Mode for production.** Server Mode dedicated only. 5. **NEVER add or replace backend / auth / social stack without an ADR and JOY approval.** Nakama OSS is the accepted game backend baseline per ADR 0010. Heroic Cloud, Hiro, Satori, OpenAuth, PlayFab, AccelByte, or a Supabase-first rollback require a new ADR. 6. **NEVER change Unity Asset Serialization away from Force Text.** Breaks LFS + diff. @@ -422,7 +426,7 @@ OUT of scope for vertical slice: - Advanced body or soul progression replacing the deferred concept - Hunter NFT integration approach: Option 1 (preset hero) vs Hybrid 1+3 (modular pieces) - Phase 2 LLM model split (when to use Haiku vs Sonnet) -- Voice NPC vendor (OpenAI Realtime vs ElevenLabs vs self-host) +- Voice NPC vendor behind `api.dos.ai` (OpenAI Realtime vs ElevenLabs vs self-host) - Backend deployment path: self-hosted Nakama OSS vs Heroic Cloud later. Hiro / Satori require license and pricing review before adoption. - Dedicated server hosting (Hetzner specs, region) - Photon Fusion 2 license tier when scaling beyond Cloud free 20 CCU diff --git a/AGENTS.md b/AGENTS.md index 6596b8fe..6d66fa59 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -78,7 +78,7 @@ Cultivation Master mechanics without a new approved design update. ### Engine -- Unity 6.5 beta (currently `6000.5.0b9`) + URP. JOY explicitly chose beta over Unity 6.0 LTS for newest features; accept risk of breaking changes between beta builds and that some 3rd-party assets (Opsive UCC, Behavior Designer, Convai) may not be tested against this version yet. Re-evaluate if beta blocks progress. +- Unity 6.5 beta (currently `6000.5.0b9`) + URP. JOY explicitly chose beta over Unity 6.0 LTS for newest features; accept risk of breaking changes between beta builds and that some 3rd-party assets (Opsive UCC, Behavior Designer, Ida Faber character packs) may not be tested against this version yet. Re-evaluate if beta blocks progress. - Force Text serialization (default Unity 6 - DO NOT change) ### Networking @@ -92,7 +92,7 @@ Cultivation Master mechanics without a new approved design update. - Opsive Ultimate Character Controller (ARPG character + combat) - Behavior Designer (NPC combat behavior tree, NOT dialogue AI) -- Convai (NPC dialogue - phase 1 only, migrate to custom LLM phase 2) +- Ida Faber character packs (semi-real character bodies, Unity Humanoid retargeting, and ARKit 52 blendshape candidates) - TextMeshPro (UI) - Unity ML-Agents (deferred, future research) @@ -108,20 +108,23 @@ Cultivation Master mechanics without a new approved design update. - **`api.dos.ai` model service** (shared AI service; provider keys, model routing, prompt safety, voice token minting, AI-specific endpoints only) - **Redis** (session, rate limit, transient cache) -### LLM +### LLM And NPC Voice **Phase 1 (MVP):** -- Convai SDK in Unity for NPC dialogue -- Limit: no full custom LLM control, accept Convai cost +- Custom NPC dialogue through Nakama or the dedicated server to `api.dos.ai`. +- Text-first focused dialogue and ambient NPC speech. +- Unity client never embeds an LLM or voice vendor SDK for authoritative dialogue. +- Voice, TTS, and facial animation are optional presentation layers behind scoped server-minted sessions, not dialogue dependencies. -**Phase 2 (post-MVP):** +**Phase 2 (post-MVP / advanced presentation):** -- Migrate LLM calls to `api.dos.ai`, models: +- Route model calls inside `api.dos.ai`, models: - Haiku 4.5 for NPC chat (fast, cheap) - Sonnet 4.6 for boss / quest-critical dialog - RAG memory: Supabase pgvector or Qdrant -- Voice: OpenAI Realtime API via ephemeral token (NOT API key in client) OR ElevenLabs +- Voice: custom `api.dos.ai` voice session lane using OpenAI Realtime, ElevenLabs, or self-hosted TTS via ephemeral token only (NOT API key in client) +- Facial animation: custom Ida Faber / ARKit-style blendshape profiles after verifying the imported Unity `SkinnedMeshRenderer` blendshape names - Client AI: Unity Sentis for small perception (optional, phase 3) ### OpenClaw-Connected NPCs (CONCEPT) @@ -187,7 +190,7 @@ Cultivation Master mechanics without a new approved design update. - **Game server:** Linux headless Unity build on Hetzner VPS, Dockerized - **Nakama backend:** self-hosted OSS first; Heroic Cloud only if operations become worth paying for - **AI model service:** `api.dos.ai` -- **LLM API:** Convai phase 1, then Anthropic + OpenAI phase 2 +- **LLM API:** `api.dos.ai` model service, with Anthropic / OpenAI / future providers hidden behind server-side routing - **Monitoring:** Sentry (error) + Grafana (metrics) ### Testing @@ -250,7 +253,7 @@ Cultivation Master mechanics without a new approved design update. owner. 6. Only one agent may mutate Unity scenes, prefabs, package imports, or project settings at a time. Read-only inspection by the next agent is OK after the previous agent reports dirty files and current console state. 7. Every agent switch must leave a handoff using `docs/setup/agent-handoff.md`. -8. Unity package imports happen one package per commit, in this order unless JOY changes it: Opsive Ultimate Character Controller, Behavior Designer, Convai. +8. Unity package imports happen one package per commit unless JOY changes it. Current priority order: Opsive Ultimate Character Controller, Behavior Designer, Ida Faber character packs. 9. Before claiming Unity work is complete, check the Unity console and active scene through MCP or the Editor. 10. Significant commits still require independent reviewer pass per Hard Rule #7. @@ -303,7 +306,8 @@ Cultivation Master mechanics without a new approved design update. - Photon Fusion Dedicated Server overview - Opsive Ultimate Character Controller getting started - Behavior Designer manual -- Convai Unity SDK docs +- Ida Faber Unity retargeting docs +- Ida Faber blendshapes and facial animation docs - Unity Multiplayer Play Mode tutorial - Coplay unity-mcp + Claude Code setup guide - `api.dos.ai` model-service contract patterns from JOY's existing DOS.AI stack @@ -383,7 +387,7 @@ Scope: - 1 small Town or Relay Yard with at least one useful facility or decor/care interaction - 1 dungeon instance -- 1 boss with LLM dialogue (Convai) +- 1 boss with custom LLM dialogue through `api.dos.ai` - 1 questline (3-5 quests) - Reinhabitation MVP (die -> spend test SECOND -> inhabit another eligible body with selected reset) @@ -407,7 +411,7 @@ OUT of scope for vertical slice: 1. **NEVER copy MetaDOS gameplay code.** Extract patterns only. Reference path: `D:\Projects\MetaDOS` (read-only). 2. **NEVER let LLM mutate authoritative game state.** Server validates all intent. -3. **NEVER put API keys (Anthropic, OpenAI, Convai, ElevenLabs) in Unity client.** All LLM calls go through Nakama or the dedicated server to `api.dos.ai`. +3. **NEVER put API keys (Anthropic, OpenAI, ElevenLabs, or any voice/model provider) in Unity client.** All LLM calls go through Nakama or the dedicated server to `api.dos.ai`. 4. **NEVER use Host Mode for production.** Server Mode dedicated only. 5. **NEVER add or replace backend / auth / social stack without an ADR and JOY approval.** Nakama OSS is the accepted game backend baseline per ADR 0010. Heroic Cloud, Hiro, Satori, OpenAuth, PlayFab, AccelByte, or a Supabase-first rollback require a new ADR. 6. **NEVER change Unity Asset Serialization away from Force Text.** Breaks LFS + diff. @@ -422,7 +426,7 @@ OUT of scope for vertical slice: - Advanced body or soul progression replacing the deferred concept - Hunter NFT integration approach: Option 1 (preset hero) vs Hybrid 1+3 (modular pieces) - Phase 2 LLM model split (when to use Haiku vs Sonnet) -- Voice NPC vendor (OpenAI Realtime vs ElevenLabs vs self-host) +- Voice NPC vendor behind `api.dos.ai` (OpenAI Realtime vs ElevenLabs vs self-host) - Backend deployment path: self-hosted Nakama OSS vs Heroic Cloud later. Hiro / Satori require license and pricing review before adoption. - Dedicated server hosting (Hetzner specs, region) - Photon Fusion 2 license tier when scaling beyond Cloud free 20 CCU diff --git a/Unity/Assets/_SecondSpawn/Scripts/Settings/SecondSpawnConfig.cs b/Unity/Assets/_SecondSpawn/Scripts/Settings/SecondSpawnConfig.cs index 3de1bdb4..c5e0284f 100644 --- a/Unity/Assets/_SecondSpawn/Scripts/Settings/SecondSpawnConfig.cs +++ b/Unity/Assets/_SecondSpawn/Scripts/Settings/SecondSpawnConfig.cs @@ -12,7 +12,7 @@ namespace SecondSpawn.Settings /// - Per-environment toggles (dev / staging / prod) /// /// What does NOT live here (Hard Rule #3 in CLAUDE.md / AGENTS.md): - /// - Anthropic / OpenAI / Convai API keys + /// - Anthropic / OpenAI / TTS provider API keys /// - Supabase service role key /// - thirdweb secret key /// - DOS Chain signing keys diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 7452d4a1..97e27d78 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -25,8 +25,8 @@ High-level architecture overview. For detailed component design see `docs/design v v +---------------+ +----------------------------+ | Nakama OSS | | api.dos.ai model service | - | - Game APIs | | - Convai (phase 1) | - | - Social | | - Anthropic + OpenAI (P2) | + | - Game APIs | | - Anthropic + OpenAI | + | - Social | | - Voice session minting | | - Storage | | - RAG memory retrieval | | - Postgres | | - AI rate limit + safety | +---------------+ +-------------+--------------+ @@ -376,8 +376,10 @@ future TTS. It is disabled by default, mints only short-lived `api.dos.ai` playback material when configured, and returns a text fallback reason when voice is unavailable. The session response is presentation-only: it cannot write memory, relationship, quest, TIME, SECOND, inventory, combat, or body -lifecycle state. Convai remains an isolated phase 1 spike lane for one boss or -hub NPC until cost, latency, reliability, and stable voice identity are proven. +lifecycle state. The facial-animation path is custom: Unity should map text, +audio amplitude, or future viseme data into character-specific profiles such as +Ida Faber / ARKit-style blendshape profiles only after the imported mesh names +are verified. #### NPC Society Event Path @@ -571,7 +573,7 @@ quest state directly. ### `api.dos.ai` Model Service - All LLM calls go through here -- Multi-provider routing (Convai phase 1, Anthropic + OpenAI phase 2) +- Multi-provider routing for Anthropic, OpenAI, TTS, and future providers - Experimental role-play providers such as Alibaba Qwen-Character are evaluated only as replaceable adapters behind `api.dos.ai`; see [design/52-llm-role-play-provider-evaluation.md](design/52-llm-role-play-provider-evaluation.md). diff --git a/docs/adr/0003-llm-safety-architecture.md b/docs/adr/0003-llm-safety-architecture.md index db8659be..4ab98fdc 100644 --- a/docs/adr/0003-llm-safety-architecture.md +++ b/docs/adr/0003-llm-safety-architecture.md @@ -11,7 +11,7 @@ Game has LLM-driven NPCs and an LLM-driven AI agent that controls the player's c ## Decision 1. **All LLM calls go through Nakama or the dedicated server to `api.dos.ai`, never directly from Unity client.** -2. **API keys (Anthropic, OpenAI, Convai, ElevenLabs) live only in server env.** Never in client, never reachable from client. +2. **API keys (Anthropic, OpenAI, ElevenLabs, or any model/voice provider) live only in server env.** Never in client, never reachable from client. 3. **LLM output is parsed into structured intent (JSON schema enforced).** Free-form text is only for display. 4. **Server validates every intent before applying state changes.** LLM cannot grant items, gold, XP, or mutate progression directly. 5. **Rate limit per player + per NPC.** Token budget cap, request count cap. diff --git a/docs/adr/0005-unity-6-5-beta.md b/docs/adr/0005-unity-6-5-beta.md index f7116746..727f70cc 100644 --- a/docs/adr/0005-unity-6-5-beta.md +++ b/docs/adr/0005-unity-6-5-beta.md @@ -20,7 +20,7 @@ Two options were on the table: - **Option A: Roll back to Unity 6.0 LTS.** Stable, supported, all 3rd-party assets (Opsive Ultimate Character Controller, Behavior - Designer, Convai, Photon Fusion 2) are tested against this version. + Designer, Ida Faber character packs, Photon Fusion 2) are tested against this version. Predictable for a 3-6 month vertical slice timeline. Ecosystem packages (URP 17.x, Input System) are stable on 6.0. - **Option B: Stay on Unity 6.5 beta.** Newer features, @@ -57,7 +57,7 @@ Per JOY's reasoning at decision time: This is a mutable decision. Re-evaluate if any of the following: -- A 3rd-party asset (Opsive UCC, Behavior Designer, Convai, Photon +- A 3rd-party asset (Opsive UCC, Behavior Designer, Ida Faber character packs, Photon Fusion 2) breaks on a beta build update - A specific Unity 6.5 feature is the reason to stay (we should name it here when that becomes true; today it is "newest features in general") diff --git a/docs/design/00-game-concept.md b/docs/design/00-game-concept.md index 476a2a06..d8022831 100644 --- a/docs/design/00-game-concept.md +++ b/docs/design/00-game-concept.md @@ -169,7 +169,7 @@ Complete one Gate run or Town objective chain, update relationships and memories | **Engine** | Unity 6.5 beta (currently `6000.5.0b9`) + URP. JOY chose beta for newest features. | | **Networking** | Photon Fusion 2 (Server Mode dedicated for production; Host Mode + Photon Cloud free 20 CCU for dev) | | **Persistence** | Nakama OSS + Postgres (profile, inventory, quest, NFT lock state, level/stats) | -| **LLM** | Convai phase 1 (NPC dialogue) -> `api.dos.ai` / api.dos.ai model service phase 2 (Haiku 4.5 for NPC chat, Sonnet 4.6 for boss / quest-critical NPCs). Server-side intent validation only. | +| **LLM** | Custom NPC dialogue through `api.dos.ai` / api.dos.ai model service (Haiku 4.5 for NPC chat, Sonnet 4.6 for boss / quest-critical NPCs). Server-side intent validation only. | | **NFT** | DOS Chain via thirdweb-api MCP. Wallet auth, escrow contracts, skin / weapon / pet inventory if kept in scope. | | **Art** | Anime-Ready Semi-Real with clean / controlled PBR as the target direction; prototype assets are scaffolding until look-dev proves the recipe | | **Key technical risks** | LLM intent validation at scale; AI agent server tick load; NFT-Unity inventory sync latency | @@ -197,7 +197,7 @@ See [docs/ARCHITECTURE.md](../ARCHITECTURE.md) for system diagram + critical inv ### Scope Risks - Solo dev + AI agent (Claude Code) + 3-6 month vertical slice on novel architecture is tight -- 3rd-party assets (Opsive UCC, Behavior Designer, Convai) may not be tested against Unity 6.5 beta. Opsive is not mandatory for the first movement prototype until it proves value against the minimal controller baseline. +- 3rd-party assets (Opsive UCC, Behavior Designer, Ida Faber character packs) may not be tested against Unity 6.5 beta. Opsive is not mandatory for the first movement prototype until it proves value against the minimal controller baseline. ### Open Questions (need JOY input later) - SECOND economy: reinhabitation cost, source, sink. (Open Decision Point in CLAUDE.md.) diff --git a/docs/design/01-pillars.md b/docs/design/01-pillars.md index e5c72591..0971e1ef 100644 --- a/docs/design/01-pillars.md +++ b/docs/design/01-pillars.md @@ -130,7 +130,7 @@ SECOND SPAWN is an AI NPC world first. The world already contains NPC-like bodie | **Security** | Prompt injection defense, capability cap, per-player rate limit | Reuse DOSafe prompt-injection patterns. | #### Serving This Pillar -- Convai phase 1 NPC dialogue grounded in player state +- Custom NPC dialogue through `api.dos.ai`, grounded in player state - Phase 2 `api.dos.ai` / api.dos.ai model service with Haiku 4.5 (NPC chat) + Sonnet 4.6 (boss / quest-critical NPCs) - Per-NPC memory in Supabase pgvector - LLM intent validation server-side, never trust raw output @@ -168,7 +168,7 @@ SECOND SPAWN is an AI NPC world first. The world already contains NPC-like bodie #### Violating This Pillar - Client-side hit detection - Client computes loot drop and tells server "I rolled X" -- Embed Anthropic / OpenAI / Convai API key in Unity client +- Embed Anthropic, OpenAI, TTS, or any model-provider API key in Unity client --- diff --git a/docs/design/02-vertical-slice-spec.md b/docs/design/02-vertical-slice-spec.md index 00047609..5c53fa9c 100644 --- a/docs/design/02-vertical-slice-spec.md +++ b/docs/design/02-vertical-slice-spec.md @@ -33,7 +33,7 @@ This is two questions in one: **is the design loop fun?** AND **is the architect | **Town / Garden** | 1 small hub with Body Hall plus one useful facility action | | **Dungeon instance** | 1 (single instance with 1 boss encounter) | | **Gate first-clear record** | Concept-only slice surface: one server-owned clear log or badge if dungeon validation exists; no in-game royalty payout yet | -| **Boss with LLM dialogue** | 1 (Convai-driven, grounded in zone state) | +| **Boss with LLM dialogue** | 1 custom `api.dos.ai`-driven NPC, grounded in zone state | | **Quest line** | 1 (3-5 quests sequential) | | **Reinhabitation MVP** | Die -> test SECOND -> another eligible body with current-body reset | | **TIME / SECOND MVP** | TIME meter shown as a readable duration, earn SECOND from a small objective, spend SECOND on one useful service, zero TIME triggers reinhabitation placeholder | @@ -185,7 +185,7 @@ The slice is considered "done" when ALL of the following are true and verified b | ---- | ---- | ---- | | 1. Setup + first commit | T+0 to T+1 | Unity project + Photon SDK + Nakama OSS + api.dos.ai LLM contract + repo structure | | 2. Networked player + zone | T+1 to T+4 | 1 zone Photon Fusion 2 multiplayer, NPC body spawn, minimal ARPG controller first, Opsive UCC evaluated after baseline | -| 3. NPC + LLM dialogue | T+4 to T+8 | Convai NPC in hub town, server-validated intent flow | +| 3. NPC + LLM dialogue | T+4 to T+8 | Custom `api.dos.ai` NPC in hub town, server-validated intent flow | | 4. Quest + dungeon | T+8 to T+12 | 1 quest line + 1 dungeon + 1 boss (LLM dialogue) | | 5. Reinhabitation + level/stat persistence | T+12 to T+16 | Death -> SECOND -> reinhabitation flow, current-body reset, profile/stat persistence | | 6. TIME / SECOND economy | T+16 to T+18 | TIME meter shown as readable duration, one SECOND earn source, one SECOND spend sink, zero-time reinhabitation trigger | diff --git a/docs/design/03-systems-index.md b/docs/design/03-systems-index.md index 3682abb7..fa4519a8 100644 --- a/docs/design/03-systems-index.md +++ b/docs/design/03-systems-index.md @@ -23,7 +23,7 @@ SECOND SPAWN is a hybrid MMO + Agent-Roster Action RPG. The mechanical scope spa body prep, scouting, crafting, and offline-agent work. - Multiplayer networking (Photon Fusion 2 dedicated server) - Persistence (Nakama OSS + Postgres, with Supabase sidecar where useful) -- LLM NPCs (Convai phase 1, api.dos.ai model service phase 2) +- LLM NPCs through `api.dos.ai` model service - AI agent autoplay (server-side, capability-capped) - OpenClaw-connected NPCs (user-owned agents as server-validated world actors) - Level/stat progression @@ -118,7 +118,7 @@ blockout tasks for the alpha. | 7 | NPC dialogue and human-believable NPC agent model | Gameplay | MVP | Design | [13-human-believable-npc-agent-model.md](13-human-believable-npc-agent-model.md), [16-npc-society-multi-agent-architecture.md](16-npc-society-multi-agent-architecture.md), [21-permanent-npc-story-characteristics.md](21-permanent-npc-story-characteristics.md), [36-ai-npc-research-anchor-map.md](36-ai-npc-research-anchor-map.md), [37-ai-npc-backend-client-roadmap.md](37-ai-npc-backend-client-roadmap.md) | api.dos.ai model service (phase 2 ready), Profile persistence | | 8 | Quest system (linear, 3-5 quests slice scope) | Gameplay | VS | Not started | [35-alpha-content-and-copy-pack.md](35-alpha-content-and-copy-pack.md), [29-alpha-questline-and-encounter-design.md](29-alpha-questline-and-encounter-design.md), [26-alpha-game-design-document.md](26-alpha-game-design-document.md), [27-alpha-level-design.md](27-alpha-level-design.md) | NPC dialogue, persistence | | 9 | Dungeon instance (1 dungeon, 1 boss) | Gameplay | VS | Not started | [27-alpha-level-design.md](27-alpha-level-design.md), [29-alpha-questline-and-encounter-design.md](29-alpha-questline-and-encounter-design.md), [15-gate-dungeon-pioneer-charter-system.md](15-gate-dungeon-pioneer-charter-system.md) | Combat, NPC dialogue, Photon | -| 10 | Boss LLM dialogue (Convai grounded) | Gameplay | VS | Not started | (TDD pending) | NPC dialogue | +| 10 | Boss LLM dialogue (`api.dos.ai` grounded) | Gameplay | VS | Not started | (TDD pending) | NPC dialogue | | 11 | AI agent for offline players (server-side) | Gameplay | VS | Prototype | [10-character-profile-agent-memory.md](10-character-profile-agent-memory.md) | NetworkRunner, api.dos.ai model service, intent schema | | 43 | NPC body embodiment / reinhabitation | Gameplay / Progression | VS | Concept | [28-character-body-and-roster-design.md](28-character-body-and-roster-design.md), [25-core-loop-v1.md](25-core-loop-v1.md), [12-game-design-document.md](12-game-design-document.md) | Actor profiles, SECOND economy, reinhabitation, body lifecycle, auth | | 41 | The Garden living roster loop | Gameplay / Meta | VS | Concept | [33-alpha-production-backlog.md](33-alpha-production-backlog.md), [26-alpha-game-design-document.md](26-alpha-game-design-document.md), [27-alpha-level-design.md](27-alpha-level-design.md), [25-core-loop-v1.md](25-core-loop-v1.md), [24-pick-me-up-reference-analysis.md](24-pick-me-up-reference-analysis.md), [37-ai-npc-backend-client-roadmap.md](37-ai-npc-backend-client-roadmap.md) | Actor profiles, NPC memory, TIME / SECOND economy, Gate missions, AI agent policy | @@ -195,7 +195,7 @@ blockout tasks for the alpha. ### Feature Layer (depends on core) 12. Combat (#6) - depends on: Player Controller, networked state -13. NPC dialogue (Convai + intent validation) (#7) - depends on: api.dos.ai model service +13. NPC dialogue (`api.dos.ai` + intent validation) (#7) - depends on: api.dos.ai model service 14. OpenClaw-connected NPC bridge (#37) - depends on: Auth, Nakama, api.dos.ai model service, NPC dialogue, LLM safety 15. Level/stat progression (#12) - depends on: Profile persistence, Combat 16. NFT inventory (#15) - depends on: Auth, thirdweb-api MCP @@ -249,7 +249,7 @@ blockout tasks for the alpha. | TIME / SECOND economy (#36) | Design + Economy | Constant drain can feel oppressive; weak drain can feel invisible | Start with danger-zone drain, one earn source, one spend sink | | Gate first-clear and Pioneer Charter economy (#38) | Economy + Anti-cheat | First-clear rewards can become exploit magnets or feel unfair if autonomous agents claim them while players sleep | Start with non-economic first-clear records; require server clear logs, caps, expiry, and human-led eligibility for economic Charters | | Photon Fusion 2 dedicated server (#1) | Technical | Solo dev capacity to run dedicated infra | Slice uses Photon Cloud free 20 CCU; production migration is post-slice | -| Convai SDK in Unity (#7) | Technical | 3rd-party SDK may not test against Unity 6.5 beta | Have phase 2 fallback (`api.dos.ai` / api.dos.ai model service + custom LLM) ready in design | +| Custom NPC dialogue through `api.dos.ai` (#7) | Technical | Model or voice provider may be slow, costly, or unavailable | Keep deterministic fallback and server-side intent validation ready in design | | The Garden living roster loop (#41) | Design + Scope | The game can drift from action RPG into heavy SLG or hero-collector management | Keep Garden actions short, mission-first, and consequence-focused; no construction timers or pay-to-win roster pulls | | NPC body embodiment / reinhabitation (#43) | Design + Economy | SECOND-paid body entry can feel like pay-to-play if framed poorly, or too punishing if every death blocks play | Present it as in-world embodiment cost, keep free/earned paths designable, and separate real-money monetization from the core rule | | Garden personalization and decor (#42) | Design + Monetization | Decor can become pay-to-win, spreadsheet production, or distracting scope creep | Keep decor cosmetic, social, memory, clue, morale, and light-utility only; timers deferred, no mandatory power, no PvP advantage | @@ -270,7 +270,7 @@ Aligned with [02-vertical-slice-spec.md](02-vertical-slice-spec.md) build phases | 6 | Zone scene management (#5) | Phase 2 | M | | | 7 | Combat (#6) | Phase 2 | L | Server-authoritative critical | | 8 | api.dos.ai model service integration (#31) | Phase 2 | M | Reuse DOSRouter pattern | -| 9 | NPC dialogue + Convai (#7) | Phase 3 | L | First LLM integration | +| 9 | NPC dialogue + `api.dos.ai` (#7) | Phase 3 | L | First LLM integration | | 10 | LLM safety (#32) | Phase 3 | M | Concurrent with #9 | | 11 | Quest system (#8) | Phase 4 | L | | | 12 | Dungeon instance (#9) | Phase 4 | L | | diff --git a/docs/design/05-networking-architecture.md b/docs/design/05-networking-architecture.md index c8f331b7..818d84a1 100644 --- a/docs/design/05-networking-architecture.md +++ b/docs/design/05-networking-architecture.md @@ -43,7 +43,7 @@ Anything less than server-authoritative breaks the fantasy on day one of public ▼ │ HTTPS ┌──────────────────┐ │ + backend token │ api.dos.ai / │ ◄─────────────────────────────┘ -│ api.dos.ai model service │ ──────────► Anthropic / OpenAI / Convai +│ api.dos.ai model service │ ──────────► Anthropic / OpenAI / TTS providers └──────────────────┘ │ ▼ @@ -158,7 +158,7 @@ These are non-negotiable per the AGPL-3.0 open-source threat model + Pillar 4 (S - Nakama endpoint and public client key - Gateway base URL (public) - Photon App ID (semi-public, client-visible by design) -3. **All LLM calls server-side via `api.dos.ai` / api.dos.ai model service.** The dedicated server or Nakama backend is the only game-side caller that requests Anthropic / OpenAI / Convai work. +3. **All LLM calls server-side via `api.dos.ai` / api.dos.ai model service.** The dedicated server or Nakama backend is the only game-side caller that requests Anthropic / OpenAI / TTS provider work. 4. **All NFT mutations server-side.** Use Nakama runtime modules or a dedicated wallet/blockchain service. Do not place game inventory or wallet mutation APIs in the model service. 5. **Rate limit + capability cap apply to AI agent the same way they apply to the player.** No "agent gets unlimited LLM tokens" - it inherits the offline player's budget. 6. **No `Host Mode` build in production.** CI staging build must use Server Mode dedicated; PR review checks this. diff --git a/docs/design/06-overview-design.md b/docs/design/06-overview-design.md index 536f6f34..5927b871 100644 --- a/docs/design/06-overview-design.md +++ b/docs/design/06-overview-design.md @@ -5,7 +5,7 @@ *Author: Codex* *Last Verified: 2026-05-14 against `AGENTS.md`, `00-game-concept.md`, `01-pillars.md`, `02-vertical-slice-spec.md`, and `05-networking-architecture.md`* -> **Quick reference** - Layer: `Core` - Priority: `Vertical Slice` - Key deps: `Photon Fusion 2`, `Nakama OSS`, `api.dos.ai model service`, `DOS Chain`, `Convai phase 1` +> **Quick reference** - Layer: `Core` - Priority: `Vertical Slice` - Key deps: `Photon Fusion 2`, `Nakama OSS`, `api.dos.ai model service`, `DOS Chain`, `Ida Faber character bodies` --- @@ -98,7 +98,7 @@ The first prototype should create a small, project-owned movement contract. Simp - Opsive UCC import as a dependency for movement baseline - Behavior Designer -- Convai +- Ida Faber character bodies and custom `api.dos.ai` dialogue - Synty / Quaternius environment art packs - Combat damage, loot, inventory, or item drops - Nakama auth or profile persistence diff --git a/docs/design/11-npc-agent-brain-architecture.md b/docs/design/11-npc-agent-brain-architecture.md index 6215dbc4..4e2fbc19 100644 --- a/docs/design/11-npc-agent-brain-architecture.md +++ b/docs/design/11-npc-agent-brain-architecture.md @@ -55,7 +55,7 @@ Important design anchors: - Hierarchical memory for persistent LLM game NPC personality. - Generative-agent observation, reflection, retrieval, and planning. - Symbolically grounded LLM dialogue from social simulation state. -- Convai-style product patterns: character description, knowledge bank, +- Vendor-reference product patterns: character description, knowledge bank, personality, state of mind, memory, narrative objectives, action API, NPC-to-NPC manager, and prompt inspection. - AI Town-style shared global state, transactions, simulation engine, and diff --git a/docs/design/12-game-design-document.md b/docs/design/12-game-design-document.md index 088dd733..2985f786 100644 --- a/docs/design/12-game-design-document.md +++ b/docs/design/12-game-design-document.md @@ -117,7 +117,7 @@ vertical-slice progression baseline. | Backend foundation | Nakama OSS + Postgres for game backend | | Network runtime | Photon Fusion 2, Server Mode dedicated for production | | AI model service | `api.dos.ai` for model calls and safety | -| Phase 1 NPC dialogue | Convai SDK for MVP NPC dialogue | +| Phase 1 NPC dialogue | Custom `api.dos.ai` NPC dialogue through Nakama or the dedicated server | | Chain integration | DOS Chain via thirdweb for wallet, NFT, and SECOND surfaces | ### Current Implementation Snapshot - 2026-05-18 @@ -1192,7 +1192,7 @@ Hard boundaries: Phase direction: -- Phase 1 uses Convai for MVP NPC dialogue. +- Phase 1 uses custom `api.dos.ai` calls for MVP NPC dialogue. - Phase 2 moves deeper LLM behavior to `api.dos.ai`. - Haiku-class models are candidates for fast NPC chat. - Sonnet-class models are candidates for bosses and quest-critical NPCs. diff --git a/docs/design/13-human-believable-npc-agent-model.md b/docs/design/13-human-believable-npc-agent-model.md index 27e72c91..981ff0df 100644 --- a/docs/design/13-human-believable-npc-agent-model.md +++ b/docs/design/13-human-believable-npc-agent-model.md @@ -60,7 +60,7 @@ and broad LLM-agent sanity checks. | [Utility system](https://en.wikipedia.org/wiki/Utility_system) | Supporting source for score-based action selection. Use the idea that traits and state bias choices, not that every NPC needs a heavy planner each frame. | | [Former Sims developer on Sims autonomy](https://www.pcgamer.com/games/the-sims/sims-dont-plan-anything-says-former-sims-4-developer-though-he-always-wanted-to-program-them-to/) | Supporting industry note for lightweight autonomy: visible behavior can come from bounded motives, traits, needs, and local choices rather than expensive long-horizon planning every tick. | | [Generative Agents: Interactive Simulacra of Human Behavior](https://arxiv.org/abs/2304.03442) | Supporting source for observation, reflection, retrieval, and planning. Do not copy full life-simulation scope. | -| [Convai Character Customization](https://docs.convai.com/api-docs/convai-playground/character-customization) / [Action API](https://docs.convai.com/api-docs/api-reference/core-api-reference/character-crafting-apis/action-api) / [NPC-to-NPC Conversation](https://docs.convai.com/api-docs/plugins-and-integrations/unity-plugin/adding-npc-to-npc-conversation) | Supporting product-pattern source for character bundles, knowledge, state of mind, memory, narrative objectives, bounded actions, and managed NPC group conversation. Learn the pattern, but keep SECOND SPAWN state and authority in Nakama and Fusion. | +| Commercial NPC platform references | Supporting product-pattern source for character bundles, knowledge, state of mind, memory, narrative objectives, bounded actions, and managed NPC group conversation. Learn the pattern, but build the SECOND SPAWN implementation through Nakama, Fusion, and `api.dos.ai`. | | [AI Town](https://github.com/a16z-infra/ai-town) | Supporting open-source source for shared global state, transactions, simulation loop, vector memory, and configurable model backends. | | [A Survey on Large Language Model based Autonomous Agents](https://link.springer.com/article/10.1007/s11704-024-40231-1) | Supporting sanity check for modern LLM-agent components such as profile, memory, planning, action, and tool use. | | [A Survey on the Memory Mechanism of LLM-based Agents](https://arxiv.org/abs/2404.13501) | Supporting sanity check for memory types, retrieval, reflection, and compression choices. | diff --git a/docs/design/16-npc-society-multi-agent-architecture.md b/docs/design/16-npc-society-multi-agent-architecture.md index a1f937a7..97bdd467 100644 --- a/docs/design/16-npc-society-multi-agent-architecture.md +++ b/docs/design/16-npc-society-multi-agent-architecture.md @@ -43,17 +43,19 @@ Central society orchestrator ## Bottom Line -Convai is currently one of the closest commercial products to the desired -direction because it treats AI characters as a bundle of identity, knowledge, -memory, narrative objective, actions, and NPC-to-NPC conversation tooling. +Some commercial NPC platforms are useful public references for product shape +because they treat AI characters as a bundle of identity, knowledge, memory, +narrative objective, actions, and NPC-to-NPC conversation tooling. -SECOND SPAWN should learn from Convai's product patterns, not depend on Convai +SECOND SPAWN should learn from vendor product patterns, not adopt a vendor SDK as the final brain runtime: -- Convai is good for phase 1 dialogue experiments, especially a boss or hub NPC. -- Convai is closed source and service-owned, so it is not a good final source - of truth for game memory, relationship state, model routing, authority, or - audit logs. +- Do not use Convai as a phase 1 dialogue dependency. +- NPC dialogue is custom and routes through Nakama or the dedicated server to + `api.dos.ai`. +- Commercial NPC platforms are service-owned, so they are not a good final + source of truth for game memory, relationship state, model routing, + authority, or audit logs. - SECOND SPAWN needs Nakama-owned durable state and Fusion-owned in-world authority because NPCs are not just chatbots. They are game actors that can become player-inhabited Frames, offline agents, and OpenClaw-connected NPCs. @@ -61,7 +63,7 @@ as the final brain runtime: The preferred production direction is: ```text -Convai-inspired character product model +Vendor-reference character product model + Generative Agents memory/planning loop + AI Town shared-state simulation pattern + AutoGen-style conversation turn manager @@ -80,17 +82,17 @@ conversation sessions, and multi-NPC scheduling. | Source | Useful Lesson For SECOND SPAWN | | ---- | ---- | -| [Convai Unity Plugin](https://docs.convai.com/api-docs/plugins-and-integrations/unity-plugin) | AI NPCs are packaged as engine-integrated characters with dialogue, behavior, and game-world adaptation. | -| [Convai Character Customization](https://docs.convai.com/api-docs/convai-playground/character-customization) | A practical character stack includes description, language, knowledge, personality traits, state of mind, memory, narrative design, and external API hooks. | -| [Convai Agentic Platform](https://convai.com/blog/agentic-platform-virtual-worlds-convai) | The latest public architecture direction is a dual mind: fast reactive response plus longer-horizon reasoning, proactive action, memory, contextual animation, vision, and group dynamics. | -| [Convai Live APIs](https://docs.convai.com/api-docs/api-reference/core-api-reference/live-apis-beta) | Real-time NPC interaction now uses WebRTC-style low-latency live sessions with text, audio, contextual data, and session continuity. | -| [Convai Mindview](https://convai.com/blog/introducing-mindview-see-exactly-what-your-convai-character-sends-to-the-model) | Prompt transparency is a product feature. Teams need to inspect which backstory, personality, narrative objective, knowledge snippets, and long-term memory entered a turn. | -| [Convai Knowledge Bank](https://convai.com/blog/building-ai-characters-knowledge-bank-with-convai) | Character knowledge should be organized, concise, tagged, versioned, and role-relevant. Long static lore belongs in retrieval, not every prompt. | -| [Convai State of Mind](https://docs.convai.com/api-docs/convai-playground/character-customization/state-of-mind) | Emotion is inspectable runtime state, not only prose style. SECOND SPAWN should expose mood, stress, and need pressure in debug UI. | -| [Convai Narrative Design](https://convai.com/blog/convai-narrative-design) | Purely open-ended LLM NPCs stagnate. Objectives, sections, triggers, and decisions keep conversations relevant to gameplay. | -| [Convai External API](https://docs.convai.com/api-docs/convai-playground/character-customization/external-api) | Character tool access is defined as named methods with input schema, implementation, execution cap, and activation state. SECOND SPAWN should use the same schema-first idea, but never expose raw game mutation tools to the model. | -| [Convai Action API](https://docs.convai.com/api-docs/api-reference/core-api-reference/character-crafting-apis/action-api) | LLM output should map to predefined actions, objects, and characters. The environment controls what is possible. | -| [Convai NPC-to-NPC Conversation](https://docs.convai.com/api-docs/plugins-and-integrations/unity-plugin/adding-npc-to-npc-conversation) | NPC group conversation needs a manager, group list, topic, distance threshold, and speech bubble presentation. | +| Commercial NPC Unity plugin references | AI NPCs are often packaged as engine-integrated characters with dialogue, behavior, and game-world adaptation. | +| Commercial character customization references | A practical character stack includes description, language, knowledge, personality traits, state of mind, memory, narrative design, and external API hooks. | +| Agentic virtual-world platform references | A useful public architecture direction is a dual mind: fast reactive response plus longer-horizon reasoning, proactive action, memory, contextual animation, vision, and group dynamics. | +| Live voice/session API references | Real-time NPC interaction can use WebRTC-style low-latency live sessions with text, audio, contextual data, and session continuity. | +| Mindview-style prompt inspection references | Prompt transparency is a product feature. Teams need to inspect which backstory, personality, narrative objective, knowledge snippets, and long-term memory entered a turn. | +| Knowledge bank references | Character knowledge should be organized, concise, tagged, versioned, and role-relevant. Long static lore belongs in retrieval, not every prompt. | +| State of mind references | Emotion is inspectable runtime state, not only prose style. SECOND SPAWN should expose mood, stress, and need pressure in debug UI. | +| Narrative design references | Purely open-ended LLM NPCs stagnate. Objectives, sections, triggers, and decisions keep conversations relevant to gameplay. | +| External API references | Character tool access is defined as named methods with input schema, implementation, execution cap, and activation state. SECOND SPAWN should use the same schema-first idea, but never expose raw game mutation tools to the model. | +| Bounded action API references | LLM output should map to predefined actions, objects, and characters. The environment controls what is possible. | +| NPC-to-NPC conversation references | NPC group conversation needs a manager, group list, topic, distance threshold, and speech bubble presentation. | | [Generative Agents](https://arxiv.org/abs/2304.03442) | Observation, reflection, planning, and memory retrieval are core ingredients for believable emergent social behavior. | | [AI Town](https://github.com/a16z-infra/ai-town) | A deployable AI society benefits from shared global state, transactions, simulation engine, vector memory, and configurable model backends. | | [AutoGen group chat patterns](https://microsoft.github.io/autogen/0.2/docs/tutorial/conversation-patterns/) | Multi-agent conversation needs speaker selection, round caps, and a manager. Round-robin, random, manual, and LLM-selected speaker strategies are all useful. | @@ -99,18 +101,18 @@ conversation sessions, and multi-NPC scheduling. --- -## Convai Pattern Audit +## Commercial NPC Platform Pattern Audit -Convai is directionally right in five areas. +Commercial NPC platforms are directionally useful in five areas. ### 1. Character as a product bundle -Convai does not treat an NPC as only a prompt. Its customization surface +Useful commercial NPC tooling does not treat an NPC as only a prompt. Its customization surface includes description, knowledge, personality, state of mind, memory, narrative design, and external APIs. SECOND SPAWN should mirror this as structured game state: -| Convai Concept | SECOND SPAWN Equivalent | +| Vendor Concept | SECOND SPAWN Equivalent | | ---- | ---- | | Character Description | `FrameIdentity` plus `FrameSoul` | | Personality Traits | `CharacterTraits` and `BehaviorTendencies` | @@ -125,7 +127,7 @@ state: ### 2. Knowledge is retrieval, not prompt stuffing -Convai's Knowledge Bank guidance is directly applicable: entries should be +Knowledge bank guidance is directly applicable: entries should be concise, role-relevant, categorized, tagged, versioned, and updated. SECOND SPAWN should avoid feeding a full GDD or full raw transcript into every NPC call. @@ -140,7 +142,7 @@ Send fresh world delta separately. ### 3. Narrative objective is a guardrail -Convai's Narrative Design is important because game NPCs should not be purely +Narrative design guidance is important because game NPCs should not be purely open-ended chatbots. A conversation should usually have a current objective: - greet a nearby player @@ -156,7 +158,7 @@ so it does not drift into irrelevant monologues. ### 4. Actions are declared before generation -Convai's Action API asks developers to define actions, objects, characters, +Bounded action API patterns ask developers to define actions, objects, characters, and backstory. That pattern matches SECOND SPAWN's hard safety rules: ```text @@ -174,23 +176,23 @@ This is exactly the right mental model for animation and gameplay: ### 5. NPC-to-NPC needs a manager -Convai's NPC-to-NPC docs use a manager object, group list, topic, and distance +NPC-to-NPC conversation references use a manager object, group list, topic, and distance threshold. SECOND SPAWN should not let every nearby NPC decide independently whether to speak at the same instant. A manager should create small conversation sessions and choose speakers. --- -## Convai Deep Architecture Audit +## Commercial NPC Platform Deep Architecture Audit -Convai's newer public architecture is no longer just "send prompt, get NPC +Modern commercial NPC architecture is no longer just "send prompt, get NPC line." It is closer to an embodied-agent product stack. ### Dual mind model -Convai describes two cooperating minds: +A useful vendor-reference model describes two cooperating minds: -| Convai Layer | Role | SECOND SPAWN Equivalent | +| Vendor Layer | Role | SECOND SPAWN Equivalent | | ---- | ---- | ---- | | Reactive mind | Fast response to voice, vision, scene metadata, and immediate user input | Unity presentation plus Nakama lightweight AI tick | | Reasoning mind | Longer-horizon planning, tool/service consultation, goals, proactive actions, and inner monologue | `NpcSocietyOrchestrator`, `NpcBrainRuntime`, memory retrieval, and api.dos.ai decision calls | @@ -203,7 +205,7 @@ The reasoning layer should decide only high-level intent and social content. ### Always listening, but not always speaking -Convai's agentic direction emphasizes characters that listen or observe +Agentic NPC platform direction emphasizes characters that listen or observe continuously, then decide when to respond. SECOND SPAWN should copy the product behavior, not the implementation. @@ -232,7 +234,7 @@ focused player dialog. ### Prompt assembly is a first-class product surface -Mindview is one of Convai's most important ideas. It exposes the actual prompt +Mindview-style inspection is one of the most important vendor-reference ideas. It exposes the actual prompt components used for a response, including character description, language, personality traits, narrative design, knowledge, and long-term memory. @@ -257,7 +259,7 @@ summaries, model metadata, latency, and validation outcome. ### Knowledge Bank pattern -Convai's Knowledge Bank separates reusable domain knowledge from the character +Knowledge bank design separates reusable domain knowledge from the character itself. Files can be connected or disconnected from characters. SECOND SPAWN should model knowledge as attachable context packs: @@ -271,7 +273,7 @@ SECOND SPAWN should model knowledge as attachable context packs: | `public_rumors` | Shared blackboard rumors and fresh events. | | `private_memory` | Per-NPC memories and relationship facts. | -This gives us Convai-like retrieval while keeping state in Nakama and future +This gives us vendor-inspired retrieval while keeping state in Nakama and future vector storage. Current implementation: @@ -289,7 +291,7 @@ Current implementation: ### Narrative graph without rigid dialogue trees -Convai Narrative Design uses sections, decisions, and triggers to keep +Narrative design patterns use sections, decisions, and triggers to keep conversations goal-oriented while still flexible. SECOND SPAWN should implement a lighter version: @@ -318,7 +320,7 @@ temperature. The model needs a situational objective and recent-lines guard. ### Action API pattern -Convai's Action API defines actions, objects, characters, and backstory. It +Bounded action API patterns define actions, objects, characters, and backstory. It also recommends broad, consistent action labels rather than overly specific verbs. @@ -334,7 +336,7 @@ actor's animation capabilities. ### External API pattern -Convai lets characters call configured APIs with method descriptions, JSON +Some commercial NPC platforms let characters call configured APIs with method descriptions, JSON input schemas, implementation code, test inputs, activation state, supported models, and execution limits. @@ -370,7 +372,7 @@ Nakama and Fusion decide whether the request becomes real. ### State of Mind pattern -Convai's State of Mind is valuable because it makes emotional state visible to +State of Mind tooling is valuable because it makes emotional state visible to creators during testing. SECOND SPAWN should add an internal mood/stress/need debug panel for NPCs: @@ -387,7 +389,7 @@ These fields should bias prompts and fallbacks, not grant power or authority. ### Live session and low-latency transport pattern -Convai's newer Live APIs use WebRTC-style live sessions for low-latency text, +Live NPC APIs can use WebRTC-style live sessions for low-latency text, audio, contextual data, and session continuity. SECOND SPAWN should not put provider keys or direct model sessions in Unity, @@ -421,7 +423,7 @@ validated text bubble rather than blocking gameplay. ### Contextual animation pattern -Convai's agentic article points to contextual animation selection: sit, stand, +Agentic NPC references point to contextual animation selection: sit, stand, move, and choose motions by role/location. This directly applies to current SECOND SPAWN animation issues. @@ -481,7 +483,7 @@ Zone events | `ConversationSession` | Nakama | Small turn-taking state for 2-3 actors, objective, topic, transcript summary, and max turns. | | `IntentValidator` | Nakama plus Fusion | Shape, policy, proximity, target, cooldown, and authority validation. | | `ActionExecutor` | Fusion plus Unity | Authoritative movement/combat/interact execution plus client-side presentation. | -| `PromptTrace` | Nakama or api.dos.ai | Debug record similar to Convai Mindview, with redacted prompt components and response metadata. | +| `PromptTrace` | Nakama or api.dos.ai | Debug record similar to Mindview-style prompt inspection, with redacted prompt components and response metadata. | --- @@ -753,7 +755,7 @@ Unity displays. - [x] Require short speech output. - [x] Return source and validation reason for UI. -### Backlog: Convai-Inspired Product Patterns +### Backlog: Vendor-Reference Product Patterns - [x] Add redacted `PromptTrace` records similar to Mindview, but store only prompt component metadata, selected memory ids, selected knowledge pack ids, @@ -821,6 +823,5 @@ Unity displays. 3. How visible should relationship state be to the player: hidden, debug-only, or partially exposed through reputation language? 4. How much raw conversation text can be kept before summarization is required? -5. Should Convai remain a phase 1 dependency for one boss/hub NPC, or should - the custom Nakama and api.dos.ai path replace it earlier now that the local - prototype is already using model-backed intents? +5. Which custom `api.dos.ai` voice/session path should power the first boss or + hub NPC once text-first dialogue is stable? diff --git a/docs/design/19-openclaw-agent-connection-architecture.md b/docs/design/19-openclaw-agent-connection-architecture.md index f7983428..d0ed57d4 100644 --- a/docs/design/19-openclaw-agent-connection-architecture.md +++ b/docs/design/19-openclaw-agent-connection-architecture.md @@ -25,7 +25,7 @@ api.dos.ai remains the model service only ``` The bridge is an ecosystem integration. It is not a replacement for Nakama, -Fusion, Convai, offline player agents, or the in-game NPC brain runtime. +Fusion, custom NPC dialogue, offline player agents, or the in-game NPC brain runtime. --- diff --git a/docs/design/36-ai-npc-research-anchor-map.md b/docs/design/36-ai-npc-research-anchor-map.md index 758baced..ee41fa36 100644 --- a/docs/design/36-ai-npc-research-anchor-map.md +++ b/docs/design/36-ai-npc-research-anchor-map.md @@ -36,12 +36,12 @@ Keep all gameplay authority in Nakama and Photon Fusion. | Observation, reflection, retrieval, and planning | [Generative Agents](https://arxiv.org/abs/2304.03442) | Supporting paper for the general agent loop. | It is not the main core paper for SECOND SPAWN because it is broader life-simulation work, not a game-NPC memory/personality deployment paper. | | Multi-NPC society and shared world state | [AI Town](https://github.com/a16z-infra/ai-town), Concordia-style generative social simulation patterns | Core architecture anchor for shared state, transactions, vector memory, and multi-agent simulation loops. | Use the pattern, not the full product shape. Nakama remains the game backend and authority layer. | | Conversation turn orchestration | AutoGen group chat patterns and Microsoft agent orchestration patterns | Core orchestration reference for speaker selection, max-turn caps, and fallback behavior. | Use managed sessions. Do not let every NPC talk to every other NPC freely. | -| Character product bundle | Convai Character Customization, Mindview, Knowledge Bank, Narrative Design, Action API, NPC-to-NPC Conversation | Core product pattern for how NPC features should feel in tools and debug UI. | Learn the pattern: identity, knowledge, memory, state of mind, narrative objectives, bounded actions, inspectable prompt trace. Do not put durable state in Convai as the final source of truth. | +| Character product bundle | Commercial NPC platform references, Mindview-style debugging, knowledge banks, narrative objectives, bounded action APIs, NPC-to-NPC conversation managers | Core product pattern for how NPC features should feel in tools and debug UI. | Learn the pattern: identity, knowledge, memory, state of mind, narrative objectives, bounded actions, inspectable prompt trace. Build the implementation through Nakama, Fusion, and `api.dos.ai`. | | Role-play provider character control | [Alibaba Cloud Model Studio Role Play](https://www.alibabacloud.com/help/en/model-studio/role-play) and Qwen-Character-style role-play APIs | R&D reference for packaging character profile, session continuity, memory knobs, and allowed action labels into provider requests. | Study the request shape and session mechanics behind `api.dos.ai`. Do not let provider memory or provider sessions become canonical game state. | -| Action validation and game authority | SECOND SPAWN LLM Safety ADR plus Convai Action API pattern | Core implementation rule. | LLM output is intent only. Nakama and Fusion validate and mutate state. | +| Action validation and game authority | SECOND SPAWN LLM Safety ADR plus bounded action API pattern | Core implementation rule. | LLM output is intent only. Nakama and Fusion validate and mutate state. | | Social behavior evaluation | SOTOPIA-style social intelligence evaluation | Core evaluation direction for future tests. | Use for tests such as persona consistency, role-appropriate answers, non-repetition, hidden-lore boundaries, and relationship-aware responses. | | OpenClaw-connected NPCs | OpenClaw-style external agent identity and workspace model | Core bridge pattern for user-owned external agents. | OpenClaw agents pull game context and emit bounded dialogue or intent. They do not own authoritative game state. | -| Voice and live interaction | Convai Live APIs, OpenAI Realtime-style ephemeral session pattern, future voice provider docs | Future core anchor for voice latency and per-NPC voice identity. | Voice is deferred. Client must receive only ephemeral tokens or server-mediated playback, never provider API keys. | +| Voice and live interaction | OpenAI Realtime-style ephemeral session pattern, future voice provider docs, Ida Faber blendshape docs | Future core anchor for voice latency, per-NPC voice identity, and facial presentation. | Voice is deferred. Client must receive only ephemeral tokens or server-mediated playback, never provider API keys. | --- diff --git a/docs/design/37-ai-npc-backend-client-roadmap.md b/docs/design/37-ai-npc-backend-client-roadmap.md index bc66fdb2..581fdaf6 100644 --- a/docs/design/37-ai-npc-backend-client-roadmap.md +++ b/docs/design/37-ai-npc-backend-client-roadmap.md @@ -31,7 +31,7 @@ and the NPC remembers and changes because of what happened. | [25-core-loop-v1.md](25-core-loop-v1.md) | Defines The Garden -> Choose Risk -> Gate Mission -> Outcome Report -> Repair / Remember. | | [36-ai-npc-research-anchor-map.md](36-ai-npc-research-anchor-map.md) | Defines the research anchors for memory, personality, relationship state, orchestration, and evaluation. | | [13-human-believable-npc-agent-model.md](13-human-believable-npc-agent-model.md) | Defines traits, needs, mood, stress, memory tiers, and relationship axes. | -| [16-npc-society-multi-agent-architecture.md](16-npc-society-multi-agent-architecture.md) | Defines society orchestration, conversation sessions, PromptTrace, and Convai-inspired product patterns. | +| [16-npc-society-multi-agent-architecture.md](16-npc-society-multi-agent-architecture.md) | Defines society orchestration, conversation sessions, PromptTrace, and vendor-reference product patterns. | | [21-permanent-npc-story-characteristics.md](21-permanent-npc-story-characteristics.md) | Defines the current authored permanent NPC roster. | | [53-ai-npc-believability-evaluation.md](53-ai-npc-believability-evaluation.md) | Defines the prompt pack, memory tests, relationship tests, hidden-lore checks, and evidence pack for NPC believability. | @@ -195,8 +195,8 @@ Client features: Implementation status: text-first focused dialogue is the default. The voice-session lane now has a scoped Nakama request boundary for future -`api.dos.ai` playback material, but voice, TTS, Convai, and provider visemes -must not block focused dialogue delivery. +`api.dos.ai` playback material, but voice, TTS, Ida Faber blendshape mapping, +and provider visemes must not block focused dialogue delivery. - Keep player and NPC locked into dialogue state until exit. - Use bottom RPG-style dialogue panel for 1:1 conversations. @@ -360,7 +360,7 @@ Use that file when starting work on `#132`, `#133`, `#134`, `#135`, `#137`, | [#139](https://github.com/DOS/Second-Spawn/issues/139) | Unity / AI Agent | Focused and ambient NPC dialogue presentation. | | [#140](https://github.com/DOS/Second-Spawn/issues/140) | Unity / AI Agent / Docs | AI NPC debug tools and Play Mode verification checklist. | | [#249](https://github.com/DOS/Second-Spawn/issues/249) | AI Agent / Model Service | Role-play provider bake-off behind `api.dos.ai`, starting with Alibaba Qwen-Character as an R&D candidate. | -| [#262](https://github.com/DOS/Second-Spawn/issues/262) | Nakama / Unity / AI Agent | Scoped NPC voice session and Convai decision lane. | +| [#262](https://github.com/DOS/Second-Spawn/issues/262) | Nakama / Unity / AI Agent | Scoped NPC voice session and custom Ida Faber facial-animation lane. | ### Client, Gameplay, And DevOps Backlog diff --git a/docs/design/38-alpha-design-decision-register.md b/docs/design/38-alpha-design-decision-register.md index f1f9e012..086cc58f 100644 --- a/docs/design/38-alpha-design-decision-register.md +++ b/docs/design/38-alpha-design-decision-register.md @@ -43,7 +43,7 @@ Use this file when: | LLM authority | LLM output is dialogue or intent only. Nakama and Fusion validate and mutate game state. | Required by the open-source and anti-cheat model. | | Role-play providers | Alibaba Qwen-Character and similar role-play models are R&D candidates only, behind `api.dos.ai`. They are not the MVP backbone and cannot own canonical NPC memory. | Preserves replaceable providers while keeping durable state in Nakama and authority in Fusion. | | Skill ownership | Physical stats are mostly body-bound, but skills are origin-bound. Durable soul learning can survive body loss, while body imprint, physical body actions, equipment access, and sync decide what can be executed now. | A skill logically lives in learned consciousness, body imprint, gear, or agent routine. Body loss should hurt execution without deleting all player learning. | -| Convai product pattern | Learn from Convai patterns such as Mindview, Knowledge Bank, Narrative Design, Action API, and NPC-to-NPC manager, but do not make Convai the durable state backbone. | Keeps useful product lessons while preserving ownership in Nakama and `api.dos.ai`. | +| Commercial NPC product pattern | Learn from vendor patterns such as Mindview-style prompt inspection, knowledge banks, narrative objectives, bounded action APIs, and NPC-to-NPC managers, but do not make any vendor the durable state backbone. | Keeps useful product lessons while preserving ownership in Nakama and `api.dos.ai`. | | Advanced body progression | Cultivation and Nibirium XP remain deferred and out of alpha. | The concept felt too much like a normal XP bar and needs a fresh design pass. | --- diff --git a/docs/design/56-focused-npc-dialogue-portrait-lipsync-design.md b/docs/design/56-focused-npc-dialogue-portrait-lipsync-design.md index cb6a6666..51d6d35d 100644 --- a/docs/design/56-focused-npc-dialogue-portrait-lipsync-design.md +++ b/docs/design/56-focused-npc-dialogue-portrait-lipsync-design.md @@ -24,8 +24,8 @@ speaking -> speaking state stops when the line completes. ``` This feature supports immersion, NPC identity, screenshot readability, and -later Convai-style facial animation without making Convai or voice a hard -dependency for the first implementation pass. +later custom facial animation without making voice, TTS, or full blendshape +mapping a hard dependency for the first implementation pass. --- @@ -118,7 +118,7 @@ pass if a local catalog is safer. ## 5. Speaking Animation Tiers The design supports three implementation tiers. Dev should ship tier 1 first if -Convai, voice, or facial rig setup is not ready. +voice, TTS, or facial rig setup is not ready. ### Tier 1: Text-Timed Speaking Fallback @@ -150,8 +150,9 @@ This is acceptable for stylized alpha characters and lower-detail portraits. ### Tier 3: Viseme Or Blendshape Lip Sync -Use when Convai or another provider returns viseme, blendshape, or facial -animation frames and the character rig supports the required targets. +Use when `api.dos.ai` or another server-side provider returns viseme, +blendshape, or facial animation frames and the character rig supports the +required targets. Behavior: @@ -161,22 +162,20 @@ Behavior: - Preserve a safe fallback when a body has no compatible face rig. - Treat provider animation data as presentation only. -Convai's Unity documentation describes a lip-sync path where backend facial data -is parsed by the SDK and applied through a Convai LipSync component using -blendshape and bone effectors. It also supports preset or custom effector lists. -This matches our tier 3 direction, but it should remain behind the isolated -Convai import lane and not be required for the first focused dialogue UI pass. - -Ida Faber's blendshape documentation is especially relevant for semi-real +Ida Faber's blendshape documentation is the relevant lane for semi-real candidate bodies. It documents Apple ARKit 52 blendshapes, facial capture, and body customization morphs, which makes Ida-compatible bodies plausible targets for an ARKit-style lip sync profile. The implementation should still inspect the actual imported Unity mesh before assuming every purchased Ida character has the same available blendshape names. +Ida Faber's Unity retargeting documentation is also relevant for dialogue body +presentation. Treat locomotion, talk idles, and gesture clips as Unity Humanoid +retargeting work, separate from facial blendshape driving. + Reference: -- [Convai Unity: Adding Lip-Sync to your Character](https://docs.convai.com/api-docs/plugins-and-integrations/unity-plugin/adding-lip-sync-to-your-character) +- [Ida Faber Docs: Unity Retargeting Animations](https://docs.idafaber3d.com/unity/retargeting) - [Ida Faber Docs: Blendshapes and Facial Animation](https://docs.idafaber3d.com/features/blendshapes) ### Ida Faber ARKit Blendshape Lane @@ -269,14 +268,11 @@ Nakama owns: - prompt safety - optional TTS or facial-data provider response shaping -Convai, if used, owns: - -- provider-specific dialogue or avatar animation transport for the isolated - Convai lane only -- optional viseme, blendshape, or facial animation data +Voice or facial-animation providers may own optional transport-level audio, +viseme, blendshape, or facial animation data for a single scoped session. -Convai or any provider must not own canonical NPC memory, relationship, quest, -TIME, SECOND, inventory, combat, or body lifecycle state. +No provider may own canonical NPC memory, relationship, quest, TIME, SECOND, +inventory, combat, or body lifecycle state. --- @@ -324,9 +320,9 @@ Issues: #139, #262 Build: - Add hook points for future TTS playback and provider viseme frames. -- Do not require Convai to be imported for D1 or D2. -- When Convai is imported later, map compatible bodies through a - character-specific lip sync profile instead of one global face assumption. +- Do not require any third-party NPC dialogue SDK for D1 or D2. +- Map compatible bodies through a character-specific lip sync profile instead + of one global face assumption. - For Ida-family bodies, validate the real Unity blendshape list first and use an ARKit-style profile only when the imported mesh supports it. diff --git a/docs/setup/agent-handoff.md b/docs/setup/agent-handoff.md index ca552d18..6564f1cd 100644 --- a/docs/setup/agent-handoff.md +++ b/docs/setup/agent-handoff.md @@ -40,7 +40,7 @@ aligned while SECOND SPAWN is developed by a solo founder with AI agents. 8. When a package import is needed, import one package per commit: - Opsive Ultimate Character Controller - Behavior Designer - - Convai + - Ida Faber character packs 9. After import or script edits, check the Unity console before moving on. ## Handoff message template @@ -96,7 +96,7 @@ output before merging, cherry-picking, pushing, or claiming done. 1. Add Unity Linux Dedicated Server Build Support for Unity `6000.5.0b8` via Unity Hub before dedicated server build work. 2. Import asset store packages in separate passes: Opsive UCC, then Behavior - Designer, then Convai. + Designer, then Ida Faber character packs. ## Current Unity decisions diff --git a/docs/setup/github-project-tracking.md b/docs/setup/github-project-tracking.md index ee08e2ed..6697ab18 100644 --- a/docs/setup/github-project-tracking.md +++ b/docs/setup/github-project-tracking.md @@ -103,7 +103,7 @@ Open issues that should be added to the project: - [#23 Import Opsive UCC in an isolated Unity pass](https://github.com/DOS/Second-Spawn/issues/23) - [#24 Import Behavior Designer in an isolated Unity pass](https://github.com/DOS/Second-Spawn/issues/24) -- [#25 Import Convai in an isolated Unity pass](https://github.com/DOS/Second-Spawn/issues/25) +- [#25 Import Ida Faber character packs in an isolated Unity pass](https://github.com/DOS/Second-Spawn/issues/25) - [#26 Implement server-authoritative combat damage prototype](https://github.com/DOS/Second-Spawn/issues/26) - [#27 Add first BodyTime reward source outside debug UI](https://github.com/DOS/Second-Spawn/issues/27) - [#28 Add first BodyTime spend sink in normal play](https://github.com/DOS/Second-Spawn/issues/28) @@ -123,7 +123,7 @@ Open issues that should be added to the project: - [#260 Track alpha demo implementation umbrella](https://github.com/DOS/Second-Spawn/issues/260) - [#261 Track expanded gameplay ledgers and reward authority](https://github.com/DOS/Second-Spawn/issues/261) -- [#262 Track NPC voice session and Convai decision lane](https://github.com/DOS/Second-Spawn/issues/262) +- [#262 Track NPC voice session and custom Ida Faber facial-animation lane](https://github.com/DOS/Second-Spawn/issues/262) - [#263 Track Unity presentation and pipeline experiments](https://github.com/DOS/Second-Spawn/issues/263) - [#264 Design post-slice Bonebound Frame hidden class spec](https://github.com/DOS/Second-Spawn/issues/264) - [#265 Track post-alpha and beta roadmap decomposition](https://github.com/DOS/Second-Spawn/issues/265) diff --git a/docs/setup/unity-conventions.md b/docs/setup/unity-conventions.md index 2b593cdd..80674e8a 100644 --- a/docs/setup/unity-conventions.md +++ b/docs/setup/unity-conventions.md @@ -57,7 +57,7 @@ Folders to add INSIDE `_SecondSpawn/` as content lands (do NOT create empty - Un - Opsive Ultimate Character Controller -> `Assets/Opsive/` - Behavior Designer -> `Assets/BehaviorDesigner/` -- Convai -> `Assets/Convai/` +- Ida Faber -> `Assets/Ida Faber/` or `Assets/IdaFaber/` - Synty / Quaternius packs -> `Assets/Synty//` or `Assets/Quaternius//` Each 3rd-party folder is treated as immutable; modifications go through wrapper scripts in `_SecondSpawn/Scripts/...`.