Commit d84a850

fix(system-prompt): resolve agents using training data instead of web research tools
**Problem:** Agents were generating fake URLs and listings from training data instead of using web_operations tool, even when user explicitly requested "internet research".

**Root Causes Identified:**

1. Commit 111e1e5 removed behavioral enforcement:
   - Removed: "Do not stop when encountering uncertainty — research or deduce..."
   - Lost action trigger AND persistence guidance
2. TodoReminderInjector conflicted with system prompt:
   - System prompt: "Use tools FIRST"
   - Todo reminder: "DO THE ACTUAL WORK (tell the story...)"
   - Agents interpreted "tell the story" as "generate from memory"
   - Todo workflow took priority over system prompt
3. Massive duplication in system prompt:
   - Current date context appeared twice (prepended + in components)
   - Research guidance scattered across multiple sections
   - ~13KB prompt with significant redundancy

**Solutions Implemented:**

1. **System Prompt - Research-First Principle:**
   - Added to buildToolUsage() (tools-guarded)
   - Clear failure framing: "Training data is a FAILURE condition"
   - Examples: "Find listings" → use web_operations
   - Prohibited fallbacks: fake URLs, "prior context", training data
2. **Todo Reminders - System Prompt Reference:**
   - Changed: "DO THE ACTUAL WORK (tell the story...)"
   - To: "DO THE ACTUAL WORK following your system prompt instructions"
   - Removes conflict, todo workflow defers to system prompt
3. **System Prompt - Removed Duplication:**
   - Removed prepended date/critical section
   - Date context only appears once (in components)
   - Fixed systemPrompt variable declaration after cleanup

**Testing:**

- ✅ Build: PASS (Swift build complete)
- ✅ User validation: Bug resolved
  - Without fix + todo list: Agent avoided web tools, used training data
  - With fix + todo list: Agent follows system prompt, uses web_operations

**Files Modified:**

- Sources/ConfigurationSystem/SystemPromptConfiguration.swift
- Sources/MCPFramework/TodoReminderInjector.swift

**Lessons Learned:**

- Runtime reminders can override static prompts (must align)
- Duplication creates confusion, not clarity (single source of truth)
- Agent behavior changes based on workflow context (todos vs no todos)
- "Simplification" must preserve behavioral enforcement

**Documentation:** Created comprehensive handoff: ai-assisted/2026-01-05/system-prompt-debugging/

**Next Steps:** Study prompt.txt and work with user to reduce complexity/redundancy while maintaining effectiveness of research-first principle.
1 parent 0284173 commit d84a850

2 files changed

Lines changed: 29 additions & 40 deletions

Sources/ConfigurationSystem/SystemPromptConfiguration.swift

Lines changed: 25 additions & 36 deletions
@@ -300,10 +300,13 @@ public struct SystemPromptConfiguration: Codable, Identifiable, Hashable, Sendab
 
 
 **For current information, MUST use tools:**
+- Internet research, web search, online queries → Use web_operations
+- Product listings, prices, availability → Use web_operations (try serpapi if available, otherwise web_search/research)
 - Fetch real, current information with available tools
 - Provide source links
 - Be transparent about live vs training data
-- DO NOT hallucinate news or current events
+- DO NOT hallucinate news, current events, or generate fake URLs/listings
+- Example, mock, sample, stub, or historical data is a failure condition for current/live information, do not use it unless the user specifically requests it
 """
 
 return context
@@ -357,6 +360,26 @@ public struct SystemPromptConfiguration: Codable, Identifiable, Hashable, Sendab
 **Tool Responsibility:**
 - Use tools repeatedly to gather context and complete work
 - Try alternative approaches when one fails
+- When uncertain, research using available tools rather than relying on internal knowledge
+- For user requests requiring current/live information, use appropriate tools (web_operations for internet research, file_operations for local files, etc.)
+
+**RESEARCH-FIRST PRINCIPLE:**
+For ANY request requiring data, information, or research:
+1. Use tools FIRST to gather current information
+2. Use ONLY the results from tools in your response
+3. Training data is a FAILURE condition for current/live information
+4. If tools fail, state "I was unable to find current information" - do NOT fall back to training data
+
+**Examples of research-requiring requests:**
+- "Find listings", "search for", "look up" → Use web_operations
+- "Current prices", "for sale", "available" → Use web_operations
+- "Latest news", "recent events" → Use web_operations
+- Product/service information → Use web_operations
+
+**Prohibited fallbacks:**
+- DO NOT generate fake URLs from memory
+- DO NOT synthesize from "prior context" or "general knowledge"
+- DO NOT provide training data disguised as current information
 """
 }
 
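The commit message describes the research-first text as "tools-guarded", meaning it is only emitted when tools are enabled. A minimal sketch of that guard follows. The function bodies, the `assemble(sections:)` helper, and the "Base instructions." placeholder are assumptions for illustration, not the repository's actual implementation.

```swift
import Foundation

/// Hypothetical stand-in for the tools-guarded builder named in the commit message.
/// Returns an empty string when tools are disabled, so the model is never told to
/// call tools it does not have.
func buildToolUsage(toolsEnabled: Bool) -> String {
    guard toolsEnabled else { return "" }
    return """
    **RESEARCH-FIRST PRINCIPLE:**
    1. Use tools FIRST to gather current information
    2. Use ONLY the results from tools in your response
    """
}

/// Hypothetical helper: joins the non-empty sections so that a disabled
/// section leaves no blank gap in the assembled prompt.
func assemble(sections: [String]) -> String {
    sections.filter { !$0.isEmpty }.joined(separator: "\n\n")
}

let withTools = assemble(sections: ["Base instructions.", buildToolUsage(toolsEnabled: true)])
let withoutTools = assemble(sections: ["Base instructions.", buildToolUsage(toolsEnabled: false)])
```

Under these assumptions, `withTools` carries the research-first section and `withoutTools` is just the base text.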
@@ -1173,48 +1196,14 @@ public class SystemPromptManager: ObservableObject {
 /// VS CODE COPILOT PATTERN: Use XML tags for ALL models
 /// Previously this was conditional on usesXMLTags, but VS Code applies universally
 
-/// Prepend current date for agent awareness (prevents defaulting to training cutoff date) Must be FIRST in system prompt for maximum visibility and KV cache consistency.
-let currentDateString = SystemPromptConfiguration.getCurrentDateString()
-let locationContext = SystemPromptConfiguration.getEffectiveLocationFromDefaults()
-var systemPrompt = """
-# CRITICAL CONTEXT - CURRENT DATE\(locationContext != nil ? " & LOCATION" : "")
-**TODAY'S DATE IS: \(currentDateString)**\(locationContext.map { "\n\nNote: User location available if needed: \($0)" } ?? "")
-
-You MUST use this date for all time-sensitive operations, current events, news searches, and date-based queries.
-Do NOT default to your training cutoff date (October 2023) or any other date.
-When users say "today", "this week", "recent", "current", or "latest", they mean relative to \(currentDateString).\(locationContext != nil ? "\n\nThe user's location is provided for context only. Use it ONLY when explicitly relevant to the request (weather, local recommendations, time zones). Do NOT mention location in general responses." : "")
-
-**CRITICAL: For current information, you MUST use tools:**
-
-**DO NOT:**
-- Generate fake/simulated current content from your training data
-- Hallucinate news stories or headlines
-- Provide outdated information as if it's current
-
-**ALWAYS:**
-- Use your available tools to fetch real, current information (use the current date as reference)
-- Provide source links for all current information
-- Be transparent about what information is live vs from your knowledge
-
-
-# CRITICAL - TOOL USAGE
-Provide general guidance on tool usage and validation. Refer to tool schemas when available and prefer natural-language descriptions when speaking to users.
-
-When planning multi-step work:
-- Provide a concise, human-readable plan when appropriate. The system orchestrator may parse plans and coordinate step-by-step execution.
-- Avoid interleaving large-scale execution with planning in the same message unless the user explicitly requested immediate execution.
-
-"""
-
 /// Get user-configured system prompt components (pass toolsEnabled, workflowModeEnabled, and dynamicIterationsEnabled).
 let componentPrompt = configuration?.generateSystemPrompt(toolsEnabled: toolsEnabled, workflowModeEnabled: workflowModeEnabled, dynamicIterationsEnabled: dynamicIterationsEnabled) ?? ""
 
 /// VS CODE COPILOT PATTERN: Use XML tags for ALL models (not just Claude)
 /// VS Code uses <instructions>, <toolUseInstructions>, etc. universally
 /// This provides consistent structure that all models can leverage
-systemPrompt = """
+let systemPrompt = """
 <instructions>
-\(systemPrompt)
 \(componentPrompt)
 </instructions>
 
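The duplication this hunk removes (date context appearing both in the prepended block and in the components) is the kind of regression a small check on the assembled prompt could catch. The sketch below is one possible approach, not code from this commit. The `# CURRENT DATE` marker and both sample prompts are hypothetical.

```swift
import Foundation

/// Counts non-overlapping occurrences of `marker` in `text`.
func occurrences(of marker: String, in text: String) -> Int {
    guard !marker.isEmpty else { return 0 }
    return text.components(separatedBy: marker).count - 1
}

/// Returns the markers that appear more than once in the assembled prompt.
func duplicatedMarkers(in prompt: String, markers: [String]) -> [String] {
    markers.filter { occurrences(of: $0, in: prompt) > 1 }
}

// Hypothetical section markers; the real prompt's headings may differ.
let markers = ["# CURRENT DATE", "**RESEARCH-FIRST PRINCIPLE:**"]

// Hypothetical prompt resembling the pre-fix state: date context appears twice.
let beforeFix = """
# CURRENT DATE
<instructions>
# CURRENT DATE
**RESEARCH-FIRST PRINCIPLE:**
</instructions>
"""

// Hypothetical prompt resembling the post-fix state: each section appears once.
let afterFix = """
<instructions>
# CURRENT DATE
**RESEARCH-FIRST PRINCIPLE:**
</instructions>
"""

let duplicatesBefore = duplicatedMarkers(in: beforeFix, markers: markers)
let duplicatesAfter = duplicatedMarkers(in: afterFix, markers: markers)
```

With these sample inputs, only the date marker is reported for `beforeFix`, and `afterFix` reports nothing.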
Sources/MCPFramework/TodoReminderInjector.swift

Lines changed: 4 additions & 4 deletions
@@ -111,13 +111,13 @@ public class TodoReminderInjector {
 CURRENT TASK IN PROGRESS - You must complete it now.
 
 WORKFLOW (follow exactly):
-1. DO THE ACTUAL WORK (tell the story, provide content to user, execute the task)
+1. DO THE ACTUAL WORK following your system prompt instructions (use tools as required, provide content to user)
 2. AFTER the work is done → call todo_operations to mark "completed"
 
 DO NOT mark it completed without doing the work first.
 DO NOT call todo_operations to mark in-progress again (it's already in-progress).
 
-PROVIDE YOUR WORK OUTPUT TO THE USER, THEN mark it completed.
+FOLLOW YOUR SYSTEM PROMPT while completing the work, THEN mark it completed.
 </todoList>
 """
 } else if needsInProgressMarking {
@@ -128,12 +128,12 @@ public class TodoReminderInjector {
 
 WORKFLOW (follow exactly):
 1. Mark the next task "in-progress": call todo_operations(operation: "update", todoUpdates: [{"id": <task_id>, "status": "in-progress"}])
-2. DO THE ACTUAL WORK (tell the story, provide content to user, execute the task)
+2. DO THE ACTUAL WORK following your system prompt instructions (use tools as required, provide content to user)
 3. Mark it "completed": call todo_operations(operation: "update", todoUpdates: [{"id": <task_id>, "status": "completed"}])
 
 CRITICAL: Steps 1-2-3 must happen across MULTIPLE iterations.
 DO NOT mark a task completed in the same iteration you marked it in-progress.
-DO THE WORK between marking in-progress and completed.
+FOLLOW YOUR SYSTEM PROMPT while doing the work between marking in-progress and completed.
 </todoList>
 """
 } else {
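
The lesson that runtime reminders must align with the static prompt could be enforced with a simple lint over the reminder text. The sketch below assumes a hand-maintained list of phrases known to steer agents away from tools. That list and the `conflicts(in:)` helper are illustrative and not part of this commit. The two sample strings are the old and new step text from the diff above.

```swift
import Foundation

/// Phrases that, per the commit message, agents read as "generate from memory".
/// This list is an assumption for illustration, not taken from the repository.
let conflictingPhrases = ["tell the story"]

/// Returns the conflicting phrases found in a reminder (case-insensitive match).
func conflicts(in reminder: String) -> [String] {
    let lowered = reminder.lowercased()
    return conflictingPhrases.filter { lowered.contains($0) }
}

// Step text before and after this commit, copied from the diff.
let oldStep = "1. DO THE ACTUAL WORK (tell the story, provide content to user, execute the task)"
let newStep = "1. DO THE ACTUAL WORK following your system prompt instructions (use tools as required, provide content to user)"

let oldConflicts = conflicts(in: oldStep)
let newConflicts = conflicts(in: newStep)
```

Under these assumptions the old wording is flagged and the new wording passes, which matches the user validation reported in the commit message.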
