feat: resolve exploration config from workspace and runtime overrides in status, node, TUI, and web paths

lhy0718 · lhy0718 · commit 75e709aa5164 · 2026-04-02T23:38:11.000+09:00
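
The precedence that `resolveExplorationConfig(...)` applies in this commit can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `SourceConfig`, `Defaults`, `resolveFlags`, and `sketchDefaults` are hypothetical, trimmed-down stand-ins for `AppConfig`, `ExplorationConfig`, and the real resolver, and the workspace override is passed in as an argument rather than read from `.autolabos/config.yaml`.

```typescript
// Hypothetical, reduced shapes; the real AppConfig and ExplorationConfig carry more fields.
interface SourceConfig {
  exploration?: { enabled?: boolean; figure_auditor?: { enabled?: boolean } };
  runtime?: { exploration_enabled?: boolean };
}

interface Defaults {
  enabled: boolean;
  figure_auditor: { enabled: boolean };
}

// Assumed helper name; mirrors the lookup order used in the diff below.
function resolveFlags(
  base: Defaults,
  appConfig?: SourceConfig | null,
  workspaceOverride?: SourceConfig | null
): Defaults {
  // Any truthy in-memory config is used instead of the workspace file, not merged with it.
  const source = appConfig || workspaceOverride || undefined;
  return {
    // runtime flag first, then the persisted exploration block, then the repo default
    enabled:
      source?.runtime?.exploration_enabled
      ?? source?.exploration?.enabled
      ?? base.enabled,
    figure_auditor: {
      enabled: source?.exploration?.figure_auditor?.enabled ?? base.figure_auditor.enabled
    }
  };
}

const sketchDefaults: Defaults = { enabled: false, figure_auditor: { enabled: true } };

// Workspace override alone enables exploration.
console.log(resolveFlags(sketchDefaults, null, { exploration: { enabled: true } }).enabled); // true

// An empty in-memory config is still truthy, so the workspace override is never consulted.
console.log(resolveFlags(sketchDefaults, {}, { exploration: { enabled: true } }).enabled); // false
```

The second call shows a consequence of the `appConfig || workspaceOverride` selection: callers that pass an in-memory config without exploration fields fall back to the repo default rather than to the workspace file. In this commit, `bootstrapAutoLabOSRuntime` populates `runtime.exploration_enabled`, which appears to be what keeps the TUI and web callers consistent with the persisted config.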
diff --git a/ISSUES.md b/ISSUES.md
@@ -30,9 +30,9 @@ Usage rules:
 ## Active live validation issues
 
 ### LV-084 — `/explore` and `/api/exploration/status` ignore persisted exploration artifacts and always report the global disabled contract
-- Status: OPEN
+- Status: FIXED
 - Validation target: real `test/.live` TUI `/explore` output and real web `/api/exploration/status` / bootstrap state for a run that already has `experiment_tree/tree.json`, `manager_state.json`, `baseline_lock.json`, and `figure_audit/figure_audit_summary.json`
-- Environment/session context: repo head on 2026-04-02, live fixture workspace `/home/hanyong/AutoLabOS/test/.live/autolabos-live-explore-uhei2J`, run `run-explore-live`, launched with real `node /home/hanyong/AutoLabOS/dist/cli/main.js` and `node /home/hanyong/AutoLabOS/dist/cli/main.js web --host 127.0.0.1 --port 4318`
+- Environment/session context: repo head on 2026-04-02. Original failing repro used `/home/hanyong/AutoLabOS/test/.live/autolabos-live-explore-uhei2J`. Post-fix revalidation used `/home/hanyong/AutoLabOS/test/.live/autolabos-live-explore-enabled-dNua2B`, run `run-explore-live`, launched with real `node /home/hanyong/AutoLabOS/dist/cli/main.js` and `node /home/hanyong/AutoLabOS/dist/cli/main.js web --host 127.0.0.1 --port 4318`.
 - Reproduction steps:
   1. Create a real `test/.live` workspace containing a paused review run plus persisted exploration artifacts under `.autolabos/runs/run-explore-live/experiment_tree/` and `figure_audit/`.
   2. Launch a fresh TUI rooted at that workspace and run `/explore`.
@@ -44,20 +44,33 @@ Usage rules:
   - `GET /api/exploration/status?run_id=run-explore-live` returned `{"enabled":false,...,"baseline_lock_status":"not_applicable"}`.
   - `GET /api/bootstrap` still anchored to the correct active run and showed the run graph paused at `review`, so the disabled exploration result was not caused by selecting the wrong run.
 - Fresh vs existing session comparison:
-  - Fresh session: the first real TUI process showed the disabled exploration contract.
-  - Existing/reopened session: reopening the same persisted workspace in a second TUI process produced the same disabled exploration contract.
-  - Divergence: none observed; the behavior appears stable across fresh and reopened sessions.
+  - Fresh session before fix: the first real TUI process showed the disabled exploration contract.
+  - Existing session before fix: reopening the same persisted workspace in a second TUI process produced the same disabled exploration contract.
+  - Fresh session after fix: `/explore` showed `Enabled: true`, `Current Stage: main_agenda`, `Nodes: 2 explored / 1 promoted / 1 blocked`, `Baseline Lock: locked`, and `Fig Audit Warns: 1 (1 severe flag)`.
+  - Existing session after fix: reopening the same persisted workspace in a second TUI process produced the same enabled exploration snapshot.
+  - Divergence: none observed after the fix; fresh and reopened sessions now agree.
 - Root cause hypothesis:
   - Type: `in_memory_projection_bug`
-  - Hypothesis: `src/core/exploration/status.ts` short-circuits on `loadExplorationConfig().enabled === false`, and `loadExplorationConfig()` currently reads only the repo-default YAML (`src/config/exploration.default.yaml`) instead of any run/workspace/runtime seam. That prevents the live status surfaces from reading persisted exploration artifacts even when they exist.
-- Code/test changes: none yet; this entry records the first real live-validation reproduction after the exploration status surfaces landed.
+  - Hypothesis: `src/core/exploration/status.ts` short-circuited on the repo-default `loadExplorationConfig().enabled === false` path, so the live TUI/web surfaces ignored workspace config and runtime config even when persisted `experiment_tree/` artifacts existed.
+- Code/test changes:
+  - Added `resolveExplorationConfig(...)` in `src/core/exploration/explorationConfig.ts` so exploration enablement resolves from workspace `.autolabos/config.yaml` and in-memory runtime config instead of the repo default only.
+  - Updated `src/core/exploration/status.ts`, `src/core/nodes/designExperiments.ts`, `src/core/nodes/analyzeResults.ts`, `src/core/nodes/figureAudit.ts`, `src/tui/TerminalApp.ts`, `src/interaction/InteractionSession.ts`, `src/web/server.ts`, `src/runtime/createRuntime.ts`, and `src/types.ts` to use the same seam.
+  - Updated regressions in `tests/explorationConfig.test.ts`, `tests/explorationStatus.test.ts`, `tests/figureAuditNode.test.ts`, and `tests/newSlashCommands.test.ts`.
 - Regression status:
-  - Automated regression coverage exists for the enabled path via mocked config in `tests/explorationStatus.test.ts`, `tests/newSlashCommands.test.ts`, and `web/src/App.test.tsx`.
-  - Real live revalidation result: still reproduces.
+  - Automated regression coverage:
+    - `npm run build`
+    - `npm test`
+    - `npm run validate:harness`
+    - `npm run test:web`
+  - Real live revalidation result:
+    - Fresh TUI `/doctor` still surfaces the expected fixture-scope missing artifacts.
+    - Fresh TUI `/explore` now reports the enabled exploration snapshot from persisted artifacts.
+    - Reopened TUI `/explore` reports the same enabled snapshot.
+    - `GET /api/exploration/status?run_id=run-explore-live` now returns `{"enabled":true,"current_stage":"main_agenda",...}`.
+    - `GET /api/bootstrap` remains anchored to the same active run.
 - Follow-up risks:
-  - Operators can be misled into thinking exploration never ran, even when `experiment_tree/` and `figure_audit/` artifacts are present.
-  - The current tests only verify the enabled path through explicit config mocking, so this runtime configuration seam can drift unnoticed.
-  - Direct browser rendering of the web card was not rechecked because the Playwright navigation approval was rejected during this validation loop; the live API behavior was verified instead.
+  - Direct browser rendering of the web card was not rechecked in this loop; the real web API contract was verified instead.
+  - The live fixture still lacks several full-run artifacts, so `/doctor` continues to show expected fixture-scope missing-artifact findings unrelated to the exploration projection fix.
 
 ---
 
diff --git a/src/core/exploration/explorationConfig.ts b/src/core/exploration/explorationConfig.ts
@@ -1,8 +1,10 @@
-import { readFileSync } from "node:fs";
+import { existsSync, readFileSync } from "node:fs";
 import path from "node:path";
 import { fileURLToPath } from "node:url";
 
 import YAML from "yaml";
+import type { AppConfig } from "../../types.js";
+import { resolveAppPaths } from "../../config.js";
 
 export interface ExplorationStageBudgetConfig {
   max_nodes: number;
@@ -73,3 +75,45 @@ export function loadExplorationConfig(configPath?: string): ExplorationConfig {
   const raw = readFileSync(resolvedPath, "utf8");
   return YAML.parse(raw) as ExplorationConfig;
 }
+
+function loadWorkspaceConfigOverride(workspaceRoot: string): Partial<AppConfig> | null {
+  const configPath = resolveAppPaths(workspaceRoot).configFile;
+  if (!existsSync(configPath)) {
+    return null;
+  }
+  try {
+    return YAML.parse(readFileSync(configPath, "utf8")) as Partial<AppConfig>;
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    throw new Error(`Failed to parse workspace config at ${configPath}: ${message}`);
+  }
+}
+
+export function resolveExplorationConfig(options?: {
+  workspaceRoot?: string;
+  appConfig?: Partial<AppConfig> | null;
+}): ExplorationConfig {
+  const base = loadExplorationConfig();
+  const workspaceOverride =
+    options?.appConfig || !options?.workspaceRoot
+      ? null
+      : loadWorkspaceConfigOverride(options.workspaceRoot);
+  const source = options?.appConfig || workspaceOverride;
+  const enabled =
+    source?.runtime?.exploration_enabled
+    ?? source?.exploration?.enabled
+    ?? base.enabled;
+  const figureAuditorEnabled =
+    source?.exploration?.figure_auditor?.enabled
+    ?? base.figure_auditor.enabled;
+
+  return {
+    ...base,
+    enabled,
+    figure_auditor: {
+      ...base.figure_auditor,
+      ...(source?.exploration?.figure_auditor || {}),
+      enabled: figureAuditorEnabled
+    }
+  };
+}
diff --git a/src/core/exploration/status.ts b/src/core/exploration/status.ts
@@ -2,9 +2,10 @@ import path from "node:path";
 import { promises as fs } from "node:fs";
 
 import { loadBaselineLock } from "./baselineLock.js";
-import { loadExplorationConfig } from "./explorationConfig.js";
+import { resolveExplorationConfig } from "./explorationConfig.js";
 import { loadResearchTree, type ResearchTree } from "./researchTree.js";
 import type { ExplorationStage, FigureAuditSummary, ResearchTreeNode } from "./types.js";
+import type { AppConfig } from "../../types.js";
 
 export interface ExplorationStatusSnapshot {
   enabled: boolean;
@@ -139,8 +140,12 @@ function evidenceCompletenessForNode(node: ResearchTreeNode): number {
 export async function buildExplorationStatusSnapshot(options: {
   workspaceRoot: string;
   runId?: string | null;
+  appConfig?: Partial<AppConfig> | null;
 }): Promise<ExplorationStatusSnapshot> {
-  const config = loadExplorationConfig();
+  const config = resolveExplorationConfig({
+    workspaceRoot: options.workspaceRoot,
+    appConfig: options.appConfig
+  });
   if (!config.enabled) {
     return disabledExplorationStatusSnapshot();
   }
diff --git a/src/core/nodes/analyzeResults.ts b/src/core/nodes/analyzeResults.ts
@@ -77,7 +77,7 @@ import {
   lintFigures,
   type FigureAuditInput
 } from "../analysis/figureAuditor.js";
-import { loadExplorationConfig } from "../exploration/explorationConfig.js";
+import { resolveExplorationConfig } from "../exploration/explorationConfig.js";
 import { loadResearchTree } from "../exploration/researchTree.js";
 import { buildWriteupInputManifest } from "../exploration/evidenceSerializer.js";
 
@@ -401,7 +401,10 @@ export function createAnalyzeResultsNode(deps: NodeExecutionDeps): GraphNodeHand
         `${JSON.stringify(riskSignals, null, 2)}\n`
       );
       const resultAnalysisPath = await writeRunArtifact(run, "result_analysis.json", JSON.stringify(summary, null, 2));
-      const explorationFigureConfig = loadExplorationConfig().figure_auditor;
+      const explorationFigureConfig = resolveExplorationConfig({
+        workspaceRoot: process.cwd(),
+        appConfig: deps.config
+      }).figure_auditor;
       if (explorationFigureConfig.enabled) {
         const figureAuditInput = await buildFigureAuditInput({
           runDir: path.join(process.cwd(), ".autolabos", "runs", run.id),
@@ -608,14 +611,10 @@ export function createAnalyzeResultsNode(deps: NodeExecutionDeps): GraphNodeHand
         "run_completeness_checklist.json",
         `${JSON.stringify(completenessChecklist, null, 2)}\n`
       );
-      const runtimeExplorationEnabled = (deps.config as any)?.runtime?.exploration_enabled;
-      const explorationConfig =
-        typeof runtimeExplorationEnabled === "boolean"
-          ? {
-              ...loadExplorationConfig(),
-              enabled: runtimeExplorationEnabled
-            }
-          : loadExplorationConfig();
+      const explorationConfig = resolveExplorationConfig({
+        workspaceRoot: process.cwd(),
+        appConfig: deps.config
+      });
       if (explorationConfig.enabled) {
         const runDir = path.join(process.cwd(), ".autolabos", "runs", run.id);
         const tree = loadResearchTree(runDir);
diff --git a/src/core/nodes/designExperiments.ts b/src/core/nodes/designExperiments.ts
@@ -32,7 +32,7 @@ import { parseMarkdownRunBriefSections } from "../runs/runBriefParser.js";
 import type { MarkdownRunBriefSections } from "../runs/runBriefParser.js";
 import { BriefCompletenessArtifact, buildBriefCompletenessArtifact } from "../runs/researchBriefFiles.js";
 import { buildWorkspaceRunRoot } from "../runs/runPaths.js";
-import { loadExplorationConfig } from "../exploration/explorationConfig.js";
+import { resolveExplorationConfig } from "../exploration/explorationConfig.js";
 import { ExplorationManager } from "../exploration/explorationManager.js";
 
 interface FilteredHypothesis {
@@ -121,7 +121,10 @@ export function createDesignExperimentsNode(deps: NodeExecutionDeps): GraphNodeH
         });
       };
 
-      const explorationConfig = loadExplorationConfig();
+      const explorationConfig = resolveExplorationConfig({
+        workspaceRoot: process.cwd(),
+        appConfig: deps.config
+      });
       if (explorationConfig.enabled) {
         const explorationManager = new ExplorationManager(
           run.id,
diff --git a/src/core/nodes/figureAudit.ts b/src/core/nodes/figureAudit.ts
@@ -9,7 +9,7 @@ import {
   lintFigures,
   type FigureAuditInput
 } from "../analysis/figureAuditor.js";
-import { loadExplorationConfig } from "../exploration/explorationConfig.js";
+import { resolveExplorationConfig } from "../exploration/explorationConfig.js";
 import type { FigureAuditIssue, FigureAuditSummary } from "../exploration/types.js";
 import { buildRunCompletenessChecklist } from "../runs/runCompletenessChecklist.js";
 import { buildRunOperatorStatus } from "../runs/runStatus.js";
@@ -23,7 +23,10 @@ export function createFigureAuditNode(deps: NodeExecutionDeps): GraphNodeHandler
     id: "figure_audit",
     async execute({ run }) {
       const runDir = path.join(process.cwd(), ".autolabos", "runs", run.id);
-      const config = loadExplorationConfig();
+      const config = resolveExplorationConfig({
+        workspaceRoot: process.cwd(),
+        appConfig: deps.config
+      });
       const input = await buildFigureAuditInput(runDir);
 
       let gate1Gate2Issues: FigureAuditIssue[] = [];
diff --git a/src/interaction/InteractionSession.ts b/src/interaction/InteractionSession.ts
@@ -1339,7 +1339,8 @@ export class InteractionSession {
     const targetRunId = this.activeRunId || runs[0]?.id || null;
     const snapshot = await buildExplorationStatusSnapshot({
       workspaceRoot: this.workspaceRoot,
-      runId: targetRunId
+      runId: targetRunId,
+      appConfig: this.config
     });
     for (const line of formatExplorationStatusLines(snapshot)) {
       this.pushLog(line);
diff --git a/src/runtime/createRuntime.ts b/src/runtime/createRuntime.ts
@@ -33,6 +33,7 @@ import { DEFAULT_OLLAMA_BASE_URL } from "../integrations/ollama/modelCatalog.js"
 import { recoverCollectEnrichmentJobs } from "../core/nodes/collectPapers.js";
 import { detectExecutionProfile } from "./executionProfile.js";
 import { resolveNodeOptionsForPackage } from "../core/stateGraph/defaults.js";
+import { loadExplorationConfig } from "../core/exploration/explorationConfig.js";
 
 export interface AutoLabOSRuntime {
   paths: AppPaths;
@@ -84,6 +85,7 @@ export async function bootstrapAutoLabOSRuntime(opts?: {
     runtime: {
       ...(config.runtime || {}),
       execution_profile: executionProfile,
+      exploration_enabled: config.exploration?.enabled ?? loadExplorationConfig().enabled,
       node_option_package: opts?.nodeOptionPackageName,
       resolved_node_options: resolveNodeOptionsForPackage(opts?.nodeOptionPackageName)
     }
diff --git a/src/tui/TerminalApp.ts b/src/tui/TerminalApp.ts
@@ -2272,7 +2272,8 @@ export class TerminalApp {
     const targetRunId = this.activeRunId || runs[0]?.id || null;
     const snapshot = await buildExplorationStatusSnapshot({
       workspaceRoot: process.cwd(),
-      runId: targetRunId
+      runId: targetRunId,
+      appConfig: this.config
     });
     this.clearTransientLogs();
     for (const line of formatExplorationStatusLines(snapshot)) {
diff --git a/src/types.ts b/src/types.ts
@@ -312,11 +312,21 @@ export interface AppConfig {
     runs_dir: string;
     logs_dir: string;
   };
+  exploration?: {
+    enabled?: boolean;
+    figure_auditor?: {
+      enabled?: boolean;
+      block_on_severe_mismatch?: boolean;
+      require_caption_alignment?: boolean;
+      require_reference_alignment?: boolean;
+    };
+  };
   /** Runtime-only environment detection. This is attached in memory and stripped before persisting config.yaml. */
   runtime?: {
     execution_profile?: ExecutionProfile;
     node_option_package?: NodeOptionPackageName;
     resolved_node_options?: NodeOptions;
+    exploration_enabled?: boolean;
   };
 }
 
diff --git a/src/web/server.ts b/src/web/server.ts
@@ -221,7 +221,8 @@ class AutoLabOSWebController {
           || (this.runtime ? (await this.runtime.runStore.listRuns())[0]?.id : undefined);
         const payload: ExplorationStatusResponse = await buildExplorationStatusSnapshot({
           workspaceRoot: this.cwd,
-          runId: fallbackRunId
+          runId: fallbackRunId,
+          appConfig: this.runtime?.config
         });
         return jsonResponse(res, 200, payload);
       }
diff --git a/tests/explorationConfig.test.ts b/tests/explorationConfig.test.ts
@@ -1,7 +1,11 @@
+import path from "node:path";
+import { tmpdir } from "node:os";
+import { mkdtemp, mkdir, writeFile } from "node:fs/promises";
+
 import { describe, expect, it } from "vitest";
 
 import type { ExplorationStage } from "../src/core/exploration/types.js";
-import { loadExplorationConfig } from "../src/core/exploration/explorationConfig.js";
+import { loadExplorationConfig, resolveExplorationConfig } from "../src/core/exploration/explorationConfig.js";
 
 describe("explorationConfig", () => {
   it("loads the default exploration config", () => {
@@ -33,4 +37,25 @@ describe("explorationConfig", () => {
       "ablation"
     ]);
   });
+
+  it("reads workspace exploration overrides from .autolabos/config.yaml", async () => {
+    const root = await mkdtemp(path.join(tmpdir(), "autolabos-exploration-config-"));
+    await mkdir(path.join(root, ".autolabos"), { recursive: true });
+    await writeFile(
+      path.join(root, ".autolabos", "config.yaml"),
+      [
+        "exploration:",
+        "  enabled: true",
+        "  figure_auditor:",
+        "    enabled: false",
+        ""
+      ].join("\n"),
+      "utf8"
+    );
+
+    const config = resolveExplorationConfig({ workspaceRoot: root });
+
+    expect(config.enabled).toBe(true);
+    expect(config.figure_auditor.enabled).toBe(false);
+  });
 });
diff --git a/tests/explorationStatus.test.ts b/tests/explorationStatus.test.ts
@@ -89,17 +89,17 @@ describe("exploration status snapshot", () => {
   });
 
   it("builds counts, baseline lock, evidence completeness, and figure audit status from persisted artifacts", async () => {
-    const baseConfig = loadExplorationConfig();
-    vi.spyOn(explorationConfigModule, "loadExplorationConfig").mockReturnValue({
-      ...baseConfig,
-      enabled: true
-    });
-
     const root = await mkdtemp(path.join(tmpdir(), "autolabos-exploration-status-"));
     const runId = "run-exploration-status";
     const runDir = path.join(root, ".autolabos", "runs", runId);
     await mkdir(path.join(runDir, "experiment_tree"), { recursive: true });
     await mkdir(path.join(runDir, "figure_audit"), { recursive: true });
+    await mkdir(path.join(root, ".autolabos"), { recursive: true });
+    await writeFile(
+      path.join(root, ".autolabos", "config.yaml"),
+      "exploration:\n  enabled: true\n",
+      "utf8"
+    );
 
     const tree = addNode(
       addNode(initResearchTree(runId, runDir), makeTreeNode()),
diff --git a/tests/figureAuditNode.test.ts b/tests/figureAuditNode.test.ts
@@ -284,7 +284,13 @@ describe("figure_audit node integration", () => {
     await writeFile(path.join(runDir, "memory", "run_context.json"), JSON.stringify({ version: 1, items: [] }), "utf8");
 
     const node = createFigureAuditNode({
-      config: {} as any,
+      config: {
+        exploration: {
+          figure_auditor: {
+            enabled: false
+          }
+        }
+      } as any,
       runStore: {} as any,
       eventStream: new InMemoryEventStream(),
       llm: new MockLLMClient(),
diff --git a/tests/newSlashCommands.test.ts b/tests/newSlashCommands.test.ts