Motivation
This refactor introduces a graph-based structure to the existing imperative kaggle() workflow, allowing:
Modular execution of each agent and integration step.
State tracking for each stage (via LangGraph's shared state system).
Future extensibility, including:
Plug-and-play memory retrieval (fast/slow memory).
Multi-turn Kaggle resolvers (this single-turn flow would serve as a subgraph).
Streaming intermediate results to a frontend or CLI interface.
Human-in-the-loop support.
LangGraph’s compositional structure allows better management of agent workflows as dynamic graphs, which aligns with long-term goals like auto-correction loops and human-agent collaboration.
Flow
┌───> 0. Init & shared state
│
├───> 1. Resume-decision (user)
│ │
│ └♦ cleans cache if resuming
│
├───> 2. Pick-competition & Download-dataset (Kaggle API)
│
├───> 3. Fetch-overview & Summarise (Kaggle API → Summary agent)
│
├───> 4. Advisor high-level report (Advisor agent)
│
├───> 5. Generate dev-plan (Planner agent)
│
└───> 6. Coding-execution loop
├─ 6a. Code task (Code agent)
├─ 6b. Debug-decision (user flag + code_report)
└─ 6c. Debug loop (Debug agent) ─► back to 6a if needed
flowchart TD
%% ===== Main linear flow =====
INIT[0 Init / shared state]
RESUME["1 Resume-Decision (User)"]
PICK[2 Select-Competition & Download]
OVERVIEW[3 Fetch-Overview & Summarise]
ADVISE[4 Advisor Report]
PLAN[5 Generate Dev-Plan]
CODELOOP["6 Coding-Execution (Sub-graph)"]
INIT --> RESUME --> PICK --> OVERVIEW --> ADVISE --> PLAN --> CODELOOP
%% ===== Coding / Debugging sub-graph =====
subgraph CODELOOP [6 Coding-Execution loop]
direction LR
START_T[Start next task]
CODE[6a. Code Task]
DECIDE{{6b. Need Debug?}}
DEBUG[6c. Debug Loop]
END_T[Task done]
START_T --> CODE
CODE --> DECIDE
DECIDE -- "auto-mode & debug=true" --> DEBUG
DECIDE -- "otherwise" --> END_T
DEBUG -- "fixed / success" --> END_T
DEBUG -- "fail → recode" --> CODE
END_T -->|next task| START_T
end
Node catalogue
| # | Node name | Category | Tool / Agent invoked | Key inputs | Outputs stored in state | Fan-out / Routing |
|---|-----------|----------|----------------------|------------|-------------------------|-------------------|
| 0 | Init | utility | load_model, WorkflowCache, KaggleIntegration, Console | work_dir, model? | model, cache, integration | always → 1 |
| 1 | ResumeDecision | task | questionary.text | cache | resume_step (int/None) | → 2 after optional cache-trim |
| 2 | SelectCompetition | tool | integration.list_competition, questionary.select, integration.download_competition_dataset | resume_step, cache | competition, dataset_path | → 3 |
| 3 | OverviewSummary | tool | integration.fetch_competition_overview, GitHubSummaryAgent.kaggle_request_summarize | competition, dataset_path | ml_requirement | → 4 |
| 4 | AdvisorReport | tool | AdviseAgent.interact | ml_requirement, dataset_path | advisor_report | → 5 |
| 5 | PlanGeneration | tool | PlanAgent.interact | advisor_report | coding_plan (list of tasks) | → 6 |
| 6 | CodingOrchestrator | task (sub-graph) | – orchestrates 6a-6c | coding_plan, advisor_report | final_code_reports | → END |
| 6a | CodeTask | tool | CodeAgent.interact or .debug(...) | current_task, advisor_report | code_report | → 6b |
| 6b | DebugDecision | router | questionary.confirm (asked once) & value of code_report.debug | code_report, is_auto_mode | boolean | if need-debug → 6c else next task / END |
| 6c | DebugLoop | tool | DebugAgent.analyze → CodeAgent.debug | code_report | updated code_report | success → next task, fail → back to 6a |
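Each catalogue row can be read as a function that takes the shared state, reads its "Key inputs", and returns only the keys it writes. The sketch below illustrates this for row 5 (PlanGeneration). It is a minimal illustration, not the PR's implementation: `make_plan_generation_node`, `apply_update`, and `fake_planner` are hypothetical names, and the real `PlanAgent.interact` signature is assumed rather than known.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def make_plan_generation_node(
    interact: Callable[[str], Dict[str, Any]],
) -> Callable[[State], State]:
    """Wrap a planner callable as a node (hypothetical helper).

    The node reads `advisor_report` and returns a partial update
    containing only `coding_plan`, mirroring catalogue row 5.
    """

    def node(state: State) -> State:
        plan = interact(state["advisor_report"])
        return {"coding_plan": plan}

    return node


def apply_update(state: State, update: State) -> State:
    """Merge a node's partial update into a copy of the shared state."""
    merged = dict(state)
    merged.update(update)
    return merged


def fake_planner(report: str) -> Dict[str, Any]:
    """Stand-in for PlanAgent.interact; its real signature is an assumption."""
    return {"tasks": ["task derived from: " + report]}


plan_node = make_plan_generation_node(fake_planner)
state_before = {"advisor_report": "use gradient boosting"}
state_after = apply_update(state_before, plan_node(state_before))
```

Returning a partial update rather than mutating the state keeps each node's outputs explicit, which matches the "Outputs stored in state" column.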
Edges & guards
Sequential edges: 0→1→2→3→4→5→6.
Conditional edges inside 6 (coding loop):
DebugDecision routes either to DebugLoop (6c) or to next-task/finish based on:
is_auto_mode == True AND code_report.debug == true → 6c
else → next task.
DebugLoop re-enters CodeTask (6a) with the updated code_report if the fix fails; on success it proceeds to the next task.
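The DebugDecision guard can be sketched as a plain routing function that returns the name of the next node. This is a sketch under assumptions: the labels `"debug_loop"` and `"next_task"` and the helper `needs_debug` are hypothetical, and since the type of `code_report.debug` is not stated here, the helper accepts either a boolean or the string `"true"`.

```python
from typing import Any, Dict


def needs_debug(code_report: Dict[str, Any]) -> bool:
    """Interpret code_report["debug"], which may be a bool or a string (assumption)."""
    flag = code_report.get("debug", False)
    if isinstance(flag, str):
        return flag.strip().lower() == "true"
    return bool(flag)


def route_debug_decision(state: Dict[str, Any]) -> str:
    """Return the next node label for 6b (labels are hypothetical).

    Routes to the debug loop only when auto mode is on AND the code
    report asks for debugging; every other case moves to the next task.
    """
    report = state.get("code_report") or {}
    if state.get("is_auto_mode") and needs_debug(report):
        return "debug_loop"
    return "next_task"


auto_and_failing = route_debug_decision(
    {"is_auto_mode": True, "code_report": {"debug": "true"}}
)
manual_and_failing = route_debug_decision(
    {"is_auto_mode": False, "code_report": {"debug": True}}
)
```

A function of this shape, taking the state and returning a label, is the kind of callable that LangGraph's `add_conditional_edges` accepts for routing, so the guard could be wired in without changing its logic.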
Shared state schema
{
"model" : …,
"cache" : WorkflowCache,
"integration" : KaggleIntegration,
"competition" : str,
"dataset_path" : str,
"ml_requirement" : str,
"advisor_report" : str,
"coding_plan" : { "tasks" : [ … ] },
"is_auto_mode" : bool,
"code_report" : dict, # transient per iteration
"final_code_reports" : list # aggregated after loop
}
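The schema above could be expressed as a `TypedDict`, which is one common way to declare state for a graph workflow. The sketch below is an assumption about how that might look, not the PR's code: `KaggleState` and `CodingPlan` are hypothetical names, and `Any` stands in for the `model`, `WorkflowCache`, and `KaggleIntegration` types, which are not defined here.

```python
from typing import Any, Dict, List, TypedDict


class CodingPlan(TypedDict):
    """Shape of `coding_plan`; the task element type is left open."""
    tasks: List[Any]


class KaggleState(TypedDict, total=False):
    """Shared state (hypothetical name); total=False lets nodes fill keys in over time."""
    model: Any                               # loaded model, type elided in the schema
    cache: Any                               # WorkflowCache instance
    integration: Any                         # KaggleIntegration instance
    competition: str
    dataset_path: str
    ml_requirement: str
    advisor_report: str
    coding_plan: CodingPlan
    is_auto_mode: bool
    code_report: Dict[str, Any]              # transient per iteration
    final_code_reports: List[Dict[str, Any]]  # aggregated after loop


# A state at the start of a run only needs the keys known so far.
initial_state: KaggleState = {"is_auto_mode": False, "final_code_reports": []}
```

Declaring the keys as optional reflects the flow: early nodes such as Init populate `model`, `cache`, and `integration`, while later keys like `coding_plan` appear only after their node has run.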