ResearchKit is layered on top of Overleaf Community Edition. It treats Overleaf as the IDE and the LaTeX paper project as the codebase. The Overleaf module owns UI integration and project-file capture, while the Python service owns agent execution, memory, provider configuration, and persistence. A separate runner service executes shell commands against an overlay workspace and returns structured diffs.
Overview image of the current ResearchKit architecture. Overleaf is the IDE surface, and the LaTeX paper workspace is the codebase the agents operate on.
```
Browser
-> Overleaf web app
-> ResearchKit React panel and rail entry
-> Express proxy routes in services/web/modules/researchkit
-> ResearchKit FastAPI service
-> MainAgent
-> ResearchAgent / FigureAgent / ReviewAgent
-> MemoryManager
-> provider registry
-> MongoDB
-> ResearchKit runner
```
- The frontend lives in `services/web/modules/researchkit/frontend/`.
- The proxy/controller layer lives in `services/web/modules/researchkit/app/src/`.
- Conceptually, this layer turns Overleaf into the IDE surface for ResearchKit.
- Before sending chat or indexing requests, the controller flushes project documents and builds a file snapshot from Overleaf's internal document storage.
- The controller can override the active file content with unsaved in-editor text so the backend sees the latest user state.
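The snapshot-plus-override step can be sketched as follows. This is an illustrative helper, not the actual controller code; the function name and the `{path: content}` snapshot shape are assumptions.

```python
def build_snapshot(project_docs, active_path=None, unsaved_text=None):
    """Build a {path: content} snapshot of the project files.

    If the editor holds unsaved text for the active file, that text
    overrides the stored copy so the backend sees the latest user state.
    (Hypothetical helper; names and shapes are illustrative.)
    """
    snapshot = {doc["path"]: doc["content"] for doc in project_docs}
    if active_path is not None and unsaved_text is not None:
        snapshot[active_path] = unsaved_text
    return snapshot
```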
- ResearchKit treats the active paper like a codebase rather than a single document.
- `main.tex`, included section files, `.bib` files, and figure assets form the working project tree the agents reason over.
- This framing drives the tool model:
  - editor context is captured like IDE state
  - file operations are explicit and patch-oriented
  - workspace inspection and command execution are treated similarly to coding-agent workflows
- Entry point: `services/researchkit/researchkit/main.py`
- Routes: `services/researchkit/researchkit/api/routes.py`
- Schemas: `services/researchkit/researchkit/api/models.py`
- The service exposes health, chat, memory, conversation, config, config-test, and model-discovery endpoints under `/api`.
- Chat responses are streamed back as SSE events.
- `MainAgent` is the primary orchestrator for editor-facing requests.
- It loads project memory into the system prompt, appends workspace context, restores the scoped conversation, and runs a tool loop through the configured provider.
- It reasons about the paper workspace the way a coding agent reasons about a project repository.
- Tool surface:
  - `str_replace_editor` for viewing and editing workspace files
  - `bash` for execution-oriented commands only
  - `delegate_to_subagent` for specialized workflows
- Important constraint: file inspection and file mutation are expected to go through `str_replace_editor`, not `bash`.
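The tool loop above can be reduced to a minimal sketch. The `provider.step` interface and the step/result shapes below are assumptions for illustration, not the actual provider protocol:

```python
def run_tool_loop(provider, messages, tools, max_steps=8):
    """Minimal agent tool loop: ask the provider for the next step,
    dispatch the named tool, append the result, stop on a final answer.
    `provider.step` returning either {"tool", "args"} or {"final"} is
    an illustrative contract, not the real one.
    """
    for _ in range(max_steps):
        step = provider.step(messages)
        if "final" in step:
            return step["final"]
        result = tools[step["tool"]](**step["args"])
        messages.append({"role": "tool", "name": step["tool"], "content": result})
    raise RuntimeError("tool loop exceeded max_steps")
```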
- `research` is implemented and supports literature search, citation verification, BibTeX generation, and read-only workspace inspection.
- `figure` exists as a placeholder.
- `review` exists as a placeholder.
- `MemoryManager` builds and retrieves `PaperMemory`.
- Current extraction sources:
  - document structure from LaTeX sections
  - venue hints from `\documentclass`
  - abstract extraction
  - citations from `.bib` files
- Memory is rebuilt when the project content hash changes.
- If no abstract is present, summary generation falls back to the configured LLM provider.
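The staleness check can be sketched with a content hash over the file snapshot. The exact hashing scheme is an assumption; any stable, order-independent digest of paths and contents works:

```python
import hashlib

def project_hash(snapshot):
    """Stable digest of a {path: content} snapshot, independent of dict order."""
    h = hashlib.sha256()
    for path in sorted(snapshot):
        h.update(path.encode())
        h.update(b"\0")
        h.update(snapshot[path].encode())
        h.update(b"\0")
    return h.hexdigest()

def memory_is_stale(stored_hash, snapshot):
    """Memory is rebuilt only when the stored hash no longer matches."""
    return stored_hash != project_hash(snapshot)
```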
- `ProviderConfig` is assembled from three levels:
  - environment defaults
  - per-project MongoDB overrides
  - request-level overrides
- Supported provider types:
  - `openai`
  - `anthropic`
  - `custom` for OpenAI-compatible endpoints
- Project config supports saved encrypted API keys, model selection, runner settings, and workspace configuration.
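The three-level assembly amounts to a layered merge where later layers win. A minimal sketch, assuming unset keys are represented as `None` (the real merge rules may differ):

```python
def resolve_provider_config(env_defaults, project_overrides=None, request_overrides=None):
    """Merge config layers: env defaults < per-project MongoDB doc < request body.
    Missing layers are skipped; keys explicitly set to None in a layer are
    treated as 'unset' and do not clobber lower layers. Illustrative only.
    """
    config = dict(env_defaults)
    for layer in (project_overrides, request_overrides):
        if layer:
            config.update({k: v for k, v in layer.items() if v is not None})
    return config
```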
- Entry point: `services/researchkit/researchkit/runner/main.py`
- The runner creates a temporary workspace, optionally copies a mounted baseline workspace, overlays request files, executes the command, snapshots before/after state, and returns changed files.
- This lets the Main Agent run command-based workflows while still returning explicit patchable file changes to the UI.
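The overlay-and-diff cycle can be sketched like this. For testability the sketch takes `command` as a callable acting on the workspace path, where the real runner shells out; all names are illustrative:

```python
import shutil
import tempfile
from pathlib import Path

def run_in_overlay(baseline_dir, request_files, command):
    """Copy the baseline workspace (if any) into a temp dir, overlay the
    request files, run the command there, and return only the files whose
    content changed. Sketch of the runner's overlay model, not its code.
    """
    workspace = Path(tempfile.mkdtemp(prefix="rk-runner-"))
    try:
        if baseline_dir:
            shutil.copytree(baseline_dir, workspace, dirs_exist_ok=True)
        for rel_path, content in request_files.items():
            dest = workspace / rel_path
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_text(content)
        snap = lambda: {p.relative_to(workspace).as_posix(): p.read_text()
                        for p in workspace.rglob("*") if p.is_file()}
        before = snap()
        command(workspace)
        after = snap()
        return {path: text for path, text in after.items()
                if before.get(path) != text}
    finally:
        shutil.rmtree(workspace, ignore_errors=True)
```

Because only the temp overlay is mutated, the source tree stays untouched and the caller gets an explicit, patchable set of changed files.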
- The UI sends the message plus editor context to the Express proxy.
- The proxy flushes docs to MongoDB, snapshots project files, and forwards the request to `POST /api/chat`.
- The backend loads config, rebuilds memory if needed, and starts `MainAgent.handle(...)`.
- The Main Agent emits typed events while tools run.
- The API converts those internal events to SSE events:
  - `message`
  - `action`
  - `patch`
  - `response`
  - `done`
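The event-to-SSE conversion is mechanical: each internal event becomes one `event:`/`data:` frame. A sketch, assuming internal events are dicts with `type` and `data` keys (the real event objects may be typed classes):

```python
import json

def to_sse(events):
    """Yield one SSE frame per internal agent event.
    Event kinds mirror the list above: message, action, patch, response, done.
    """
    for event in events:
        yield f"event: {event['type']}\ndata: {json.dumps(event['data'])}\n\n"
```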
- The UI or proxy calls `POST /api/project/index`.
- `MemoryManager.build_memory(...)` resolves `\input{}` trees, parses sections and citations, and writes `researchkitMemory`.
- Subsequent chat requests reuse that memory until the project hash changes.
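Resolving the `\input{}` tree amounts to a walk over file references from the root. A minimal sketch against a `{path: content}` snapshot; the regex and extension handling are simplifications of what a real LaTeX parser needs:

```python
import re

INPUT_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")

def resolve_input_tree(files, root="main.tex", seen=None):
    """Follow \\input{}/\\include{} references from the root file and
    return the reachable .tex files in traversal order. Cycles and
    missing files are skipped. Illustrative, not the real resolver.
    """
    seen = seen if seen is not None else []
    if root not in files or root in seen:
        return seen
    seen.append(root)
    for name in INPUT_RE.findall(files[root]):
        path = name if name.endswith(".tex") else name + ".tex"
        resolve_input_tree(files, path, seen)
    return seen
```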
- The UI loads config from `GET /api/config/{project_id}`.
- Updates are saved through `POST /api/config/{project_id}`.
- Provider health checks use `POST /api/config/{project_id}/test`.
- Model discovery uses `POST /api/models/{project_id}`.
- The Main Agent calls `delegate_to_subagent`.
- The selected sub-agent receives `Task`, `PaperMemory`, and `SubAgentContext`.
- If the delegated work changes files, the Main Agent converts the before/after workspace snapshots into structured patches for the frontend.
ResearchKit currently stores data in three MongoDB collections inside the shared Overleaf database:
- `researchkitMemory`
- `researchkitConversations`
- `researchkitConfig`
- `bash` is for execution-oriented commands such as tests or builds, not repo inspection.
- Workspace editing is gated by configured workspace paths and allowed workspace roots.
- Runner execution happens in a temporary overlay workspace rather than mutating the source tree directly.
- Conversation history is scoped by `project_id` and `conversation_id`, with compatibility logic for older default records.
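The compatibility logic for older records can be sketched as a query builder. The field names and the `"default"` sentinel below are assumptions about the stored document shape:

```python
def conversation_filter(project_id, conversation_id=None):
    """Build a MongoDB filter for a scoped conversation. Older records were
    written without a conversation_id, so the default scope also matches
    documents missing the field. Field names are illustrative.
    """
    if conversation_id and conversation_id != "default":
        return {"projectId": project_id, "conversationId": conversation_id}
    return {
        "projectId": project_id,
        "$or": [
            {"conversationId": "default"},
            {"conversationId": {"$exists": False}},
        ],
    }
```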
