The internal SQLite database (cognee_db) grows without bound when Cognee is used as a long-running service with automated search queries. There is no eviction, TTL, or size cap on the results and queries tables, and SQLite does not auto-VACUUM, so the file only grows — never shrinks.
This is a companion to #2538 (unbounded log growth). Same root pattern — write-only accumulation with no rotation — but in the relational store rather than log files.
Impact
On our production system running Cognee as an HTTP daemon with automated monitoring and ingestion pipelines:
| Metric | Value |
| --- | --- |
| DB file size | 28 GB |
| Period | 9 days |
| Growth rate | ~3 GB/day |
| results rows | 42,152 |
| queries rows | 42,174 |
| nodes rows | 23,450 |
| edges rows | 48,029 |
| data rows | 1,409 |
| SQLite freelist pages | 0 (no reclaimable space) |
The 28 GB file filled the disk to 89%, causing memory pressure (the daemon had to swap), degraded search latency, and eventually a cascading service restart.
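A quick sanity check shows these figures are internally consistent (all numbers from the table above; the per-row estimate is an upper bound, since the nodes/edges tables share the same file):

```python
# Sanity-check the reported growth figures (values from the table above).
results_rows = 42_152          # results rows accumulated over the period
days = 9
db_bytes = 28 * 1024**3        # 28 GB database file

queries_per_day = results_rows / days    # ~4,700/day, matching the rate cited below
bytes_per_row = db_bytes / results_rows  # upper bound: nodes/edges share the file

print(f"~{queries_per_day:,.0f} queries/day")
print(f"~{bytes_per_row / 1024:,.0f} KiB per results row (upper bound)")
```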
Root cause
modules/search/operations/log_result.py and log_query.py append a row on every search call with no cleanup:
Result.value is a Text column that stores serialized search results including full node payloads. At ~4,700 queries/day (typical for a daemon with health probes and automated ingestion), the table reaches tens of gigabytes within days.
Additionally, the nodes and edges tables mirror graph data that already lives in the graph DB (FalkorDB/Neo4j), creating a second unbounded copy.
Environment
VECTOR_DB_PROVIDER=falkor (graph + vectors in FalkorDB)

Proposed solution
Short-term (cache eviction):

- Add a configurable row cap and/or TTL for the results and queries tables (e.g., COGNEE_MAX_CACHED_RESULTS=10000, COGNEE_QUERY_HISTORY_TTL_DAYS=30)

Medium-term (VACUUM):

- Enable PRAGMA auto_vacuum = INCREMENTAL to reclaim space from deleted rows
- Add a COGNEE_DB_VACUUM_INTERVAL env var for daemon deployments

Optional:

- COGNEE_LOG_QUERIES=false for daemon deployments that don't need search history
- Reconsider whether the nodes/edges tables need to exist when an external graph DB is configured; they duplicate data and contribute to growth

Workaround
Manual cleanup (requires stopping the daemon):
Or periodic VACUUM via cron (does not require restart, but only reclaims space from deleted rows):
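For example, a crontab entry invoking the sqlite3 CLI (the database path is hypothetical; adjust to your deployment):

```shell
# Nightly at 03:00: rewrite the DB into a compact file.
# Note: VACUUM only reclaims space freed by rows that were already deleted.
0 3 * * * sqlite3 /var/lib/cognee/cognee_db "VACUUM;"
```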