Bug Description
Hi Cognee Team,
I'd like to report a bug encountered during the cognee.search() / graph projection phase.
Summary:
The extract_graph_from_data task generates two nodes with the same ID (b5866225-05ad-5cfc-908e-c22916c6a1c6) but different types (EntityType vs Entity). This causes an EntityAlreadyExistsError (409) during graph retrieval, resulting in empty context being passed to the LLM — making search completely non-functional.
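One possible way such a collision arises, sketched under assumptions: the version digit of the reported ID is 5, which is consistent with a name-based (UUIDv5) identifier. If the ID is derived from the normalized name alone, an Entity and an EntityType that share a name would hash to the same value. The helper `name_based_id`, the namespace, and the normalization below are hypothetical and are not taken from Cognee's source.

```python
import uuid


def name_based_id(name: str) -> uuid.UUID:
    # Hypothetical helper: the ID depends only on the normalized name,
    # so the node type plays no part in it.
    return uuid.uuid5(uuid.NAMESPACE_OID, name.strip().lower())


entity_type_id = name_based_id("institution")  # intended for an EntityType node
entity_id = name_based_id("Institution")       # intended for an Entity node

print(entity_type_id == entity_id)  # True: both node types share one ID
print(entity_id.version)            # 5

# The ID from this report also parses as a version-5 UUID.
reported = uuid.UUID("b5866225-05ad-5cfc-908e-c22916c6a1c6")
print(reported.version)             # 5
```

The sketch does not try to reproduce the exact reported ID, since the real namespace and normalization are unknown here.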
Evidence from PostgreSQL:
Row | ID | Type | Name | Created At
-- | -- | -- | -- | --
1 | b5866225-05ad-5cfc-908e-c22916c6a1c6 | EntityType | institution | 2026-03-28 17:40:44
2 | b5866225-05ad-5cfc-908e-c22916c6a1c6 | Entity | institution | 2026-03-28 18:04:17
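To check how many IDs are affected before deciding whether to rebuild, a small script along these lines could scan exported node rows for IDs stored under more than one type. It is a sketch only: the rows are copied from the table above, and the function name `find_conflicting_ids` and the `(id, type, name)` tuple layout are assumptions, not part of Cognee or of the actual PostgreSQL schema.

```python
from collections import defaultdict

# Rows copied from the evidence table above, as (id, type, name).
rows = [
    ("b5866225-05ad-5cfc-908e-c22916c6a1c6", "EntityType", "institution"),
    ("b5866225-05ad-5cfc-908e-c22916c6a1c6", "Entity", "institution"),
]


def find_conflicting_ids(rows):
    """Return {id: sorted list of types} for IDs stored under several types."""
    types_by_id = defaultdict(set)
    for node_id, node_type, _name in rows:
        types_by_id[node_id].add(node_type)
    return {
        node_id: sorted(types)
        for node_id, types in types_by_id.items()
        if len(types) > 1
    }


conflicts = find_conflicting_ids(rows)
for node_id, types in conflicts.items():
    print(node_id, types)
```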
Error logs when I run cognee.search():
EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409)
Error during graph projection: EntityAlreadyExistsError...
Error during memory fragment creation: EntityAlreadyExistsError...
Empty context was provided to the completion [GraphCompletionRetriever]
Impact:
This is a data corruption issue. Once it occurs, search returns empty/incorrect results.
For production environments with large datasets (costly to rebuild), this is a critical bug — the only current workaround is prune_system() and full re-processing, which is not acceptable at scale.
Expected behavior:
The system should either:
- Use unique IDs for nodes of different types, or
- Handle upsert/deduplication properly when the same name appears as both Entity and EntityType.
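The two options can be illustrated with a minimal sketch. Everything here is hypothetical: `typed_node_id` and `InMemoryGraph` are toy stand-ins written for this issue, not Cognee's ID generator or its CogneeGraph class, and the uuid5 namespace is an assumption.

```python
import uuid


def typed_node_id(node_type: str, name: str) -> uuid.UUID:
    # Option 1 (hypothetical): hash the node type together with the name,
    # so an Entity and an EntityType with the same name get different IDs.
    key = f"{node_type}:{name.strip().lower()}"
    return uuid.uuid5(uuid.NAMESPACE_OID, key)


class InMemoryGraph:
    """Toy stand-in for a graph projection, used only for illustration."""

    def __init__(self):
        self.nodes = {}

    def upsert_node(self, node_id, node_type, name):
        # Option 2: merge into an existing node instead of raising
        # an "already exists" error when the ID is seen again.
        node = self.nodes.setdefault(node_id, {"name": name, "types": set()})
        node["types"].add(node_type)
        return node


# Option 1: the type-qualified IDs no longer collide.
id_a = typed_node_id("EntityType", "institution")
id_b = typed_node_id("Entity", "institution")
print(id_a != id_b)  # True

# Option 2: a repeated ID is merged rather than rejected.
graph = InMemoryGraph()
shared_id = "b5866225-05ad-5cfc-908e-c22916c6a1c6"
graph.upsert_node(shared_id, "EntityType", "institution")
graph.upsert_node(shared_id, "Entity", "institution")
print(len(graph.nodes))  # 1
```

Option 1 prevents new collisions but does not repair rows that already exist; option 2 lets retrieval proceed on already affected data, at the cost of one node carrying two types.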
Environment:
Cognee version: 0.5.5
Graph store: PostgreSQL
LLM: openai/gemini-2.5-flash-lite (via LiteLLM proxy)
Thank you for looking into this.
Steps to Reproduce
- Run Cognee in Docker with PostgreSQL (relational) and FalkorDB (vector and graph)
- Add a large dataset
- Run cognify
- Run a search
Expected Behavior
Search retrieves the relevant context from the graph and returns an answer based on it.
Actual Behavior
Search does not retrieve data from the graph. It returns an empty context, although the information is present in the graph.
Environment
- Windows/macOS
- Python 3.11
- Cognee: 0.5.5
- LLM: vertex_ai/gemini-2.5-flash | vertex_ai/gemini-2.5-flash-lite
- Embedding: vertex_ai/text-multilingual-embedding-00
Logs/Error Messages
2026-03-28T11:31:20.563193 [info ] Vector collection retrieval completed: Retrieved distances from 5 collections in 3.11s [cognee.shared.logging_utils]
2026-03-28T11:31:20.564109 [info ] Retrieving full graph. [CogneeGraph]
2026-03-28T11:31:24.667301 [error ] EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [cognee.shared.logging_utils]
2026-03-28T11:31:24.667930 [error ] Error during graph projection: EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [CogneeGraph]
2026-03-28T11:31:24.668369 [error ] Error during memory fragment creation: EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [cognee.shared.logging_utils]
2026-03-28T11:31:24.710218 [warning ] Empty context was provided to the completion [GraphCompletionRetriever]
2026-03-28T11:31:24.710986 [warning ] Empty context was provided to the completion [GraphCompletionRetriever]
2026-03-28T11:31:24.752756 [debug ] Model not found in LiteLLM's model_cost. [cognee.shared.logging_utils]
Patching `client.chat.completions.create` with mode=<Mode.JSON_SCHEMA: 'json_schema_mode'>
Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>
Instructor Request: mode.value='tool_call', response_model=<class 'instructor.dsl.simple_type.Response'>, new_kwargs={'messages': [{'role': 'user', 'content': 'The question is: `a-b trust?`\nand here is the context provided with a set of relationships from a knowledge graph separated by \\n---\\n each represented as node1 -- relation -- node2 triplet: ``'}, {'role': 'system', 'content': 'Answer the question using the provided context. Be as brief as possible.'}], 'model': 'openai/gemini-2.5-flash-lite', 'api_key': 'sk-VsEX3PTDVM-YbmbW3xO-aw', 'api_base': 'http://host.docker.internal:4000', 'api_version': None, 'tools': [{'type': 'function', 'function': {'name': 'Response', 'description': 'Correctly Formatted and Extracted Response.', 'parameters': {'properties': {'content': {'title': 'Content', 'type': 'string'}}, 'required': ['content'], 'type': 'object'}}}], 'tool_choice': {'type': 'function', 'function': {'name': 'Response'}}}
max_retries: 5, timeout: None
Retrying, attempt: 1
Instructor Raw Response: ModelResponse(id='DbzHaYTHCoiPtPUP06y0mQ4', created=1774697485, model='gemini-2.5-flash-lite', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=0, function=Function(arguments='{"content": "I am sorry, but I cannot answer your question based on the provided context. The knowledge graph does not contain information about \\"a-b trust\\". "}', name='Response'), id='call_204ad318cd334c219503f58c261c', type='function')], function_call=None, images=[], thinking_blocks=[], provider_specific_fields={'refusal': None, 'thinking_blocks': []}), provider_specific_fields={})], usage=CompletionUsage(completion_tokens=33, prompt_tokens=75, total_tokens=108, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)), service_tier=None, vertex_ai_grounding_metadata=[], vertex_ai_url_context_metadata=[], vertex_ai_safety_results=[], vertex_ai_citation_metadata=[])
Returning model from AdapterBase
Additional Context
No response