
[Bug]: EntityAlreadyExistsError due to duplicate node ID with different types #2510

@edmbachbach-bot

Bug Description

Hi Cognee Team,

I'd like to report a bug encountered during the cognee.search() / graph projection phase.

Summary:
The extract_graph_from_data task generates two nodes with the same ID (b5866225-05ad-5cfc-908e-c22916c6a1c6) but different types (EntityType vs Entity). This causes an EntityAlreadyExistsError (409) during graph retrieval, resulting in empty context being passed to the LLM — making search completely non-functional.

Evidence from PostgreSQL:

| Row | ID | Type | Name | Created At |
| -- | -- | -- | -- | -- |
| 1 | b5866225-05ad-5cfc-908e-c22916c6a1c6 | EntityType | institution | 2026-03-28 17:40:44 |
| 2 | b5866225-05ad-5cfc-908e-c22916c6a1c6 | Entity | institution | 2026-03-28 18:04:17 |
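For anyone who wants to check their own store for the same condition, here is a minimal sketch of a duplicate-ID query. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table name `nodes` and its column names are assumptions for illustration, not Cognee's actual schema. The first two rows are the ones shown above; the third is a made-up control row.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the GROUP BY / HAVING query
# below is plain SQL. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id TEXT, type TEXT, name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?)",
    [
        # The two rows reported in this issue.
        ("b5866225-05ad-5cfc-908e-c22916c6a1c6", "EntityType", "institution", "2026-03-28 17:40:44"),
        ("b5866225-05ad-5cfc-908e-c22916c6a1c6", "Entity", "institution", "2026-03-28 18:04:17"),
        # Made-up control row with a unique ID.
        ("00000000-0000-0000-0000-000000000001", "Entity", "example", "2026-03-28 18:05:00"),
    ],
)

# Report every ID that is stored under more than one node type.
DUPLICATE_ID_QUERY = """
SELECT id, COUNT(DISTINCT type) AS type_count
FROM nodes
GROUP BY id
HAVING COUNT(DISTINCT type) > 1
"""

duplicates = conn.execute(DUPLICATE_ID_QUERY).fetchall()
print(duplicates)
```

Against a real deployment the same query would need the actual table and column names substituted in.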

Error logs when I run cognee.search():

EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409)
Error during graph projection: EntityAlreadyExistsError...
Error during memory fragment creation: EntityAlreadyExistsError...
Empty context was provided to the completion [GraphCompletionRetriever]
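To illustrate why one duplicate row is enough to produce empty context, here is a toy model of the failure mode suggested by the logs. Every name in it (`ToyGraph`, `project_strict`, `project_tolerant`) is hypothetical and the error class is a local stand-in; this is not Cognee's implementation, only a sketch that assumes projection adds nodes one by one and raises on a repeated ID.

```python
class EntityAlreadyExistsError(Exception):
    """Local stand-in for the error in the logs; not Cognee's actual class."""


class ToyGraph:
    """Hypothetical in-memory graph that rejects duplicate node IDs."""

    def __init__(self):
        self.nodes = {}

    def add_node(self, node_id, attributes):
        if node_id in self.nodes:
            raise EntityAlreadyExistsError(f"Node with id {node_id} already exists.")
        self.nodes[node_id] = attributes


def project_strict(rows):
    """A single duplicate ID aborts the whole projection."""
    graph = ToyGraph()
    for node_id, attributes in rows:
        graph.add_node(node_id, attributes)
    return graph


def project_tolerant(rows):
    """Keep the first node per ID and collect later duplicates for review."""
    graph = ToyGraph()
    skipped = []
    for node_id, attributes in rows:
        try:
            graph.add_node(node_id, attributes)
        except EntityAlreadyExistsError:
            skipped.append((node_id, attributes))
    return graph, skipped


DUPLICATE_ID = "b5866225-05ad-5cfc-908e-c22916c6a1c6"
rows = [
    (DUPLICATE_ID, {"type": "EntityType", "name": "institution"}),
    (DUPLICATE_ID, {"type": "Entity", "name": "institution"}),
]

# Strict projection fails outright, so nothing is available as context.
try:
    project_strict(rows)
    strict_failed = False
except EntityAlreadyExistsError:
    strict_failed = True

# Tolerant projection still yields a usable graph and reports what it skipped.
graph, skipped = project_tolerant(rows)
print(strict_failed, len(graph.nodes), len(skipped))
```

Keeping the first node and logging the rest is only one possible policy; merging attributes or preferring a specific type would be equally plausible choices.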

Impact:

  • This is a data corruption issue. Once it occurs, search returns empty/incorrect results.
  • For production environments with large datasets (costly to rebuild), this is a critical bug: the only current workaround is prune_system() and full re-processing, which is not acceptable at scale.

Expected behavior:

The system should either:

  • Use unique IDs for nodes of different types, or
  • Handle upsert/deduplication properly when the same name appears as both Entity and EntityType.
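
The first option could look like the sketch below. The reported ID carries version 5 in its UUID version field, which is consistent with a name-based (deterministic) UUID. The sketch assumes, without confirmation from the Cognee source, that node IDs are derived from the normalized name alone; `name_only_id` and `typed_id` are hypothetical helpers, and the namespace and normalization are illustrative choices.

```python
import uuid

REPORTED_ID = uuid.UUID("b5866225-05ad-5cfc-908e-c22916c6a1c6")


def normalize(name: str) -> str:
    # Illustrative normalization; the real rules are an assumption.
    return name.strip().lower().replace(" ", "_")


def name_only_id(name: str) -> uuid.UUID:
    # Assumed current behavior: the ID depends only on the name, so an
    # Entity and an EntityType that share a name get the same ID.
    return uuid.uuid5(uuid.NAMESPACE_OID, normalize(name))


def typed_id(node_type: str, name: str) -> uuid.UUID:
    # Possible fix: fold the node type into the hashed string so the
    # two kinds of node can never share an ID.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{node_type}:{normalize(name)}")


# The reported ID is a version-5 (name-based) UUID.
print(REPORTED_ID.version)

# Name-only IDs collide across types.
print(name_only_id("institution") == name_only_id("Institution"))

# Type-qualified IDs stay distinct and remain deterministic.
entity_type_id = typed_id("EntityType", "institution")
entity_id = typed_id("Entity", "institution")
print(entity_type_id != entity_id)
```

Changing the ID scheme would alter the IDs of existing nodes, so already-ingested data would presumably need a migration or a re-run.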

Environment:

  • Cognee version: 0.5.5
  • Graph store: PostgreSQL
  • LLM: openai/gemini-2.5-flash-lite (via LiteLLM proxy)

Thank you for looking into this.

Steps to Reproduce

  1. Run Cognee in Docker with PostgreSQL (relational) and FalkorDB (vector and graph)
  2. Add a large dataset
  3. Run cognify
  4. Run search

Expected Behavior

Search returns results based on the information stored in the graph.

Actual Behavior

Search does not retrieve data from the graph. It returns an empty result, although the information is in the graph.

Environment

  • Windows/macOS
  • Python 3.11
  • Cognee: 0.5.5
  • LLM: vertex_ai/gemini-2.5-flash | vertex_ai/gemini-2.5-flash-lite
  • Embedding: vertex_ai/text-multilingual-embedding-00

Logs/Error Messages

2026-03-28T11:31:20.563193 [info     ] Vector collection retrieval completed: Retrieved distances from 5 collections in 3.11s [cognee.shared.logging_utils]


2026-03-28T11:31:20.564109 [info     ] Retrieving full graph.         [CogneeGraph]


2026-03-28T11:31:24.667301 [error    ] EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [cognee.shared.logging_utils]


2026-03-28T11:31:24.667930 [error    ] Error during graph projection: EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [CogneeGraph]


2026-03-28T11:31:24.668369 [error    ] Error during memory fragment creation: EntityAlreadyExistsError: Node with id b5866225-05ad-5cfc-908e-c22916c6a1c6 already exists. (Status code: 409) [cognee.shared.logging_utils]


2026-03-28T11:31:24.710218 [warning  ] Empty context was provided to the completion [GraphCompletionRetriever]


2026-03-28T11:31:24.710986 [warning  ] Empty context was provided to the completion [GraphCompletionRetriever]


2026-03-28T11:31:24.752756 [debug    ] Model not found in LiteLLM's model_cost. [cognee.shared.logging_utils]


Patching `client.chat.completions.create` with mode=<Mode.JSON_SCHEMA: 'json_schema_mode'>


Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>


Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>


Instructor Request: mode.value='tool_call', response_model=<class 'instructor.dsl.simple_type.Response'>, new_kwargs={'messages': [{'role': 'user', 'content': 'The question is: `a-b trust?`\nand here is the context provided with a set of relationships from a knowledge graph separated by \\n---\\n each represented as node1 -- relation -- node2 triplet: ``'}, {'role': 'system', 'content': 'Answer the question using the provided context. Be as brief as possible.'}], 'model': 'openai/gemini-2.5-flash-lite', 'api_key': 'sk-VsEX3PTDVM-YbmbW3xO-aw', 'api_base': 'http://host.docker.internal:4000', 'api_version': None, 'tools': [{'type': 'function', 'function': {'name': 'Response', 'description': 'Correctly Formatted and Extracted Response.', 'parameters': {'properties': {'content': {'title': 'Content', 'type': 'string'}}, 'required': ['content'], 'type': 'object'}}}], 'tool_choice': {'type': 'function', 'function': {'name': 'Response'}}}


max_retries: 5, timeout: None 


Retrying, attempt: 1          


Instructor Raw Response: ModelResponse(id='DbzHaYTHCoiPtPUP06y0mQ4', created=1774697485, model='gemini-2.5-flash-lite', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content=None, role='assistant', tool_calls=[ChatCompletionMessageToolCall(index=0, function=Function(arguments='{"content": "I am sorry, but I cannot answer your question based on the provided context. The knowledge graph does not contain information about \\"a-b trust\\". "}', name='Response'), id='call_204ad318cd334c219503f58c261c', type='function')], function_call=None, images=[], thinking_blocks=[], provider_specific_fields={'refusal': None, 'thinking_blocks': []}), provider_specific_fields={})], usage=CompletionUsage(completion_tokens=33, prompt_tokens=75, total_tokens=108, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)), service_tier=None, vertex_ai_grounding_metadata=[], vertex_ai_url_context_metadata=[], vertex_ai_safety_results=[], vertex_ai_citation_metadata=[])


Returning model from AdapterBase

Additional Context

No response

Pre-submission Checklist

  • I have searched existing issues to ensure this bug hasn't been reported already
  • I have provided a clear and detailed description of the bug
  • I have included steps to reproduce the issue
  • I have included my environment details
