Edge deduplication in retrieve_existing_edges is non-functional (overwrite, key format, ID normalization) #2557

## Bug Description

`retrieve_existing_edges()` has three compounding issues that make edge deduplication non-functional during graph expansion:

### 1. Overwrite instead of accumulate

`graph_node_edges` is reassigned inside the chunk loop, so only the last chunk's edges are queried against the graph DB:

```python
# retrieve_existing_edges.py L64-67
for index, data_chunk in enumerate(data_chunks):
    graph = chunk_graphs[index]
    # ...
    graph_node_edges = [  # ← reassigned each iteration, should be .extend()
        (edge.target_node_id, edge.source_node_id, edge.relationship_name)
        for edge in graph.edges
    ]
```
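
One possible shape for the fix, as a minimal sketch: initialise the list once before the loop and extend it per chunk. `Edge`, `ChunkGraph`, and `collect_graph_node_edges` are hypothetical stand-ins written for this example and are not the project's actual models or helpers.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """Hypothetical stand-in for the extracted edge model."""
    source_node_id: str
    target_node_id: str
    relationship_name: str


@dataclass
class ChunkGraph:
    """Hypothetical stand-in for one chunk's extracted graph."""
    edges: list[Edge] = field(default_factory=list)


def collect_graph_node_edges(chunk_graphs):
    """Gather edge tuples from every chunk graph, not just the last one."""
    graph_node_edges = []  # created once, before the loop
    for graph in chunk_graphs:
        # extend() accumulates; plain assignment would discard earlier chunks
        graph_node_edges.extend(
            (edge.target_node_id, edge.source_node_id, edge.relationship_name)
            for edge in graph.edges
        )
    return graph_node_edges


chunk_graphs = [
    ChunkGraph([Edge("a", "b", "mentioned_in")]),
    ChunkGraph([Edge("c", "d", "is_part_of")]),
]
edges = collect_graph_node_edges(chunk_graphs)
```

With two single-edge chunks, `edges` contains both tuples, whereas the reassignment in the current code would leave only the second.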

### 2. Key format mismatch

The producer (`retrieve_existing_edges`) builds keys by plain concatenation:
```python
# retrieve_existing_edges.py L81
existing_edges_map[str(edge[0]) + str(edge[1]) + str(edge[2])] = True
# e.g. "abc123def456mentioned_in"
```

But the consumer (`_create_edge_key`) uses underscore-separated keys:
```python
# expand_with_nodes_and_edges.py L26-28
def _create_edge_key(source_id, target_id, relationship_name):
    return f"{source_id}_{target_id}_{relationship_name}"
# e.g. "abc123_def456_mentioned_in"
```

The two formats never produce the same string, so every lookup misses and dedup is a no-op.
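
The mismatch can be reproduced in isolation. The sketch below copies the two key expressions from the snippets above into standalone functions; the names `producer_key` and `consumer_key` are made up for this example.

```python
def producer_key(edge):
    """Key as built in retrieve_existing_edges (plain concatenation)."""
    return str(edge[0]) + str(edge[1]) + str(edge[2])


def consumer_key(source_id, target_id, relationship_name):
    """Key as built by _create_edge_key (underscore-separated)."""
    return f"{source_id}_{target_id}_{relationship_name}"


edge = ("abc123", "def456", "mentioned_in")

# Map populated the way the producer does it today
existing_edges_map = {producer_key(edge): True}

# Lookup performed the way the consumer does it
lookup_key = consumer_key(*edge)
found = existing_edges_map.get(lookup_key, False)

# If the producer used the consumer's key function, the lookup would hit
fixed_map = {consumer_key(*edge): True}
found_after_fix = fixed_map.get(lookup_key, False)
```

Here `found` is `False` and `found_after_fix` is `True`. Having both sides call one shared key function would rule out this class of drift.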

### 3. Missing ID normalization

The producer uses raw `edge.source_node_id` / `edge.target_node_id`, while the consumer normalizes them through `generate_node_id()` (lowercases, strips spaces, hashes to UUID5) and `generate_edge_name()`. Even with the separator fix, keys still can't match because the ID formats differ.
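
A rough illustration of why raw and normalized IDs cannot line up. `normalize_node_id` and `normalize_edge_name` are hypothetical stand-ins: the first follows the description above (lowercase, strip spaces, UUID5), with the `NAMESPACE_OID` namespace being an assumption; the second assumes lowercasing and replacing spaces with underscores. The real `generate_node_id()` and `generate_edge_name()` may differ in detail.

```python
from uuid import NAMESPACE_OID, uuid5


def normalize_node_id(raw_id: str) -> str:
    """Hypothetical stand-in for generate_node_id(); namespace is assumed."""
    return str(uuid5(NAMESPACE_OID, raw_id.lower().replace(" ", "")))


def normalize_edge_name(raw_name: str) -> str:
    """Hypothetical stand-in for generate_edge_name(); behavior is assumed."""
    return raw_name.lower().replace(" ", "_")


def edge_key(source_id, target_id, relationship_name):
    """Underscore-separated key, same shape as _create_edge_key."""
    return f"{source_id}_{target_id}_{relationship_name}"


raw = ("Alice Smith", "Acme Corp", "Works At")

# Key built from raw IDs, as the producer does
raw_key = edge_key(*raw)

# Key built from normalized IDs, as the consumer does
normalized_key = edge_key(
    normalize_node_id(raw[0]),
    normalize_node_id(raw[1]),
    normalize_edge_name(raw[2]),
)
```

Even with identical separators, `raw_key` and `normalized_key` differ, so the producer would need to apply the same normalization before building its keys.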

## Impact

- Multi-chunk documents lose dedup for all but the last chunk's content edges, because only that chunk's edges are ever queried
- Even for the last chunk, every content-edge dedup check fails, so duplicate edges accumulate in the graph database on every `cognify()` run

## Expected Behavior

All chunks' graph edges should participate in the existence query, and the resulting map keys should use the same format and normalization as the consumer.
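
The expected behavior could be pinned down with a small regression-style check along these lines. Everything here is a sketch under assumptions: the helpers are hypothetical stand-ins for `generate_node_id()`, `generate_edge_name()`, and `_create_edge_key()`; `fake_has_edges` stands in synchronously for the graph DB existence query; and the `(source, target, relationship)` tuple order is chosen for the example only. Whatever order the real code uses, producer and consumer have to agree on it.

```python
from uuid import NAMESPACE_OID, uuid5


def normalize_node_id(raw_id):
    """Hypothetical stand-in for generate_node_id(); namespace is assumed."""
    return str(uuid5(NAMESPACE_OID, raw_id.lower().replace(" ", "")))


def normalize_edge_name(raw_name):
    """Hypothetical stand-in for generate_edge_name(); behavior is assumed."""
    return raw_name.lower().replace(" ", "_")


def create_edge_key(source_id, target_id, relationship_name):
    """Single key function shared by producer and consumer."""
    return f"{source_id}_{target_id}_{relationship_name}"


def build_existing_edges_map(chunk_edge_lists, has_edges):
    """Query all chunks' edges and key the result like the consumer does."""
    candidates = []
    for chunk_edges in chunk_edge_lists:
        # accumulate across chunks, normalizing before the existence query
        candidates.extend(
            (normalize_node_id(s), normalize_node_id(t), normalize_edge_name(r))
            for s, t, r in chunk_edges
        )
    return {create_edge_key(*edge): True for edge in has_edges(candidates)}


# Two chunks; only the first chunk's edge is already stored
raw_chunks = [
    [("Alice Smith", "Acme Corp", "works at")],
    [("Bob Jones", "Acme Corp", "works at")],
]
stored = {
    (
        normalize_node_id("Alice Smith"),
        normalize_node_id("Acme Corp"),
        normalize_edge_name("works at"),
    )
}


def fake_has_edges(edges):
    """Hypothetical stand-in for the graph DB existence query."""
    return [edge for edge in edges if edge in stored]


existing_edges_map = build_existing_edges_map(raw_chunks, fake_has_edges)

# Consumer-side lookups, built with the same normalization and key function
alice_key = create_edge_key(
    normalize_node_id("alice smith"),
    normalize_node_id("acme corp"),
    normalize_edge_name("Works At"),
)
bob_key = create_edge_key(
    normalize_node_id("Bob Jones"),
    normalize_node_id("Acme Corp"),
    normalize_edge_name("works at"),
)
```

The stored edge comes from the first chunk, so `alice_key in existing_edges_map` only holds if edges are accumulated across chunks and keyed consistently; `bob_key` stays absent because that edge was never stored.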

## Environment

- Branch: `dev` (commit `452333a`)
- Python 3.12
