[Bug] index_data_points: shallow copy of metadata dict causes only first index_field to be embedded #2529

## Title

`index_data_points`: shallow copy of `metadata` dict causes only first `index_field` to be embedded

## Description

When a custom `DataPoint` defines multiple `index_fields` (e.g., `["problem", "conclusion", "follow_up"]`), only one field's content is actually embedded. A collection is created for each field, but every collection contains the embeddings and text of that same field.

### Root Cause

In `cognee/tasks/storage/index_data_points.py`, lines 36-48:

```python
for field_name in data_point.metadata["index_fields"]:  # iterates over original list
    # ...
    indexed_data_point = data_point.model_copy()  # shallow copy
    indexed_data_point.metadata["index_fields"] = [field_name]  # mutates the ORIGINAL dict!
```

`model_copy()` (Pydantic v2) performs a **shallow copy** by default: the new model holds references to the same field values, dicts included. Since `indexed_data_point.metadata` and `data_point.metadata` point to the **same dict object**, the assignment `indexed_data_point.metadata["index_fields"] = [field_name]` also changes `data_point.metadata["index_fields"]`, from `["problem", "conclusion", "follow_up"]` to `["problem"]` on the first iteration.

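Note that the assignment rebinds the dict key rather than shortening the list the `for` loop is already iterating over, so the loop still visits every field and creates a collection for each. The damage comes from sharing: every copy made in the loop reads the one `metadata` dict, so once the loop has finished they all report the same single-element `index_fields`. That is consistent with the reproduction below, where both collections end up with identical text.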
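The sharing effect can be checked without cognee or Pydantic. The sketch below is a stand-in, not the cognee code: `FakePoint` and `buggy_split` are hypothetical names, and `copy.copy` plays the role of `model_copy()` (both produce a new object whose `metadata` attribute is the same dict). In this stand-in the loop still visits every field, because assigning to the key rebinds it instead of shrinking the list being iterated, and what goes wrong is that all copies read the same dict afterwards.

```python
import copy


class FakePoint:
    """Hypothetical stand-in for a DataPoint; not part of cognee."""

    def __init__(self, problem, conclusion):
        self.problem = problem
        self.conclusion = conclusion
        self.metadata = {"index_fields": ["problem", "conclusion"]}


def buggy_split(point):
    """Mirror the loop shown above, with copy.copy as the shallow copy."""
    copies = {}
    for field_name in point.metadata["index_fields"]:
        clone = copy.copy(point)  # shallow: clone.metadata is point.metadata
        clone.metadata["index_fields"] = [field_name]  # rebinds the key in the shared dict
        copies[field_name] = clone
    return copies


point = FakePoint("PROBLEM_AAA", "CONCLUSION_BBB")
copies = buggy_split(point)

# Field each copy reports once the loop has finished.
selected = {name: clone.metadata["index_fields"][0] for name, clone in copies.items()}

print(sorted(copies))                  # one copy was made per field
print(selected)                        # every copy reports the same field
print(point.metadata["index_fields"])  # the original was changed as well
```

Whether the real pipeline reads `index_fields` during or after the loop is not shown in the excerpt above, so this only illustrates the sharing, not the exact point at which cognee picks the text to embed.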

### Reproduction

```python
from cognee.infrastructure.engine import DataPoint
from cognee.tasks.storage import add_data_points
import cognee

class MyCase(DataPoint):
    problem: str = ""
    conclusion: str = ""
    metadata: dict = {"index_fields": ["problem", "conclusion"]}

async def main():
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    from cognee.modules.engine.operations.setup import setup
    await setup()

    cases = [
        MyCase(problem="PROBLEM_AAA", conclusion="CONCLUSION_BBB"),
    ]
    await add_data_points(cases)

    # Check: both collections have the same content (should be different)
    from cognee.infrastructure.databases.vector import get_vector_engine
    engine = get_vector_engine()
    conn = await engine.get_connection()
    for coll in ["MyCase_problem", "MyCase_conclusion"]:
        table = await conn.open_table(coll)
        data = await table.to_arrow()
        text = data.column("payload")[0].as_py().get("text", "")
        print(f"{coll}: {text}")
        # Both print "CONCLUSION_BBB" — should be "PROBLEM_AAA" and "CONCLUSION_BBB" respectively

import asyncio
asyncio.run(main())
```

### Minimal verification of the shallow copy issue

```python
from cognee.infrastructure.engine import DataPoint

class TestDP(DataPoint):
    a: str = ""
    b: str = ""
    metadata: dict = {"index_fields": ["a", "b"]}

dp = TestDP(a="AAA", b="BBB")
clone = dp.model_copy()
clone.metadata["index_fields"] = ["b"]

print(dp.metadata["index_fields"])
# Output: ['b']  (the original object was mutated!)
# Expected: ['a', 'b']
```

### Suggested Fix

Two-line change in `cognee/tasks/storage/index_data_points.py`:

```diff
-        for field_name in data_point.metadata["index_fields"]:
+        for field_name in list(data_point.metadata["index_fields"]):
             # ...
             indexed_data_point = data_point.model_copy()
+            indexed_data_point.metadata = dict(data_point.metadata)
             indexed_data_point.metadata["index_fields"] = [field_name]
```

1. `list(...)` iterates over a snapshot of the field names. This is a defensive measure: it keeps the loop independent of any later change to `metadata["index_fields"]`.
2. `dict(...)` gives each copy its own top-level `metadata` dict, so assigning `index_fields` on a copy no longer changes the original `DataPoint` or the other copies.

Using `model_copy(deep=True)` would also work but is unnecessarily expensive since only the metadata dict needs isolation.
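A sketch of the fixed loop on a stand-in object, in plain Python. `FakePoint`, `split_by_field`, and the `extra` metadata key are hypothetical, `copy.copy` stands in for `model_copy()`, and the lookup through `index_fields[0]` is only an assumption about how the text to embed is selected. Each copy gets its own `metadata` dict and the original is left untouched.

```python
import copy


class FakePoint:
    """Hypothetical stand-in for a DataPoint; not part of cognee."""

    def __init__(self, problem, conclusion):
        self.problem = problem
        self.conclusion = conclusion
        # "extra" is an illustrative nested value, not a real cognee key.
        self.metadata = {"index_fields": ["problem", "conclusion"], "extra": {"k": 1}}


def split_by_field(point):
    """Apply the two-line fix: snapshot the names, copy the dict per clone."""
    copies = {}
    for field_name in list(point.metadata["index_fields"]):  # snapshot of the names
        clone = copy.copy(point)
        clone.metadata = dict(point.metadata)  # new top-level dict for this clone
        clone.metadata["index_fields"] = [field_name]
        copies[field_name] = clone
    return copies


point = FakePoint("PROBLEM_AAA", "CONCLUSION_BBB")
copies = split_by_field(point)

# Assumed lookup: the text is read through the clone's single index_fields entry.
texts = {
    name: getattr(clone, clone.metadata["index_fields"][0])
    for name, clone in copies.items()
}

print(texts)
print(point.metadata["index_fields"])
```

`dict(...)` copies only the top level, so nested values (here `extra`) are still shared between the original and the copies. That is enough for this loop because only the `index_fields` key is reassigned; code that mutates nested metadata in place would need a deeper copy.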

### Environment

- cognee version: 0.5.5
- Python: 3.12
- Pydantic: v2
- Vector DB: LanceDB

### Impact

Any user defining a custom `DataPoint` with multiple `index_fields` will silently get incorrect embeddings: all vector collections will contain the same embeddings, taken from a single field. This makes multi-field semantic search ineffective.
