feat: Add vectordb-compare app and fix benchmark measurement discrepancies#260

Closed
luisremis wants to merge 1 commit into main from app/vectordb-compare

@luisremis

Description

This PR ports the vectordb-compare app from the internal workflows repository and fixes critical discrepancies in how the benchmarks measured database latency:

Fixes

  1. KNN Query Generation Out-of-Timer: Previously, the query payload generation (e.g. query_vector = generator[i]) happened inside the measured start_time - end_time block for most databases, and in the case of ApertureDB it was being evaluated twice per iteration. This PR moves the payload extraction outside the time.time() block for all databases (ApertureDB, Pinecone, Qdrant, Weaviate), guaranteeing that the benchmark accurately measures database query latency, unpolluted by Python data generation overhead.
  2. Ingestion Timing Synchronization: Fixed ingestion measurements so that data preparation is included symmetrically across all databases:
    • LanceDB: Moved the records construction inside the time.time() block.
    • Qdrant: Changed wait=False to wait=True on the upsert call to ensure the benchmark measures synchronous persistence, making it fair when compared to Weaviate, Pinecone, and ApertureDB.

These fixes make the performance results directly comparable across all vector databases.
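As a minimal sketch of the two timing rules described above (not the PR's actual code): `time_queries` extracts the query vector before the clock starts, and `time_ingest` builds the records after it starts. The names `time_queries`, `time_ingest`, `search`, `build_records`, and `write` are hypothetical stand-ins for the per-engine calls, and `time.perf_counter()` is used here in place of the `time.time()` mentioned in the description.

```python
import time


def time_queries(search, queries):
    """Return per-query latencies, timing only the call to `search`."""
    latencies = []
    for i in range(len(queries)):
        query_vector = queries[i]  # payload extraction stays outside the timer
        start = time.perf_counter()
        search(query_vector)  # only the database call is measured
        latencies.append(time.perf_counter() - start)
    return latencies


def time_ingest(build_records, write, raw_batch):
    """Time record construction and the write together, for every engine alike."""
    start = time.perf_counter()
    records = build_records(raw_batch)  # data preparation is inside the timer
    write(records)  # assumed to block until the write is acknowledged
    return time.perf_counter() - start
```

In this sketch, `write` is assumed to be synchronous; for Qdrant that would correspond to the `wait=True` upsert described in the list above.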

Copilot AI review requested due to automatic review settings June 16, 2026 04:02
@luisremis luisremis closed this Jun 16, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a new apps/vectordb-compare benchmark workflow that ingests, verifies, and runs KNN latency/throughput comparisons across multiple vector database engines (ApertureDB, Pinecone, Weaviate, Qdrant, LanceDB), with changes intended to make timing measurements more comparable across engines.

Changes:

  • Adds a full benchmark app (Dockerfile/compose, config, ingestion, verification, KNN runners, dataset download tooling).
  • Implements per-engine ingestion + verification modules with dynamic imports to allow partial dependency installation.
  • Updates KNN workers so query payload extraction happens outside the timed block (to avoid measuring Python data generation overhead).
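One way the "dynamic imports to allow partial dependency installation" idea could look is sketched below. This is an illustration under assumptions, not the code in `ingest.py`: the `ENGINE_MODULES` mapping and the `load_engines` helper are hypothetical, although the short engine names and module file names match those listed in this PR.

```python
import importlib

# Hypothetical mapping from engine short names to per-engine module names.
ENGINE_MODULES = {
    "adb": "ingest_aperturedb",
    "pc": "ingest_pinecone",
    "wv": "ingest_weaviate",
    "qd": "ingest_qdrant",
    "ldb": "ingest_lancedb",
}


def load_engines(names, modules=ENGINE_MODULES):
    """Import each requested engine module, skipping those that cannot load."""
    loaded, skipped = {}, {}
    for name in names:
        module_name = modules.get(name)
        if module_name is None:
            skipped[name] = "unknown engine"
            continue
        try:
            # Importing lazily means a missing client library only
            # disables its own engine instead of breaking the whole run.
            loaded[name] = importlib.import_module(module_name)
        except ImportError as exc:
            skipped[name] = str(exc)
    return loaded, skipped
```

A caller could then run only the engines in `loaded` and report the entries in `skipped`, so a machine with just one client library installed can still benchmark that engine.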

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 28 comments.

| File | Description |
|------|-------------|
| `apps/vectordb-compare/test.sh` | CI-style docker test runner for the workflow |
| `apps/vectordb-compare/requirements.txt` | Python dependencies for benchmark/engines |
| `apps/vectordb-compare/README.md` | Usage and methodology documentation for the benchmark app |
| `apps/vectordb-compare/Dockerfile` | Container build for the workflow image |
| `apps/vectordb-compare/compose.yml` | Local docker compose runner |
| `apps/vectordb-compare/.env.sample` | Sample environment configuration |
| `apps/vectordb-compare/.dockerignore` | Docker ignore rules for local artifacts |
| `apps/vectordb-compare/app/app.sh` | Entrypoint script orchestrating download/ingest/verify/knn/plot |
| `apps/vectordb-compare/app/config.py` | Centralized config/env/arg parsing |
| `apps/vectordb-compare/app/utils.py` | Shared dataset loaders, connectors, and result helpers |
| `apps/vectordb-compare/app/download_data.sh` | S3 dataset download helper |
| `apps/vectordb-compare/app/hdf5.py` | HDF5 inspection helper script |
| `apps/vectordb-compare/app/ingest.py` | Main ingestion runner (dynamic engine loading + timing summary) |
| `apps/vectordb-compare/app/ingest_base.py` | Base ingestion engine utilities (sizes/datasets/timing) |
| `apps/vectordb-compare/app/ingest_aperturedb.py` | ApertureDB ingestion implementation |
| `apps/vectordb-compare/app/ingest_pinecone.py` | Pinecone ingestion implementation |
| `apps/vectordb-compare/app/ingest_weaviate.py` | Weaviate ingestion implementation |
| `apps/vectordb-compare/app/ingest_qdrant.py` | Qdrant ingestion implementation |
| `apps/vectordb-compare/app/ingest_lancedb.py` | LanceDB ingestion implementation |
| `apps/vectordb-compare/app/verify.py` | Main verification runner + timing summaries |
| `apps/vectordb-compare/app/verify_base.py` | Base verification engine utilities |
| `apps/vectordb-compare/app/verify_aperturedb.py` | ApertureDB ingestion verification |
| `apps/vectordb-compare/app/verify_pinecone.py` | Pinecone ingestion verification |
| `apps/vectordb-compare/app/verify_weaviate.py` | Weaviate ingestion verification |
| `apps/vectordb-compare/app/verify_qdrant.py` | Qdrant ingestion verification |
| `apps/vectordb-compare/app/verify_lancedb.py` | LanceDB ingestion verification |
| `apps/vectordb-compare/app/knn.py` | Main KNN benchmark runner (dynamic engine loading) |
| `apps/vectordb-compare/app/knn_base.py` | Base KNN engine + query generator logic |
| `apps/vectordb-compare/app/knn_aperturedb.py` | ApertureDB KNN implementation |
| `apps/vectordb-compare/app/knn_pinecone.py` | Pinecone KNN implementation |
| `apps/vectordb-compare/app/knn_weaviate.py` | Weaviate KNN implementation |
| `apps/vectordb-compare/app/knn_qdrant.py` | Qdrant KNN implementation |
| `apps/vectordb-compare/app/knn_lancedb.py` | LanceDB KNN implementation |
| `apps/vectordb-compare/app/plot.py` | Plot generation wrapper using dbeval |
| `apps/vectordb-compare/app/test.sh` | Local test helper script |

Comment on lines +58 to +64
if dataset == "deepimage96":
    dataset_obj = u.DatasetDeepImage96(max=n_queries)
    self.dataset = dataset_obj.test  # Use test vectors for queries
elif dataset == "yfcc100m":
    dataset_obj = u.DatasetYFCC100M(max=n_queries)
    self.dataset = dataset_obj.test  # Use test vectors for queries

Comment on lines +83 to +92
# Load query dataset - will now use TEST vectors via updated base class
if params.dataset == "deepimage96":
    query_data = u.DatasetDeepImage96(
        max=params.total_queries).test
elif params.dataset == "yfcc100m":
    query_data = u.DatasetYFCC100M(max=params.total_queries).test
else:
    raise ValueError(f"Unknown dataset: {params.dataset}")

print(f"Using {len(query_data)} test vectors as queries")
Comment on lines +27 to +34
def worker_pinecone(index, generator, namespace, knn_samples, start_index, end_index, times, results):
    """Worker function for Pinecone threading."""
    for i in range(start_index, end_index + 1):
        if i >= len(generator):
            break

        query_vector = generator[i]

Comment on lines +76 to +85
th_queue_size = len(generator) // c

index = pc.Index(index_name)
for i in range(c):
    start_index = i * len(generator) // c
    end_index = min(
        start_index + th_queue_size, len(generator))

    t = threading.Thread(target=worker_pinecone, args=(
        index, generator, namespace, params.knn_samples, start_index, end_index, times, results))
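
Read together with `worker_pinecone`, which iterates `range(start_index, end_index + 1)`, the bounds computed in this hunk appear to make the last index of one thread's range coincide with the first index of the next thread's range, so boundary queries may be issued twice. A minimal sketch of a non-overlapping alternative follows; the `partition` helper is hypothetical and is not part of the PR. It returns half-open `[start, end)` chunks, so a worker would iterate `range(start, end)` without the `+ 1`.

```python
def partition(total, workers):
    """Split range(total) into `workers` contiguous half-open [start, end) chunks."""
    # Shared boundaries guarantee every index falls in exactly one chunk,
    # including the remainder when total is not divisible by workers.
    bounds = [i * total // workers for i in range(workers + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(workers)]
```

For example, `partition(10, 3)` yields chunks of sizes 3, 3, and 4 that together cover indices 0 through 9 once each.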
Comment on lines +27 to +34
def worker_qdrant(client, generator, collection_name, knn_samples, start_index, end_index, times, results):
    """Worker function for Qdrant threading."""
    for i in range(start_index, end_index + 1):
        if i >= len(generator):
            break

        query_vector = generator[i]

Comment on lines +56 to +59
- **`ingest.py`** - Data ingestion pipeline with engine selection
- **`verify_ingestion.py`** - Verify data was loaded correctly
- **`knn.py`** - KNN performance benchmarking
- **`utils.py`** - Shared utilities with dynamic imports

1. **Data Ingestion**:
```bash
python ingest.py -engines "adb,pc,wv,qd,ldb" -source "deepimage96"
```

2. **Verify Ingestion**:
```bash
python verify_ingestion.py -engines "adb,pc,wv,qd,ldb" -source "deepimage96"
```

| Variable | Description | Default |
|----------|-------------|---------|
| `ENGINES` | Comma-separated engine list | `"adb,pc,wv,qd,ldb"` |
| `SOURCE` | Dataset source | `"deepimage96"` |
#!/bin/bash
set -e

mkdir -p input
@luisremis luisremis deleted the app/vectordb-compare branch June 16, 2026 04:10
2 participants