Fix: Add transformers for videos and bounding boxes#720
Conversation
There was a problem hiding this comment.
Pull request overview
Adds transformer support for non-image ingestion artifacts (videos and shape annotations) so common/static properties and derived metadata can be applied consistently during transformer-based ingestion pipelines (Fixes #335).
Changes:
- Added
VideoPropertiestransformer to compute basic blob-derived video metadata and ensure_Video.adb_data_sourceis indexed. - Added
BoundingBoxPropertiestransformer to attach annotation-related properties toAddBoundingBoxandAddPolygoncommands. - Extended
Transformer/CommonPropertiesto detect and apply properties toAddVideo,AddBoundingBox, andAddPolygon, and fixedImagePropertiesproperty initialization/blob mapping.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| aperturedb/transformers/video_properties.py | New transformer computing and attaching video metadata properties. |
| aperturedb/transformers/bounding_box_properties.py | New transformer attaching annotation properties to bounding boxes and polygons. |
| aperturedb/transformers/transformer.py | Tracks indices for AddVideo, AddBoundingBox, and AddPolygon commands. |
| aperturedb/transformers/image_properties.py | Fixes property dict initialization and corrects blob indexing logic. |
| aperturedb/transformers/common_properties.py | Applies common/static properties to images, videos, bboxes, and polygons. |
| aperturedb/transformers/init.py | Introduces package exports for transformers (but currently has critical import issues). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
aperturedb/transformers/bounding_box_properties.py:35
- This transformer iterates over
_add_bounding_box_index/_add_polygon_index, which are computed from the first transaction only. If later transactions contain a different number/order ofAddBoundingBox/AddPolygoncommands (common for per-image variable annotation counts), some annotations won’t getannotation_source/annotation_modeapplied, and transactions with fewer annotations can raise IndexError/KeyError (then skip the rest of the updates). Prefer iterating over the current transaction’s command list (x[0]) and applying properties to every command whose key isAddBoundingBoxorAddPolygon.
for ic in getattr(self, "_add_bounding_box_index", []):
src_properties = x[0][ic]["AddBoundingBox"].setdefault(
"properties", {})
if self.annotation_source:
src_properties["annotation_source"] = self.annotation_source
if self.annotation_mode:
src_properties["annotation_mode"] = self.annotation_mode
for ic in getattr(self, "_add_polygon_index", []):
src_properties = x[0][ic]["AddPolygon"].setdefault(
"properties", {})
if self.annotation_source:
src_properties["annotation_source"] = self.annotation_source
if self.annotation_mode:
src_properties["annotation_mode"] = self.annotation_mode
…tion Removes cached add_bounding_box_index and add_polygon_index from the first transaction in Transformer, as these counts frequently vary per-item (e.g. COCO bounding boxes). CommonProperties and BoundingBoxProperties now iterate over the current transactions commands directly, avoiding IndexError when subsequent items have fewer annotations, and ensuring proper property application when they have more. Added a test case to explicitly check behavior with variable annotation counts.
- Remove unused pytest import in test_Transformers.py - Remove caching of AddImage and AddVideo indices in Transformer.__init__ to handle variable per-item counts - Update CommonProperties and ImageProperties to scan for AddVideo dynamically instead of relying on cached indices - Remove unused _blob_index_map from ImageProperties and VideoProperties - Add unit test coverage for VideoProperties to ensure dynamic properties apply correctly
…ally in remaining embeddings
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
aperturedb/transformers/clip_pytorch_embeddings.py:36
- The descriptorset is only created if the first transaction contains an
AddImage. Ifself.data[0]has no images (but later items do), this transformer will still appendAddDescriptorcommands ingetitem, but may never create the descriptor set in the DB, causing ingestion failures. Consider scanning the first few items for anAddImageblob (or lazily initializing on firstAddImageingetitem).
# Let's sample some data to figure out the descriptorset we need.
sample_blob = None
for i, c in enumerate(self.data[0][0]):
if list(c.keys())[0] == "AddImage":
blob_idx = self._blob_index.index(i)
sample_blob = self.data[0][1][blob_idx]
break
if sample_blob is not None:
sample = generate_embedding(sample_blob)
utils = self.get_utils()
utils.add_descriptorset(
self.search_set_name, dim=len(sample) // 4, metric=["CS"])
Addresses review feedback from Copilot
|
Added test coverage for |
luisremis
left a comment
There was a problem hiding this comment.
Summary
This PR successfully adds transformer support for AddVideo, AddBoundingBox, and AddPolygon commands. By iterating over the transaction's commands dynamically rather than relying on cached indices, it robustly handles variable per-item annotation counts and cleanly applies both common properties and specialized metadata.
Findings
- The lazy evaluation of descriptor sets in
CLIPPyTorchEmbeddingsandFacenetPyTorchEmbeddingscorrectly addresses initialization failures while retaining backward compatibility. - Replacing
sys.exit(1)withImportErrorinclip.pyandfacenet.pyimproves modularity and testing. - Tracking
blob_indexdynamically alongside the command checks invideo_properties.py,image_properties.py, and the embedding transformers is much safer for transactions with mixed blob counts. - Exception handling around property application properly catches errors (e.g., corrupt images) without killing the entire ingestion pipeline.
- Tests provide extensive coverage for variable annotation lengths, exception scenarios, and lazy initialization retries.
Verdict
The logic is sound, backward compatibility is maintained, and CI checks are passing. The transition to dynamic transaction scanning is a solid improvement. Since I was asked for a general review without an explicit approval request, I am leaving these notes as a comment.
luisremis
left a comment
There was a problem hiding this comment.
add more testing. for instance, test and verify that initialization failure scenarios in clip/facenet embeddings are thoroughly asserted against different backend conditions.
|
Added unit tests to verify that initialization failure scenarios in clip/facenet embeddings are thoroughly asserted against backend conditions (transient exceptions during get_descriptorset_list), as requested. See commit 5f83f8a. |
|
Addressed feedback from @copilot-pull-request-reviewer[bot]:
Updated in commit 8c9555e. |
|
Addressed feedback from @copilot-pull-request-reviewer[bot]: Noted the reviewer request. The issue described was already resolved in a previous commit by iterating directly over the current transaction commands ( Updated in commit 1b4ebeb |
| src_properties["adb_image_id"] = str( | ||
| src_properties["id"] if "id" in src_properties else uuid.uuid4().hex) |
| src_properties["adb_video_id"] = str( | ||
| src_properties["id"] if "id" in src_properties else uuid.uuid4().hex) |
| try: | ||
| if success or self.search_set_name in utils.get_descriptorset_list(): | ||
| self._descriptorset_initialized = True | ||
| except Exception: | ||
| pass |
| try: | ||
| if success or self.search_set_name in utils.get_descriptorset_list(): | ||
| self._descriptorset_initialized = True | ||
| except Exception: | ||
| pass |
|
Addressed feedback in recent commit |
|
Addressed feedback from @copilot-pull-request-reviewer:
Updated in commit 1b4ebeb |
luisremis
left a comment
There was a problem hiding this comment.
Summary
This PR successfully adds transformer support for AddVideo, AddBoundingBox, and AddPolygon objects, extending the current indexing pipeline to non-image artifacts. It also refactors CommonProperties, ImageProperties, and the embedding transformers to iterate dynamically over transaction commands rather than relying on an index map cached from the first item.
Findings
- Correctness:
- Iterating over each transaction's
cmd_dictcorrectly handles heterogeneous batches where the number of annotations (e.g., bounding boxes) differs per item. - The careful tracking of
blob_indexacross blob-consuming commands (AddImage,AddDescriptor,AddVideo,AddBlob) ensures alignment between JSON commands and binary payload chunks even if an exception occurs mid-transaction. - Hardcoding
dim=512for CLIP and Facenet models instead of deriving it from a sample embedding (len(sample) // 4) during initialization is a clean solution to avoid dependency on an initial image blob.
- Iterating over each transaction's
- Maintainability & Performance:
- Adding
__getattr__intotransformers/__init__.pyproperly implements lazy-loading of transformers, avoiding the penalty of importing heavy libraries (liketorchand PyTorch model weights) unless explicitly invoked. - Safely wrapping the dynamic calculations in
try/exceptwithlogger.exceptionprevents ingestion failures when occasional corrupted blobs or malformed transactions occur.
- Adding
Verdict
The code is solid, handles edge cases gracefully, addresses previous reviewer feedback accurately, and includes comprehensive test coverage. Looks good to go!
Summary
Adds
VideoPropertiesandBoundingBoxPropertiestransformers to handleAddVideo,AddBoundingBox, andAddPolygoncommands.Updates
CommonPropertiesto process these new annotation types.Verification
Manually verified that the property setting works properly for bounding boxes and polygons in the transformer workflow.
Fixes #335