Add LW-DETR object detection models#334
Conversation
cregouby
left a comment
There was a problem hiding this comment.
praise again an impressive contribution
todo see inline
| }, | ||
| forward = function(x) { | ||
| patches <- self$embeddings(x) | ||
| B <- patches$size(1L); H <- patches$size(2L); W <- patches$size(3L); C <- patches$size(4L) |
There was a problem hiding this comment.
todo style We prefer not to use multiple instruction in one line. I.e. avoid the ";"
suggestion you can use the zeallot multi affectation %<-% (as soon as patches is always 4-D)
| B <- patches$size(1L); H <- patches$size(2L); W <- patches$size(3L); C <- patches$size(4L) | |
| c(B,H,W,C) %<-% patches$size |
| for (i in seq_along(feats)) { | ||
| f <- feats[[i]] | ||
| h_i <- as.integer(f$size(3L)); w_i <- as.integer(f$size(4L)) | ||
| shapes[[i]] <- c(h_i, w_i) | ||
| lvl_start <- c(lvl_start, cur); cur <- cur + h_i * w_i |
There was a problem hiding this comment.
todo performance lvl_start <- c(lvl_start, cur) reallocates the vector at each iteration which is very costly in performance.
| for (i in seq_along(feats)) { | |
| f <- feats[[i]] | |
| h_i <- as.integer(f$size(3L)); w_i <- as.integer(f$size(4L)) | |
| shapes[[i]] <- c(h_i, w_i) | |
| lvl_start <- c(lvl_start, cur); cur <- cur + h_i * w_i | |
| lvl_start <- integer(length(feats)) | |
| for (i in seq_along(feats)) { | |
| f <- feats[[i]] | |
| h_i <- as.integer(f$size(3L)) | |
| w_i <- as.integer(f$size(4L)) | |
| shapes[[i]] <- c(h_i, w_i) | |
| lvl_start[i] <- cur | |
| cur <- cur + h_i * w_i |
There was a problem hiding this comment.
praise this is a massive addition ( actually the largest in the code base), thanks and congratulation !
todo style multiple affectation can be turned nicer using zeallot %<-% (see exemple in line L202)
todo style we do not use multiple command per line with ";" air formatter will fix it in a single pass via $ air format R/models-lw_detr.R
suggestion performance list() preallocation should be allocated with the right size to prevent reallocation. We can manage this as a specific issue (as it is wider than your contribution).
| shortcut <- x | ||
|
|
||
| if (!self$window) { | ||
| bw <- x$size(1L); N <- x$size(2L); C <- x$size(3L) |
There was a problem hiding this comment.
suggestion You may use %<-% instead
| }, | ||
| forward = function(x, x_value = NULL) { | ||
| if (is.null(x_value)) x_value <- x | ||
| B <- x$size(1L); N <- x$size(2L); C <- x$size(3L) |
There was a problem hiding this comment.
suggestionYou may use %<-%
| self$depth <- depth | ||
| }, | ||
| forward = function(x, out_flags) { | ||
| out <- list() |
There was a problem hiding this comment.
todo perfromance please allocate the out list to its target size (to prevent the most impactfull reallocation time)
| #' norm_std <- c(0.229, 0.224, 0.225) | ||
| #' | ||
| #' # Letterbox a non-square image to 640x640 and build the matching pixel mask | ||
| #' img <- magick_loader("path/to/image.jpg") |> transform_to_tensor() |
There was a problem hiding this comment.
todo Exemple must be executable without thinking by the end user. So please choose a working image to load
suggestion you may use an image by url of your choice (or the demo one in the lw-detr repo : "https://github.com/Atten4Vis/LW-DETR/blob/main/demo/000000496954.jpg?raw=true")
| #' pred <- torch::with_no_grad( | ||
| #' model(canvas$unsqueeze(1), pixel_mask = mask$unsqueeze(1)) | ||
| #' )$detections[[1]] | ||
| #' labels <- coco_classes(as.integer(pred$labels)) |
There was a problem hiding this comment.
improvement missing Could we end the exemple with the code for a visual result through `draw_bounding_box(..) |> .. |> tensor_image_browse()
| skip_if(Sys.getenv("TEST_LARGE_MODELS", unset = 0) != 1, | ||
| "Skipping test: set TEST_LARGE_MODELS=1 to enable tests requiring large downloads.") |
There was a problem hiding this comment.
improvement We usually apply skip_if(TEST_LARGE_MODEL) to model larger than 100MB. So I would remove for the tiny model
There was a problem hiding this comment.
praise I like a lot the test for correctness at the end of the tiny pretrained model. A good practice to generalize !
todo missing Can we have the "tests for pretrained model_lw_detr_tiny" duplicated to cover "tests for pretrained model_lw_detr_small" and "tests for pretrained model_lw_detr_medium" and "tests for pretrained model_lw_detr_large" ? (all with a skip_if TEST_LARGE_MODEL != 1)
Closes #328
Implements LW-DETR (https://arxiv.org/abs/2406.03459), a real-time detection transformer
Adds model_lw_detr_tiny(), model_lw_detr_small(), model_lw_detr_medium(), and model_lw_detr_large() with COCO-pretrained weights. The architecture comprises a ViT backbone, a C2f multi-scale projector, and a two-stage deformable-attention decoder. All four variants load the official weights with exact key/shape match and reproduce reference detections; pixel_mask is supported for letterboxed (non-square) inputs.