Feat: Add SAHI (Slicing Aided Hyper Inference) Transform #329

Open
DerrickUnleashed wants to merge 61 commits into mlverse:main from DerrickUnleashed:feat/sahiTransform

DerrickUnleashed commented Jun 10, 2026

Contributor

This PR adds:

  • The implementation of transform_sahi_crop()
  • The implementation of target_transform_sahi_crop()
  • Test suites for both functions
  • An export tag to fix this error:
    transforms-tensor.R:480: S3 method `get_image_size.torch_tensor` needs @export or @exportS3Method tag.

DerrickUnleashed marked this pull request as ready for review June 11, 2026 20:02
@DerrickUnleashed

Contributor Author

@cregouby, I'm still a bit unclear about the usage of target_transform_sahi_crop(). However, transform_sahi_crop() seems to be working as expected.

cregouby left a comment

Collaborator

praise: I think that the cropping values logic is there.
todo (design): all transform_ functions shall return an object of the same type as the input (except transform to tensor), as they will be piped. Your only freedom is to change the shape of the tensor and move the crops into a sequential batch, as transform_five_crop() does.
Example: for x$shape [1, 3, 10, 10], transform_sahi_crop(x, c(4, 4), c(.2, .2)) shall output a tensor of shape [25, 3, 3, 3] (see

test_that("five_crop", {
  x <- torch_randn(3, 10, 12)
  o <- transform_five_crop(x, c(3, 3))
  expect_length(o, 5)
  expect_tensor_shape(o[[1]], c(3, 3, 3))
  ob <- transform_five_crop(x$unsqueeze(1), c(3, 3))
  expect_length(ob, 5)
  expect_tensor_shape(ob[[1]], c(1, 3, 3, 3))
})
)
suggestion: you should rely on the existing crop function to crop the tensor, which saves you a lot of unit tests.
todo (software design): this transform shall be an S3 transform, so it needs an entry in transform-generics.R for the dispatch, plus one each in transforms-defaults.R, transforms-magick.R, and transforms-tensor.R, as you never know which transform comes before it or which one is piped after it.
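
To illustrate the dispatch layout requested above, here is a minimal base-R sketch. The generic name sahi_crop_demo and the classes demo_tensor and demo_magick are hypothetical stand-ins, not the package's real generics or image classes; in the package, each piece would presumably sit in the file named in the todo.

```r
# Generic (would sit with the other transform generics): dispatch only.
sahi_crop_demo <- function(img, ...) UseMethod("sahi_crop_demo")

# Default method (would sit with the defaults): fail loudly on unsupported input.
sahi_crop_demo.default <- function(img, ...) {
  stop("sahi_crop_demo() is not implemented for class '", class(img)[1], "'",
       call. = FALSE)
}

# One method per supported input type; both classes here are hypothetical.
sahi_crop_demo.demo_tensor <- function(img, ...) "tensor method"
sahi_crop_demo.demo_magick <- function(img, ...) "magick method"

# The same call works whatever the previous step in the pipe returned.
tensor_like <- structure(list(), class = "demo_tensor")
magick_like <- structure(list(), class = "demo_magick")
sahi_crop_demo(tensor_like)
sahi_crop_demo(magick_like)
```

With this layout the caller never has to know which image type arrives from the previous transform; an unsupported type hits the default method and errors early.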

todo (code relocation): please keep the (badly named, my fault) transforms-segmentation.R file for target_transforms.
suggestion: for clarity, you may rename transforms-segmentation.R to target-transforms-segmentation.R (and the test file accordingly).

thought: as SAHI intimately relies on both the input image size and the COCO bbox (so on x and y at the same time), and as there is no way to pass information from transform_sahi_crop to target_transform_sahi_crop, maybe a prepare_sahi_crop(dataset, size, overlap_size_ratio, ...) function could gather all the needed context into an object of a specific class, sahi_preparation. Both functions would then take it as their only other parameter, transform_sahi_crop(x, sahi_preparation) and target_transform_sahi_crop(x, sahi_preparation), which could be a nice API for SAHI.
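
A rough base-R sketch of what such a preparation object might hold. Everything here is an assumption for illustration: the names slice_starts, prepare_sahi_demo and sahi_preparation_demo are hypothetical, size is taken as c(height, width), offsets are 0-based pixels, and the last window on each axis is shifted back so it stays inside the image. The PR's actual windowing rule may differ.

```r
# Hypothetical helper: 0-based start offsets of the windows along one axis.
slice_starts <- function(full, size, overlap_ratio) {
  full <- as.integer(full)
  size <- as.integer(size)
  if (size >= full) return(0L)
  # Step between consecutive windows once the overlap is removed.
  step <- max(1L, size - as.integer(floor(size * overlap_ratio)))
  starts <- seq.int(0L, full - size, by = step)
  # Add a final window flush with the far edge if the grid stops short.
  last <- full - size
  if (starts[length(starts)] != last) starts <- c(starts, last)
  starts
}

# Hypothetical constructor gathering the context both transforms need.
prepare_sahi_demo <- function(height, width, size,
                              overlap_size_ratio = c(0.2, 0.2)) {
  ys <- slice_starts(height, size[1], overlap_size_ratio[1])
  xs <- slice_starts(width, size[2], overlap_size_ratio[2])
  windows <- expand.grid(x1 = xs, y1 = ys)
  windows$x2 <- pmin(windows$x1 + as.integer(size[2]), as.integer(width))
  windows$y2 <- pmin(windows$y1 + as.integer(size[1]), as.integer(height))
  structure(
    list(image_size = c(height, width), size = size,
         overlap_size_ratio = overlap_size_ratio, windows = windows),
    class = "sahi_preparation_demo"
  )
}

sp <- prepare_sahi_demo(height = 800, width = 1000, size = c(512, 512))
sp$windows
```

Because the windows are computed once and stored, the image transform and the target transform can read the same grid instead of each re-deriving it.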

Comment thread .vscode/settings.json Outdated
Comment thread R/transforms-tensor.R
Comment thread R/transforms-segmentation.R Outdated
Comment thread R/transforms-segmentation.R Outdated
@DerrickUnleashed

Contributor Author

> todo software design this transform shall be an S3 transform so shall have an entry in transform-generics.R for the dispatch, one in transforms-defaults.R, one in transforms-magick.R, one in transforms-tensor.R, as you never know what transform goes before, what transform is piped after.

I didn't fully understand this. Do I need to move the code of transform_sahi and target_transform_sahi to transform-generics?

cregouby commented Jun 15, 2026 via email

Collaborator

@DerrickUnleashed

Contributor Author
# From a dataset (SAHI as part of the transform pipeline)
ds <- coco_detection_dataset(train = FALSE, year = "2017", download = TRUE)
sp_ds <- prepare_sahi_split(ds, size = c(200, 200), overlap_size_ratio = c(0.2, 0.2))

ds <- coco_detection_dataset(train = FALSE, year = "2017",
  transform = . %>% transform_to_tensor() %>%
    transform_sahi_crop(sp_ds))

item <- ds[1]
grid <- vision_make_grid(item$x, scale = TRUE, num_rows = 3)
tensor_image_browse(grid)
[Image attachment: grid of the SAHI crops]

img_url <- "https://raw.githubusercontent.com/obss/sahi/main/demo/demo_data/small-vehicles1.jpeg"
img <- base_loader(img_url) %>% transform_to_tensor()

sp <- prepare_sahi_split(img, size = c(512, 512))

crops <- transform_sahi_crop(img, sp)

# Synthetic target with a box straddling the first two crops
y <- list(
  boxes = torch_tensor(matrix(c(400, 100, 600, 300), nrow = 1, byrow = TRUE),
                       dtype = torch_float()),
  labels = "car"
)

targets <- target_transform_sahi_crop(y, sp, min_area_ratio = 0.1)

targets[[1]]$boxes  # Box clipped and translated to first crop coordinates

targets[[2]]$boxes  # Box in second crop (the portion that spilled over)

# Visualize the crops with bounding boxes overlaid
preview <- lapply(seq_len(dim(crops)[1]), function(i) {
  item <- list(x = crops[i, ..], y = targets[[i]])
  class(item) <- "image_with_bounding_box"
  draw_bounding_boxes(item, colors = "red")
})
grid <- vision_make_grid(torch_stack(preview), scale = FALSE, num_rows = 3)
tensor_image_browse(grid)
[Image attachment: crops with bounding boxes overlaid]
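
For readers following the example above, the clip-and-translate step can be sketched in base R without torch. This is an illustration under assumptions, not the PR's implementation: clip_boxes_to_window is a hypothetical name, boxes are plain matrices in (x1, y1, x2, y2) order, windows use the same order, and min_area_ratio is assumed to compare the clipped area with the original box area.

```r
# Hypothetical helper: clip boxes to one crop window, drop boxes whose
# remaining area is too small, and shift the rest into crop coordinates.
clip_boxes_to_window <- function(boxes, window, min_area_ratio = 0.1) {
  # Intersection of each box with the window.
  x1 <- pmax(boxes[, 1], window[1])
  y1 <- pmax(boxes[, 2], window[2])
  x2 <- pmin(boxes[, 3], window[3])
  y2 <- pmin(boxes[, 4], window[4])
  inter_area <- pmax(x2 - x1, 0) * pmax(y2 - y1, 0)
  orig_area <- (boxes[, 3] - boxes[, 1]) * (boxes[, 4] - boxes[, 2])
  # Keep boxes that retain enough of their original area inside the window.
  keep <- orig_area > 0 & (inter_area / orig_area) >= min_area_ratio
  # Translate so the window's top-left corner becomes the origin.
  out <- cbind(x1 - window[1], y1 - window[2], x2 - window[1], y2 - window[2])
  out[keep, , drop = FALSE]
}

boxes <- matrix(c(400, 100, 600, 300), nrow = 1, byrow = TRUE)
first_crop <- c(0, 0, 512, 512)     # assumed window of the first crop
second_crop <- c(410, 0, 922, 512)  # assumed window of the second crop

in_first <- clip_boxes_to_window(boxes, first_crop)
in_second <- clip_boxes_to_window(boxes, second_crop)
far_away <- clip_boxes_to_window(boxes, c(700, 0, 1212, 512))
```

Here the box survives in both assumed crops, clipped at the first crop's right edge and shifted left by 410 pixels in the second, while a window that does not touch the box returns no rows.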

@DerrickUnleashed

Contributor Author

> thought the question now is how many input S3 formats shall prepare_sahi_split() cover? I'd start with the usual magick image, then tensor image (single and batch of), and also cover the dataset.

It supports magick and tensor images as input now.

cregouby left a comment

Collaborator

praise: very significant improvement toward the merge.
todo: see inline.

Comment thread R/transforms-tensor.R Outdated
Comment thread tests/testthat/test-transforms-array.R Outdated
Comment thread R/sahi-split.R
Comment thread R/sahi-split.R Outdated
Comment thread R/sahi-split.R Outdated
Comment thread R/transforms-tensor.R
Comment thread R/sahi-split.R
Comment thread R/sahi-split.R
Comment thread R/target-transforms-segmentation.R Outdated
Comment thread R/target-transforms-segmentation.R Outdated

cregouby left a comment

Collaborator

.

@DerrickUnleashed

Contributor Author
## Not run: 
# Full SAHI pipeline: prepare split, crop image, adjust targets
img_url <- "https://raw.githubusercontent.com/obss/sahi/main/demo/demo_data/small-vehicles1.jpeg"
img <- base_loader(img_url) %>% transform_to_tensor()

sp <- prepare_sahi_split(img, size = c(512, 512), overlap_size_ratio = c(0.2, 0.2))

crops <- transform_sahi_crop(img, sp)
crops$shape

# Synthetic target with a box straddling the first two crops
y <- list(
  boxes = torch_tensor(matrix(c(400, 100, 600, 300), nrow = 1, byrow = TRUE),
                       dtype = torch_float()),
  labels = "car"
)
targets <- target_transform_sahi_crop(y, sp, min_area_ratio = 0.1)

targets[[1]]$boxes  # Box clipped and translated to first crop
targets[[2]]$boxes  # Box in second crop (the portion that spilled over)

# Visualize crops with bounding boxes overlaid
preview <- lapply(seq_len(dim(crops)[1]), function(i) {
  item <- list(x = crops[i, ..], y = targets[[i]])
  class(item) <- "image_with_bounding_box"
  draw_bounding_boxes(item, colors = "red")
})
grid <- vision_make_grid(torch_stack(preview), scale = FALSE, num_rows = 3)
tensor_image_browse(grid)

## End(Not run)
[Image attachment: crops with bounding boxes overlaid]

Comment thread tests/testthat/test-transforms-array.R Outdated
Comment thread tests/testthat/test-transforms-array.R Outdated
Comment thread tests/testthat/test-transforms-tensor.R
Comment thread tests/testthat/test-transforms-tensor.R Outdated

cregouby left a comment

Collaborator

praise: thanks for your modifications.
todo: one more small effort for completeness.

Comment thread tests/testthat/test-target-transforms-segmentation.R

Development

Successfully merging this pull request may close these issues.

Implement SAHI: (Slicing Aided Hyper Inference) as both transform_ and target_transform_

2 participants