Conversation
…DistStartEnd1/2 from opt.discovery into a new opt.rcAssignment parameter with its own setRcAssignmentParameters().
There was a problem hiding this comment.
Pull request overview
This PR introduces a new opt.rcAssignment option group to centralize parameters used in read-class ↔ transcript assignment, separating them from transcript discovery configuration.
Changes:
- Added
setRcAssignmentParameters()with defaults for exon-distance / primary-secondary distance thresholds. - Added
opt.rcAssignmenttobambu()and threadedrcAssignmentParametersinto annotation extension and distance/assignment steps. - Updated distance table computation and annotation extension to read thresholds from
rcAssignmentParametersinstead ofopt.discovery.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| R/bambu_utilityFunctions.R | Adds setRcAssignmentParameters() and updates calculateDistTable() to use rcAssignmentParameters. |
| R/bambu.R | Adds opt.rcAssignment, initializes rcAssignmentParameters, and passes it into downstream steps. |
| R/bambu-extendAnnotations.R | Updates bambu.extendAnnotations() signature and uses rcAssignmentParameters for relevant thresholds. |
| R/bambu-assignDist.R | Updates assignment entrypoint signature and forwards rcAssignmentParameters into distance table calculation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| calculateDistTable <- function(readClassList, annotations, isoreParameters, rcAssignmentParameters, verbose, returnDistTable){ | ||
| readClassDist <- isore.estimateDistanceToAnnotations(readClassList, annotations, |
There was a problem hiding this comment.
calculateDistTable() still accepts isoreParameters but it is no longer used in the function body after switching to rcAssignmentParameters. Consider removing the unused parameter (and updating the one call site), or using it only as a backward-compat fallback to populate rcAssignmentParameters when needed.
| rcAssignmentParameters <- setRcAssignmentParameters(rcAssignmentParameters = opt.rcAssignment) | ||
| emParameters <- setEmParameters(emParameters = opt.em) |
There was a problem hiding this comment.
There are existing testthat snapshots/expected objects for bambu() behavior (e.g. distTable + assignment). Since opt.rcAssignment is a new user-facing configuration point and legacy opt.discovery keys are being moved, adding a small regression test covering (1) opt.rcAssignment overrides and (2) legacy opt.discovery keys being honored/migrated would help prevent silent behavior changes.
| assignReadClasstoTranscripts <- function(readClassList, annotations, isoreParameters, rcAssignmentParameters, | ||
| verbose, sampleMetadata, demultiplexed, | ||
| returnDistTable = FALSE, trackReads = TRUE) { | ||
| if (is.character(readClassList)) readClassList <- readRDS(file = readClassList) | ||
| metadata(readClassList)$readClassDist <- calculateDistTable(readClassList, annotations, isoreParameters, verbose, returnDistTable) | ||
| metadata(readClassList)$readClassDist <- calculateDistTable(readClassList, annotations, isoreParameters, rcAssignmentParameters, verbose, returnDistTable) |
There was a problem hiding this comment.
assignReadClasstoTranscripts() no longer uses isoreParameters except to forward it into calculateDistTable(), which also doesn’t use it. After cleaning up calculateDistTable(), consider dropping isoreParameters from this function’s signature/call sites as well to avoid confusion about which option list controls read-class assignment.
| isoreParameters, rcAssignmentParameters, stranded, bpParameters, fusionMode = FALSE, verbose = FALSE) { | ||
| start.ptm_all <- proc.time() |
There was a problem hiding this comment.
bambu.extendAnnotations() now requires rcAssignmentParameters, but there are existing internal call sites still using the old positional argument order (e.g. isore.extendAnnotations.clusters() in R/bambu-extendAnnotations-utilityExtend.R calls bambu.extendAnnotations(..., isoreParameters, stranded, bpParameters, ...)). With the new signature, stranded will be passed as rcAssignmentParameters, causing runtime failures when indexing rcAssignmentParameters[["min.exonDistance"]]. Update those call sites (or provide a default/migration inside bambu.extendAnnotations) to pass a real rcAssignmentParameters list.
| isoreParameters, rcAssignmentParameters, stranded, bpParameters, fusionMode = FALSE, verbose = FALSE) { | |
| start.ptm_all <- proc.time() | |
| isoreParameters, rcAssignmentParameters = NULL, stranded, bpParameters, | |
| fusionMode = FALSE, verbose = FALSE) { | |
| start.ptm_all <- proc.time() | |
| ## Backward-compatibility for older positional call sites that used: | |
| ## bambu.extendAnnotations(..., isoreParameters, stranded, bpParameters, | |
| ## fusionMode = FALSE, verbose = FALSE) | |
| if (is.logical(rcAssignmentParameters) && is.list(stranded)) { | |
| oldStranded <- rcAssignmentParameters | |
| oldBpParameters <- stranded | |
| oldFusionMode <- bpParameters | |
| oldVerbose <- fusionMode | |
| stranded <- oldStranded | |
| bpParameters <- oldBpParameters | |
| if (is.logical(oldFusionMode)) { | |
| fusionMode <- oldFusionMode | |
| verbose <- oldVerbose | |
| } | |
| rcAssignmentParameters <- NULL | |
| } | |
| if (is.null(rcAssignmentParameters) || !is.list(rcAssignmentParameters)) { | |
| rcAssignmentParameters <- list() | |
| } | |
| if (is.null(rcAssignmentParameters[["min.exonDistance"]])) { | |
| rcAssignmentParameters[["min.exonDistance"]] <- NULL | |
| } | |
| if (is.null(rcAssignmentParameters[["min.primarySecondaryDist"]])) { | |
| rcAssignmentParameters[["min.primarySecondaryDist"]] <- NULL | |
| } | |
| if (is.null(rcAssignmentParameters[["min.primarySecondaryDistStartEnd1"]])) { | |
| rcAssignmentParameters[["min.primarySecondaryDistStartEnd1"]] <- NULL | |
| } |
| @@ -183,7 +183,7 @@ | |||
| isoreParameters <- setIsoreParameters(isoreParameters = opt.discovery) | |||
| #below line is to be compatible with earlier version of running bambu | |||
| if(!is.null(isoreParameters$max.txNDR)) NDR = isoreParameters$max.txNDR | |||
There was a problem hiding this comment.
The new opt.rcAssignment split will currently treat legacy opt.discovery values for min.exonDistance, min.primarySecondaryDist, and min.primarySecondaryDistStartEnd1/2 as “parameters that do not exist” (see updateParameters()), and those values are also ignored by downstream code now that these thresholds come from rcAssignmentParameters. To keep this change truly non-breaking, consider migrating these keys from opt.discovery into rcAssignmentParameters when opt.rcAssignment is NULL/doesn’t specify them (and optionally emit a deprecation warning).
| if(!is.null(isoreParameters$max.txNDR)) NDR = isoreParameters$max.txNDR | |
| if(!is.null(isoreParameters$max.txNDR)) NDR = isoreParameters$max.txNDR | |
| legacyRcAssignmentKeys <- c("min.exonDistance", | |
| "min.primarySecondaryDist", | |
| "min.primarySecondaryDistStartEnd1", | |
| "min.primarySecondaryDistStartEnd2") | |
| migratedLegacyRcAssignmentKeys <- character(0) | |
| if(!is.null(opt.discovery)){ | |
| if(is.null(opt.rcAssignment)) opt.rcAssignment <- list() | |
| for(paramName in legacyRcAssignmentKeys){ | |
| legacyValue <- opt.discovery[[paramName]] | |
| rcAssignmentValue <- opt.rcAssignment[[paramName]] | |
| if(!is.null(legacyValue) && is.null(rcAssignmentValue)){ | |
| opt.rcAssignment[[paramName]] <- legacyValue | |
| migratedLegacyRcAssignmentKeys <- c(migratedLegacyRcAssignmentKeys, paramName) | |
| } | |
| } | |
| } | |
| if(length(migratedLegacyRcAssignmentKeys) > 0){ | |
| warning(paste0( | |
| "The following legacy opt.discovery parameter(s) are deprecated and have been migrated to opt.rcAssignment: ", | |
| paste(migratedLegacyRcAssignmentKeys, collapse = ", "), | |
| ". Please pass them via opt.rcAssignment instead." | |
| )) | |
| } |
| bambu <- function(reads, annotations = NULL, genome = NULL, NDR = NULL, | ||
| mode = NULL, opt.discovery = NULL, opt.em = NULL, rcOutDir = NULL, discovery = TRUE, | ||
| mode = NULL, opt.discovery = NULL, opt.rcAssignment = NULL, opt.em = NULL, rcOutDir = NULL, discovery = TRUE, | ||
| assignDist = TRUE, quant = TRUE, stranded = FALSE, ncore = 1, yieldSize = NULL, |
There was a problem hiding this comment.
bambu() gained a new public argument opt.rcAssignment, but the roxygen docs for opt.discovery still document the moved parameters (min.exonDistance, min.primarySecondaryDist, min.primarySecondaryDistStartEnd1/2) and there is no @param opt.rcAssignment section. This will confuse users and typically triggers R CMD check warnings about undocumented arguments. Please update the roxygen parameter docs to reflect the new option structure.
Code reviewFound 4 issues:
Updated signature: bambu/R/bambu-extendAnnotations.R Lines 5 to 7 in dac871e Stale call sites: bambu/R/bambu-extendAnnotations-utilityExtend.R Lines 922 to 930 in dac871e Caller missing param: Lines 230 to 234 in dac871e
Users who currently set Lines 185 to 187 in dac871e
The new Lines 19 to 64 in dac871e
After the refactoring, bambu/R/bambu_utilityFunctions.R Lines 254 to 261 in dac871e 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
There was a problem hiding this comment.
- I suggest to change this to define parameters for discovery step and read class assignment step separately, even though that means the same function can be called in different modules with different arguments
- add documentation for bambu in code, add readme updates later (document required changes)
- add depcrecation warning in bambu()
| #' @import data.table | ||
| #' @noRd | ||
| assignReadClasstoTranscripts <- function(readClassList, annotations, isoreParameters, | ||
| assignReadClasstoTranscripts <- function(readClassList, annotations, isoreParameters, rcAssignmentParameters, |
| @@ -141,7 +141,7 @@ | |||
| #' genome = fa.file, discovery = TRUE, quant = TRUE) | |||
| #' @export | |||
| bambu <- function(reads, annotations = NULL, genome = NULL, NDR = NULL, | |||
There was a problem hiding this comment.
update in-code documentation, general documation can be updated later, but create issue/draft issue to list all docuemntation changes needed to be updated as a consequence of this PR, link PR to draft isssue
| message("--- Start extending annotations ---") | ||
| extendedAnnotations <- bambu.extendAnnotations(readClassList, annotations, NDR, | ||
| isoreParameters, stranded, bpParameters, fusionMode, verbose) | ||
| isoreParameters, rcAssignmentParameters, stranded, bpParameters, fusionMode, verbose) |
There was a problem hiding this comment.
can we split arguments to have only one for transcript discovery (isoreParameter) and one for reacClass assignment (rcAssignmentParameters)? They might call the same function, but can be different values in the different parts then? just default is the same is ok?
| isoreParameters = isoreParameters, | ||
| verbose = verbose, | ||
| annotations = annotations, | ||
| isoreParameters = isoreParameters, |
There was a problem hiding this comment.
can isoreparameters be removed here? so we match parameters to distinct bambu modules?
| rcAssignmentParameters.default <- list( | ||
| min.exonDistance = 35, | ||
| min.primarySecondaryDist = 5, | ||
| min.primarySecondaryDistStartEnd1 = 5, # for extending annotations | ||
| min.primarySecondaryDistStartEnd2 = 5) # for read class assignment |
There was a problem hiding this comment.
maybe OK to have parameters twice if they call the same function, one for discovery one for assigning reads?
|
|
||
| #' Calculate the dist table used for Bambu Quantification | ||
| calculateDistTable <- function(readClassList, annotations, isoreParameters, verbose, returnDistTable){ | ||
| calculateDistTable <- function(readClassList, annotations, isoreParameters, rcAssignmentParameters, verbose, returnDistTable){ |
There was a problem hiding this comment.
only call rcAssignmentParameters. this funciton is only called in assignDist
| returnDistTable <- TRUE | ||
| } | ||
| } | ||
| if(lowMemory) |
There was a problem hiding this comment.
we can add a code part here for deprecation warnings. include a warning for the change in arguments here, and by when this will not be supported anymore
| min.exonDistance = 35, | ||
| min.primarySecondaryDist = 5, | ||
| min.primarySecondaryDistStartEnd1 = 5, # for extending annotations | ||
| min.primarySecondaryDistStartEnd2 = 5) # for read class assignment |
There was a problem hiding this comment.
min.primarySecondaryDistStartEnd1 & min.primarySecondaryDistStartEnd2 always very confusing to me. What does 1 and 2 mean, there's no context here
| #' @import data.table | ||
| #' @noRd | ||
| assignReadClasstoTranscripts <- function(readClassList, annotations, isoreParameters, | ||
| assignReadClasstoTranscripts <- function(readClassList, annotations, isoreParameters, rcAssignmentParameters, |
There was a problem hiding this comment.
If I am correct, isoreParameters refers to the parameters related to opt.discovery. But the name is also not very clear (something like rcAssignmentParameters is much clearer)
Reorganize the parameters that related to the read class - transcript assignment into opt.rcAssignment
Type of Change
Description
opt.rcAssignment is for summarizing the parameters that related to the read class - transcript assignment together. This can be used later to assign more read class to transcript, which is a relatively dependent part from transcript discovery. The default value of the parameters inside still need to be further evaluated.
Implementation Details
Extract 4 parameters (min.exonDistance, min.primarySecondaryDist, min.primarySecondaryDistStartEnd1/2) from opt.discovery into a new opt.rcAssignment parameter with its own setRcAssignmentParameters(). The default value of these parameters are still the same with previous.
I have tested this branch in 2 ways:
Impact of Changes
Checklist