Commit e7efa4c

Merge pull request #136 from kgoldfeld/fix-issue-135
Fixing rounding bug in .gencat
2 parents: 44a187e + d6ae361

3 files changed

Lines changed: 22 additions & 19 deletions

R/generate_dist.R

Lines changed: 4 additions & 3 deletions
@@ -44,7 +44,7 @@
   formula = args$formula,
   variance = args$variance,
   link = args$link,
-  dfSim = copy(dfSim),
+  dtSim = copy(dfSim),
   envir = envir
 ),
 exponential = .genexp(
@@ -270,7 +270,7 @@
 # @param envir Environment the data definitions are evaluated in.
 # Defaults to [base::parent.frame].
 # @return A data.frame column with the updated simulated data
-.gencat <- function(n, formula, variance, link, dfSim, envir) {
+.gencat <- function(n, formula, variance, link, dtSim, envir) {
   formulas <- .splitFormula(formula)
 
   if (length(formulas) < 2) {
@@ -281,7 +281,7 @@
   }
 
   parsedProbs <-
-    .evalWith(formulas, .parseDotVars(formulas, envir), dfSim, n)
+    .evalWith(formulas, .parseDotVars(formulas, envir), dtSim, n)
 
   if (link == "logit") {
     parsedProbs <- exp(parsedProbs)
@@ -291,6 +291,7 @@
   }
 
   parsedProbs <- cbind(parsedProbs, 1 - rowSums(parsedProbs))
+  parsedProbs <- round(parsedProbs, 12) # to avoid extremely small p's
 
   c <- .Call(`_simstudy_matMultinom`, parsedProbs, PACKAGE = "simstudy")
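The effect of the added `round()` call can be illustrated in base R. This is a minimal sketch, not the package code: the input matrix is contrived so that the supplied probabilities overshoot 1 by a floating-point hair, and the names `raw` and `rounded` are illustrative only. Whether issue 135 involved exactly this kind of input is an assumption.

```r
# Contrived input: two probabilities whose sum exceeds 1 by roughly 1e-15.
parsedProbs <- cbind(0.5, 0.5 + 1e-15)

# Same step as in .gencat: append the remainder as the last category.
raw <- cbind(parsedProbs, 1 - rowSums(parsedProbs))

# The remainder is tiny and negative, so it is not a valid probability.
print(raw[1, 3] < 0)

# Rounding to 12 decimal places removes the floating-point residue.
rounded <- round(raw, 12)
print(rounded)
```

After rounding, every entry is non-negative and the row still sums to 1, which is presumably what the multinomial sampler downstream expects.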

man/genOrdCat.Rd

Lines changed: 17 additions & 15 deletions
Generated file; its diff is not rendered by default.

vignettes/simstudy.Rmd

Lines changed: 1 addition & 1 deletion
@@ -217,7 +217,7 @@ A *binomial* distribution is a discrete data distribution that represents the co
 
 #### categorical
 
-A *categorical* distribution is a discrete data distribution taking on values from $1$ to $K$, with each value representing a specific category, and there are $K$ categories. The categories may or may not be ordered. For a categorical variable with $k$ categories, the `formula` is a string of probabilities that sum to 1, each separated by a semi-colon: $(p_1 ; p_2 ; ... ; p_k)$. $p_1$ is the probability of the random variable falling in category $1$, $p_2$ is the probability of category $2$, etc. The probabilities can be specified as functions of other variables previously defined. The helper function `genCatFormula` is an easy way to create different probability strings. The `link` options are *identity* or *logit*. The `variance` field is optional an allows to provide categories other than the default `1...n` in the same format as `formula`: "a;b;c". Numeric variance Strings (e.g. "50;100;200") will be converted to numeric when possible.
+A *categorical* distribution is a discrete data distribution taking on values from $1$ to $K$, with each value representing a specific category, and there are $K$ categories. The categories may or may not be ordered. For a categorical variable with $k$ categories, the `formula` is a string of probabilities that sum to 1, each separated by a semi-colon: $(p_1 ; p_2 ; ... ; p_k)$. $p_1$ is the probability of the random variable falling in category $1$, $p_2$ is the probability of category $2$, etc. The probabilities can be specified as functions of other variables previously defined. The helper function `genCatFormula` is an easy way to create different probability strings. The `link` options are *identity* or *logit*. The `variance` field is optional an allows to provide categories other than the default `1...n` in the same format as `formula`: "a;b;c". Numeric variance Strings (e.g. "50;100;200") will be converted to numeric when possible. All probabilities will be rounded to 1e12 decimal points to prevent possible rounding errors.
 
 #### exponential
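As a rough illustration of the `formula` format described in the vignette paragraph, the following base R sketch splits a semicolon-separated probability string and draws categories from it. It does not use simstudy: `parseCatFormula` is a hypothetical helper for this example only, and it handles only literal numeric probabilities with the identity link. The rounding step mirrors the `round(parsedProbs, 12)` call in the `R/generate_dist.R` diff, which rounds to 12 decimal places (a precision of 1e-12) rather than "1e12 decimal points" as the added vignette sentence puts it.

```r
# Hypothetical helper (not part of simstudy): parse "p1;p2;...;pk".
parseCatFormula <- function(formula) {
  probs <- as.numeric(strsplit(formula, ";", fixed = TRUE)[[1]])
  probs <- round(probs, 12) # 12 decimal places, as in the code change
  stopifnot(!anyNA(probs), all(probs >= 0), isTRUE(all.equal(sum(probs), 1)))
  probs
}

probs <- parseCatFormula("0.2;0.3;0.5")

# Default categories are 1..k; custom labels use the same "a;b;c" format.
labels <- strsplit("a;b;c", ";", fixed = TRUE)[[1]]

set.seed(1)
x <- sample(labels, size = 1000, replace = TRUE, prob = probs)
print(table(x) / length(x))
```

The observed proportions should land near 0.2, 0.3 and 0.5; a string whose probabilities do not sum to 1 makes the helper stop with an error.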
