formula-syntax • criticalESvalue

library(criticalESvalue)

In this vignette we illustrate a problem that arises when using the critical() function with some models. In particular, objects of class htest (e.g., those returned by t.test(), cor.test(), etc.) do not store the dataset within the output list. By contrast, the lm() function returns a list of class lm that contains the data used in the fit (the model frame, $model).

fit_ttest <- t.test(mpg ~ am, data = mtcars)
fit_lm <- lm(Sepal.Length ~ Petal.Width, data = iris)

class(fit_ttest)
#> [1] "htest"
class(fit_lm)
#> [1] "lm"

str(fit_ttest, max.level = 1)
#> List of 10
#>  $ statistic  : Named num -3.77
#>   ..- attr(*, "names")= chr "t"
#>  $ parameter  : Named num 18.3
#>   ..- attr(*, "names")= chr "df"
#>  $ p.value    : num 0.00137
#>  $ conf.int   : num [1:2] -11.28 -3.21
#>   ..- attr(*, "conf.level")= num 0.95
#>  $ estimate   : Named num [1:2] 17.1 24.4
#>   ..- attr(*, "names")= chr [1:2] "mean in group 0" "mean in group 1"
#>  $ null.value : Named num 0
#>   ..- attr(*, "names")= chr "difference in means between group 0 and group 1"
#>  $ stderr     : num 1.92
#>  $ alternative: chr "two.sided"
#>  $ method     : chr "Welch Two Sample t-test"
#>  $ data.name  : chr "mpg by am"
#>  - attr(*, "class")= chr "htest"
head(fit_lm$model)
#>   Sepal.Length Petal.Width
#> 1          5.1         0.2
#> 2          4.9         0.2
#> 3          4.7         0.2
#> 4          4.6         0.2
#> 5          5.0         0.2
#> 6          5.4         0.4
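
The difference can also be checked programmatically. The following is a minimal sketch using base R only; it refits the two models from above so that it runs on its own, and it looks at the list elements rather than at the printed structure.

```r
fit_ttest <- t.test(mpg ~ am, data = mtcars)
fit_lm <- lm(Sepal.Length ~ Petal.Width, data = iris)

# lm objects keep the model frame in the "model" element
"model" %in% names(fit_lm)
#> [1] TRUE

# htest objects keep only a character label describing the data
fit_ttest$data.name
#> [1] "mpg by am"

# no element of the htest list is a data frame
any(vapply(fit_ttest, is.data.frame, logical(1)))
#> [1] FALSE
```

The data.name element is just a string built from the call, so the original observations cannot be recovered from the htest object alone.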

The criticalESvalue package, in particular the critical.* functions, uses the insight::get_data() function to retrieve the source dataset from the environment or the output list.

This works perfectly when the object contains the dataset as in lm:

head(insight::get_data(fit_lm))
#>   Sepal.Length Petal.Width
#> 1          5.1         0.2
#> 2          4.9         0.2
#> 3          4.7         0.2
#> 4          4.6         0.2
#> 5          5.0         0.2
#> 6          5.4         0.4

However, for the htest class the function sometimes fails. In particular, when using, e.g., t.test() with the formula syntax y ~ x and the data = argument, insight::get_data() is not able to retrieve the dataset and returns NULL. In that case critical() stops with an error message suggesting a change to the function call.

insight::get_data(fit_ttest)
#> NULL
critical(fit_ttest)
#> Error in critical.htest(fit_ttest): insight::get_data(x) returning NULL. Are you using the formula syntax (y ~ x) with the data = argument? This syntax is not supported yet. See vignette('formula-syntax', package = 'criticalESvalue'). Try to call the function without the 'data = ' argument
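
If you prefer to detect the failure above before calling critical(), one possible approach is to test whether the data can be retrieved at all. The sketch below is only an illustration: data_is_retrievable() is a hypothetical helper, not part of criticalESvalue or insight, and the exact behaviour of insight::get_data() on htest objects may vary across insight versions.

```r
fit_ttest <- t.test(mpg ~ am, data = mtcars)

# Hypothetical helper (not part of criticalESvalue or insight):
# returns TRUE when insight::get_data() gives back something other than NULL.
data_is_retrievable <- function(fit) {
  # any error raised while retrieving the data is treated as "no data"
  dat <- tryCatch(insight::get_data(fit), error = function(e) NULL)
  !is.null(dat)
}

# TRUE means the data could be retrieved; FALSE means it could not
data_is_retrievable(fit_ttest)
```

Given the NULL shown above, the call is expected to return FALSE for fit_ttest, which signals that the t.test() call should be rewritten before using critical().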

To fix this problem, you can simply change the function call. There are several options:

# formula without data
fit_ttest_1 <- t.test(mtcars$mpg ~ mtcars$am)

# x and y without data
fit_ttest_2 <- t.test(mtcars$mpg[mtcars$am == 0], mtcars$mpg[mtcars$am == 1]) 

# creating variables in the global environment, not recommended

x <- mtcars$mpg[mtcars$am == 0]
y <- mtcars$mpg[mtcars$am == 1]
fit_ttest_3 <- t.test(x, y)

With these options, the critical() function will not fail:

critical(fit_ttest_1)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  mtcars$mpg by mtcars$am
#> t = -3.7671, df = 18.332, p-value = 0.001374
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  -11.280194  -3.209684
#> sample estimates:
#> mean in group 0 mean in group 1 
#>        17.14737        24.39231 
#> 
#> |== Effect Size and Critical Value ==| 
#> d = -1.411046 dc = ± 0.7552184 bc = ± 4.035255 
#> g = -1.352384 gc = ± 0.7238213
critical(fit_ttest_2)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  mtcars$mpg[mtcars$am == 0] and mtcars$mpg[mtcars$am == 1]
#> t = -3.7671, df = 18.332, p-value = 0.001374
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -11.280194  -3.209684
#> sample estimates:
#> mean of x mean of y 
#>  17.14737  24.39231 
#> 
#> |== Effect Size and Critical Value ==| 
#> d = -1.411046 dc = ± 0.7552184 bc = ± 4.035255 
#> g = -1.352384 gc = ± 0.7238213
critical(fit_ttest_3)
#> 
#>  Welch Two Sample t-test
#> 
#> data:  x and y
#> t = -3.7671, df = 18.332, p-value = 0.001374
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -11.280194  -3.209684
#> sample estimates:
#> mean of x mean of y 
#>  17.14737  24.39231 
#> 
#> |== Effect Size and Critical Value ==| 
#> d = -1.411046 dc = ± 0.7552184 bc = ± 4.035255 
#> g = -1.352384 gc = ± 0.7238213
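
As a quick sanity check, the rewritten calls fit exactly the same test as the original formula-plus-data call. The sketch below uses base R only and refits the models so that it runs on its own; it compares the test statistic, degrees of freedom and p-value across the different ways of writing the call.

```r
# original call (formula with data =) and two of the supported alternatives
fit_ttest   <- t.test(mpg ~ am, data = mtcars)
fit_ttest_1 <- t.test(mtcars$mpg ~ mtcars$am)
fit_ttest_2 <- t.test(mtcars$mpg[mtcars$am == 0], mtcars$mpg[mtcars$am == 1])

# the statistic, degrees of freedom and p-value are identical across calls
isTRUE(all.equal(fit_ttest$statistic, fit_ttest_1$statistic))
isTRUE(all.equal(fit_ttest$statistic, fit_ttest_2$statistic))
isTRUE(all.equal(fit_ttest$parameter, fit_ttest_2$parameter))
isTRUE(all.equal(fit_ttest$p.value, fit_ttest_2$p.value))

# only the label describing the data differs between the objects
c(fit_ttest$data.name, fit_ttest_1$data.name, fit_ttest_2$data.name)
```

Changing the syntax therefore affects only how the data are referenced, not the statistical results.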

If we find a stable and reliable way to handle htest objects, this behaviour may change in future versions of the package.