# Equations
## General
$x$ and $y$ are two samples with means $\bar{x}$ and $\bar{y}$, standard deviations $s_x$ and $s_y$, and sample sizes $n_x$ and $n_y$. In the case of paired samples, $x$ and $y$ are the two vectors of measurements (e.g., pre and post).
## T-Test
### Two-sample t-test
#### Assuming equal variances
The critical $t$ is calculated as:

$$t_c = t_{1 - \alpha/2,\ n_x + n_y - 2}$$

that is, finding the $1 - \alpha/2$ quantile of the $t$ distribution with $n_x + n_y - 2$ degrees of freedom.
The critical $d$ (effect size) can be calculated as:

$$d_c = \frac{t_c \cdot SE_{\bar{x} - \bar{y}}}{s_p} = t_c \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$$
The standard error of the difference $\bar{x} - \bar{y}$ needs to be calculated using one of the available equations. Assuming equal variances:

$$SE_{\bar{x} - \bar{y}} = s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$$

where $s_p$ is the pooled standard deviation:

$$s_p = \sqrt{\frac{(n_x - 1) s_x^2 + (n_y - 1) s_y^2}{n_x + n_y - 2}}$$
Some useful conversions:

$$d = t \sqrt{\frac{1}{n_x} + \frac{1}{n_y}} \qquad t = \frac{d}{\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$
Finding the pooled standard deviation from the standard error and sample sizes:

$$s_p = \frac{SE_{\bar{x} - \bar{y}}}{\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$
Example converting $t$ to $d$ and computing the critical $d$ value:
```r
n <- 100
x <- rnorm(n, 0.5, 1)
y <- rnorm(n, 0, 1)
fit <- t.test(x, y, var.equal = TRUE)

# d = t * sqrt(1/n_x + 1/n_y) is the same as Cohen's d with the pooled SD
d <- unname(fit$statistic) * sqrt(1/n + 1/n)
c(d, effectsize::cohens_d(x, y)$Cohens_d)
## [1] 0.4414722 0.4414722

# critical t with n_x + n_y - 2 degrees of freedom
tc <- qt(0.05/2, 2*n - 2)
abs(tc * fit$stderr) / effectsize::sd_pooled(x, y)
## [1] 0.2788854
abs(tc * sqrt(1/n + 1/n))
## [1] 0.2788854
```
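
As a sanity check, the pooled standard deviation can be recovered from the standard error returned by `t.test()`. A minimal sketch, reusing `fit`, `x`, and `y` from the chunk above:

```r
# s_p = SE / sqrt(1/n_x + 1/n_y); the two values should match
c(fit$stderr / sqrt(1/n + 1/n), effectsize::sd_pooled(x, y))
```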
### One-sample t-test
The critical value is calculated as usual. The degrees of freedom are $n - 1$ and the effect size is:

$$d = \frac{\bar{x} - \mu_0}{s_x}$$

where $\mu_0$ is the mean under the null hypothesis.
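
A minimal sketch of the one-sample case (the true mean of 0.3 and the null value $\mu_0 = 0$ are arbitrary choices for illustration):

```r
n <- 30
x <- rnorm(n, 0.3, 1)
fit <- t.test(x, mu = 0)

unname(fit$statistic) / sqrt(n)  # t = d * sqrt(n), hence d = t / sqrt(n)
mean(x) / sd(x)                  # the effect size computed directly
abs(qt(0.05/2, n - 1)) / sqrt(n) # critical d
```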
### Paired-sample t-test
Again, the critical value is calculated as usual. The degrees of freedom are $n - 1$, where $n$ is the number of pairs. The effect size can be calculated in several ways. The $d_z$ is calculated dividing the mean of the differences by the standard deviation of the differences:

$$d_z = \frac{\bar{x} - \bar{y}}{s_{x - y}}$$
Another version of the effect size is calculated in the same way as the two-sample version, thus dividing by the pooled standard deviation:

$$d = \frac{\bar{x} - \bar{y}}{s_p}$$
We can convert between the two formulations:

$$d_z = \frac{d}{\sqrt{2 (1 - r_{xy})}}$$

where $r_{xy}$ is the correlation between the two measures (the conversion is exact when $s_x = s_y$).
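
To illustrate the conversion, a minimal sketch with simulated paired data (`n`, `r`, and the mean difference are arbitrary choices; with `empirical = TRUE` and unit variances the two standard deviations are exactly equal, so the conversion is exact):

```r
n <- 50
r <- 0.7
X <- MASS::mvrnorm(n, c(0.5, 0), Sigma = matrix(c(1, r, r, 1), 2), empirical = TRUE)
x <- X[, 1]
y <- X[, 2]

dz <- mean(x - y) / sd(x - y)                   # dividing by the SD of the differences
d <- mean(x - y) / effectsize::sd_pooled(x, y)  # dividing by the pooled SD
c(dz, d / sqrt(2 * (1 - cor(x, y))))            # the two values should match
```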
The figure below depicts the relationship between the effect sizes calculated using the two methods. Clearly, the critical value will also depend on which effect size we use.
## Correlation test
The `cor.test()` function implements the test statistic based on the Student $t$ distribution:

$$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

with $n - 2$ degrees of freedom.
The critical $r$ can be calculated using:

$$r_c = \frac{t_c}{\sqrt{n - 2 + t_c^2}}$$
The same can be done using the Fisher transformation, which basically removes $r$ from the calculation of the standard error:

$$z = \tanh^{-1}(r) = \frac{1}{2} \log\left(\frac{1 + r}{1 - r}\right) \qquad SE_z = \frac{1}{\sqrt{n - 3}}$$
Then the test statistic follows a standard normal distribution:

$$z_{\text{test}} = \frac{\tanh^{-1}(r)}{1 / \sqrt{n - 3}} = \tanh^{-1}(r) \sqrt{n - 3}$$
The critical value is:

$$z_c = \frac{z_{1 - \alpha/2}}{\sqrt{n - 3}}$$

which can be mapped back to the correlation scale with the inverse transformation, $r_c = \tanh(z_c)$.
In R:
```r
# compound symmetry correlation matrix: 1 on the diagonal, r elsewhere
gen_sigma <- function(p, r){
  r + diag(1 - r, p)
}

alpha <- 0.05
n <- 30
r <- 0.5

# empirical = TRUE forces the sample correlation to be exactly r
X <- MASS::mvrnorm(n, c(0, 0), Sigma = gen_sigma(2, r), empirical = TRUE)
cor.test(X[, 1], X[, 2])
## 
##  Pearson's product-moment correlation
## 
## data:  X[, 1] and X[, 2]
## t = 3.0551, df = 28, p-value = 0.0049
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1704314 0.7289586
## sample estimates:
## cor 
## 0.5

se_r <- sqrt((1 - r^2)/(n - 2))
r / se_r # should be the same t as cor.test
## [1] 3.05505
tc <- qt(alpha/2, n - 2) # critical t with n - 2 degrees of freedom
(r_c <- abs(tc/sqrt(n - 2 + tc^2))) # critical correlation
## [1] 0.3610069
pt(r_c / sqrt((1 - r_c^2)/(n - 2)), n - 2, lower.tail = FALSE) * 2 # should be alpha
## [1] 0.05
```
Now using Fisher's approach:
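
A minimal sketch, reusing `alpha`, `n`, and `r` from the chunk above; `atanh()` and `tanh()` implement the Fisher transformation and its inverse:

```r
zf <- atanh(r)          # 0.5 * log((1 + r) / (1 - r))
se_z <- 1 / sqrt(n - 3) # standard error, free of r
zf / se_z               # approximately standard normal test statistic
2 * pnorm(abs(zf / se_z), lower.tail = FALSE) # two-sided p-value
zc <- qnorm(1 - alpha/2) / sqrt(n - 3)        # critical value on the z scale
tanh(zc)                # critical correlation
```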