Power Analysis on Chicks' Numerical Abilities

Psicostat Workshops

Authors

Kimberly Brosche

Filippo Gambarota

Enrico Toffalini

Gianmarco Altoè

Rosa Rugani

Published

March 7, 2024

library(tidyverse)
library(lme4)
library(effects)
library(effectsize)
library(emmeans)
library(kableExtra)
library(here)
library(ggeffects)
simfiles <- list.files(here("objects"), include.dirs = FALSE, pattern = "_simres", full.names = TRUE)
simfile <- simfiles[length(simfiles)] # reading the last one
sim <- readRDS(simfile)
simres <- sim |> 
    unnest(res) |> 
    group_by(ns, nt, sb0, N, model) |> 
    nest()

General idea

The general idea was to estimate the statistical power for an experiment on numerical abilities in 3 different breeds of chicks. The experiment is organized as follows:

  • 3 different breeds of chicks
  • 5 different comparison conditions

Given the available resources, we can collect a maximum of 20 chicks per breed (60 in total) and run 4 trials per chick per condition (20 trials per chick in total).

The dependent variable is the binary (0 or 1) choice on each trial. For this reason we decided to use a mixed-effects logistic regression with a by-chick random intercept and breed, comparison, and their interaction as fixed effects.
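A minimal sketch of such a model in R, assuming a hypothetical trial-level data frame dat with a binary choice column, the factors breed and comparison, and a chick identifier id:

# minimal sketch, assuming a trial-level data frame `dat` with a binary
# `choice` column (0/1), factors `breed` and `comparison`, and a chick id
fit <- glmer(choice ~ breed * comparison + (1 | id),
             data = dat,
             family = binomial(link = "logit"))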

Which model would you use here?

As a little spoiler, below is a realistic dataset for this experiment. Which model do you consider appropriate? Which model would you use?

id breed comparison nt nc nf acc
1 ross308 2vs3 4 2 2 0.5
1 ross308 3vs4 4 0 4 0
1 ross308 5vs10 4 1 3 0.25
1 ross308 3vs6 4 3 1 0.75
1 ross308 6vs9 4 2 2 0.5
2 ross308 2vs3 4 3 1 0.75
2 ross308 3vs4 4 4 0 1
... ... ... ... ... ... ...
59 gallo_bankiva 3vs6 4 3 1 0.75
59 gallo_bankiva 6vs9 4 3 1 0.75
60 gallo_bankiva 2vs3 4 3 1 0.75
60 gallo_bankiva 3vs4 4 2 2 0.5
60 gallo_bankiva 5vs10 4 3 1 0.75
60 gallo_bankiva 3vs6 4 3 1 0.75
60 gallo_bankiva 6vs9 4 4 0 1

How to choose the effect sizes?

Compared to the previous workshops, here we have a complex 3b x 5w (b for between and w for within) design. There are two effects of interest:

  • the interaction breed:comparison. This means that we expect the difference between two breeds to differ across conditions. Clearly, compared to a simple 2x2 interaction, here we have multiple differences of differences that we can evaluate.
  • the main effect of breed, where more domesticated breeds should be less accurate than less domesticated breeds. In this case, there are 3 possible pairwise comparisons that we can make.

We decided to work with the expected probability for each cell of the design, rather than fixing an overall effect size or using another approach. For this reason we asked for plausible values for each cell, based on the literature and previous studies.
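The values were collected in a hyp_table data frame with one row per breed/comparison cell. A minimal sketch of how it could be built (the actual definition appears, in wide format, in the simulation script at the end of the document):

# sketch of hyp_table: one row per breed/comparison cell with the assumed probability
hyp_table <- expand_grid(
    comparison = c("2vs3", "3vs4", "5vs10", "3vs6", "6vs9"),
    breed = c("ross308", "gallina_padovana", "gallo_bankiva")
)
hyp_table$breed <- factor(hyp_table$breed,
                          levels = c("ross308", "gallina_padovana", "gallo_bankiva"))
hyp_table$p <- c(0.56, 0.67, 0.725,   # 2vs3
                 0.50, 0.50, 0.500,   # 3vs4
                 0.62, 0.73, 0.785,   # 5vs10
                 0.65, 0.82, 0.875,   # 3vs6
                 0.55, 0.66, 0.705)   # 6vs9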

mtab(pivot_wider(hyp_table, names_from = breed, values_from = p))
comparison ross308 gallina_padovana gallo_bankiva
2vs3 0.56 0.67 0.725
3vs4 0.50 0.50 0.500
5vs10 0.62 0.73 0.785
3vs6 0.65 0.82 0.875
6vs9 0.55 0.66 0.705

The pattern can also be seen in a plot:

hyp_table$comparison <- factor(hyp_table$comparison, levels = c("3vs4", "6vs9", "2vs3", "5vs10", "3vs6"))
hyp_table |> 
    ggplot(aes(x = comparison, y = p, color = breed, group = breed)) +
    geom_point(size = 3) +
    geom_line() +
    ylab("Probability") +
    xlab("Comparison")

The ross308 is the most domesticated breed and the gallo_bankiva is the least domesticated one. The 3vs4 condition is considered a baseline because each breed is assumed to perform at chance.

Simulating the fixed-effects

To simulate data from a model we need to write the model equation (i.e., the linear combination of predictors and coefficients). For simple models this is straightforward and also very educational. For very complex models it can be inefficient and confusing. For this reason we decided to work with the matrix formulation of the linear model.

Let’s see an example with a 2x2 design (no logistic, no mixed-effects, just to show the idea):

Code Block 1: Linear model with manual formula
N <- 100 # total sample size
x1 <- c(0, 1) # x1 variable
x2 <- c(0, 1) # x2 variable

# parameters
b0 <- 0 # intercept
b1 <- 0.5
b2 <- 0.1
b3 <- 0.3 # interaction

# using model equation

dat <- data.frame(id = 1:N, x1 = rep(x1, each = N/2), x2 = rep(x2, N/2))
dat$lp <- with(dat, b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2)
head(dat)
  id x1 x2  lp
1  1  0  0 0.0
2  2  0  1 0.1
3  3  0  0 0.0
4  4  0  1 0.1
5  5  0  0 0.0
6  6  0  1 0.1
ggplot(dat, aes(x = factor(x1), color = factor(x2), y = lp)) +
    geom_point(size = 4) +
    geom_line(aes(group = factor(x2))) +
    ylab("Linear Predictor") +
    xlab("x1") +
    labs(color = "x2")

The lp column contains the true value of the dependent variable (i.e., without error) for each subject/condition.

We can obtain the same result using the matrix formulation of the linear model as seen in Equation 1.

\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} \tag{1}\]

Thus the linear predictor can be obtained by multiplying (%*% is the matrix multiplication operator in R) the model matrix by the vector of coefficients.

Code Block 2: Building the model matrix
X <- model.matrix(~ x1 * x2, data = dat)
head(X)
##   (Intercept) x1 x2 x1:x2
## 1           1  0  0     0
## 2           1  0  1     0
## 3           1  0  0     0
## 4           1  0  1     0
## 5           1  0  0     0
## 6           1  0  1     0
Code Block 3: Linear model with matrix multiplication
B <- c(b0, b1, b2, b3)
lp <- c(X %*% B) # c() to obtain a vector
all(lp == dat$lp)
## [1] TRUE

The great aspect is that, regardless of the model complexity, the multiplication in Code Block 3 stays the same. On the other hand, the code in Code Block 1 will grow in complexity.

The problem is that defining the \(\boldsymbol{\beta}\) vector is not easy for a complex design. The trick is to use the inverse formula (see Equation 2): fixing \(\mathbf{Y}\) (i.e., the values assumed in the cells) and the model matrix \(\mathbf{X}\), we can find the \(\boldsymbol{\beta}\) vector. In this way, regardless of the contrast coding in \(\mathbf{X}\), we still obtain a valid set of model coefficients.

\[ \boldsymbol{\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y} \tag{2}\]

Still using the previous example, we can recover the vector of coefficients with the inverse formula.

B_new <- solve(t(X) %*% X) %*% t(X) %*% lp
data.frame(B, B_new)
              B B_new
(Intercept) 0.0   0.0
x1          0.5   0.5
x2          0.1   0.1
x1:x2       0.3   0.3

For the chicks case, there is another source of complexity related to using a logistic regression. Without going into details, this approach for the fixed part works only on the logit (i.e., link function) scale, thus we need to transform the probabilities into logits.
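As a quick reminder, qlogis() is the logit (link) function and plogis() is its inverse:

qlogis(0.65)         # log(0.65 / 0.35), about 0.62
plogis(qlogis(0.65)) # back to 0.65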

lp <- qlogis(hyp_table$p) # vector of probabilities in logit
X <- model.matrix(~comparison * breed, data = hyp_table, contrasts.arg = list(comparison = contr.sum(5), breed = contr.sum(3))) # setting sum to 0 contrasts, not mandatory but highly suggested
head(X)
  (Intercept) comparison1 comparison2 comparison3 comparison4 breed1 breed2
1           1           0           0           1           0      1      0
2           1           0           0           1           0      0      1
3           1           0           0           1           0     -1     -1
4           1           1           0           0           0      1      0
5           1           1           0           0           0      0      1
6           1           1           0           0           0     -1     -1
  comparison1:breed1 comparison2:breed1 comparison3:breed1 comparison4:breed1
1                  0                  0                  1                  0
2                  0                  0                  0                  0
3                  0                  0                 -1                  0
4                  1                  0                  0                  0
5                  0                  0                  0                  0
6                 -1                  0                  0                  0
  comparison1:breed2 comparison2:breed2 comparison3:breed2 comparison4:breed2
1                  0                  0                  0                  0
2                  0                  0                  1                  0
3                  0                  0                 -1                  0
4                  0                  0                  0                  0
5                  1                  0                  0                  0
6                 -1                  0                  0                  0
B <- solve(t(X) %*% X) %*% t(X) %*% lp
B
                           [,1]
(Intercept)         0.700963225
comparison1        -0.700963225
comparison2        -0.122567438
comparison3        -0.061380667
comparison4         0.225442272
breed1             -0.390879187
breed2              0.075526643
comparison1:breed1  0.390879187
comparison2:breed1  0.013154096
comparison3:breed1 -0.007541313
comparison4:breed1 -0.045978084
comparison1:breed2 -0.075526643
comparison2:breed2  0.009371788
comparison3:breed2 -0.006924143
comparison4:breed2 -0.007309565

B is our vector of parameters. We don’t care about how to interpret each parameter, because the interpretation depends on the contrast coding, but we know that these parameters will reconstruct the vector of probabilities.

all.equal(plogis(c(X %*% B)), hyp_table$p)
[1] TRUE

Simulating the random-effects

Choosing the standard deviation of the random intercept \(\sigma_{\beta_0}\) is not easy. We can try different values and see their impact. The random-effects model has a slightly different formulation: \(\boldsymbol{\delta}\) is the vector of by-chick adjustments to the global intercept. The interpretation is that each chick has a true probability of success that can be lower or higher than the grand mean.

\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\delta} \]

To show the impact of the random effect we can simulate several values centered on \(p = 0.5\) and see how the variability changes:

ns <- 20 # number of chicks in each breed
nt <- 4  # number of trials in each condition
N <- ns * length(unique(hyp_table$breed))
sb0 <- c(0.1, 0.5, 1, 2)
b0i <- lapply(sb0, function(s) rnorm(1e3, 0, s))

par(mfrow = c(1, length(sb0)))
for(i in 1:length(sb0)) hist(plogis(qlogis(0.5) + b0i[[i]]), breaks = 30, col = "dodgerblue", main = latex2exp::TeX(sprintf("$\\sigma_{\\beta_0} = %s$", sb0[i])), xlab = "p", cex.main = 2, cex.lab = 1.5, cex.axis = 1.5, xlim = c(0, 1))

We can choose 0.5 as standard deviation and simulate some data:

ns <- 1e3 # number of chicks in each breed
nt <- 4  # number of trials in each condition
N <- ns * length(unique(hyp_table$breed))
between <- data.frame(
    id = 1:N,
    breed = rep(unique(hyp_table$breed), each = ns)
)
exdata <- expand_grid(between, comparison = unique(hyp_table$comparison))
exdata$nt <- nt
head(exdata)
# A tibble: 6 × 4
     id breed   comparison    nt
  <int> <fct>   <fct>      <dbl>
1     1 ross308 2vs3           4
2     1 ross308 3vs4           4
3     1 ross308 5vs10          4
4     1 ross308 3vs6           4
5     1 ross308 6vs9           4
6     2 ross308 2vs3           4
X <- model.matrix(~ comparison * breed, 
                  data = exdata, 
                  contrasts.arg = list(comparison = contr.sum(5), 
                                       breed = contr.sum(3)))
sb0 <- 0.5 # standard deviation random intercepts
delta <- rnorm(N, 0, sb0)[exdata$id] # random effect
lp_fixed <- X %*% B
exdata$p <- c(plogis(lp_fixed + delta)) # true probability for each chick/condition
hyp_table$lp <- qlogis(hyp_table$p)
hyp_table_l <- pivot_longer(hyp_table, c(p, lp))
hyp_table_l <- mutate(hyp_table_l, name = ifelse(name == "p", "Probability", "Logit"))

exdata |> 
    slice_sample(n = 1000) |> 
    mutate(lp = qlogis(p)) |> 
    pivot_longer(c(p, lp)) |> 
    mutate(name = ifelse(name == "p", "Probability", "Logit")) |> 
    ggplot(aes(x = comparison, y = value, color = breed)) +
    geom_point(position = position_jitterdodge(jitter.width = 0.15, dodge.width = 0.3),
               alpha = 0.3) +
    geom_point(data = hyp_table_l, 
               aes(x = comparison, y = value, color = breed),
               size = 5,
               position = position_dodge(width = 0.3)) +
    geom_line(data = hyp_table_l,
              aes(x = comparison, y = value, color = breed, group = breed),
              position = position_dodge(width = 0.3)) +
    facet_wrap(~name, scales = "free", ncol = 1) +
    theme(axis.title.y = element_blank(),
          legend.title = element_blank())

Simulating the error

So far we simulated (or better, calculated) the linear predictor, i.e., the true values for each chick/condition. To simulate the random error of the model we need to sample from the appropriate probability distribution. Given that we have a binary response variable, we can sample from a Binomial distribution.

Let’s simulate again an example dataset with the correct structure:

ns <- 20 # number of chicks in each breed
N <- ns * length(unique(hyp_table$breed))
nt <- 4  # number of trials in each condition
between <- data.frame(
    id = 1:N,
    breed = rep(unique(hyp_table$breed), each = ns)
)
exdata <- expand_grid(between, comparison = unique(hyp_table$comparison))
exdata$nt <- nt
head(exdata)
# A tibble: 6 × 4
     id breed   comparison    nt
  <int> <fct>   <fct>      <dbl>
1     1 ross308 2vs3           4
2     1 ross308 3vs4           4
3     1 ross308 5vs10          4
4     1 ross308 3vs6           4
5     1 ross308 6vs9           4
6     2 ross308 2vs3           4
contrasts(exdata$comparison) <- contr.sum(length(levels(exdata$comparison)))
contrasts(exdata$breed) <- contr.sum(length(levels(exdata$breed)))

X <- model.matrix(~ comparison * breed, data = exdata)
sb0 <- 0.5 # standard deviation random intercepts
delta <- rnorm(N, 0, sb0)[exdata$id] # random effect
lp_fixed <- X %*% B
exdata$p <- c(plogis(lp_fixed + delta)) # true probability for each chick/condition
head(exdata)
# A tibble: 6 × 5
     id breed   comparison    nt     p
  <int> <fct>   <fct>      <dbl> <dbl>
1     1 ross308 2vs3           4 0.670
2     1 ross308 3vs4           4 0.615
3     1 ross308 5vs10          4 0.723
4     1 ross308 3vs6           4 0.748
5     1 ross308 6vs9           4 0.661
6     2 ross308 2vs3           4 0.549

We can use the rbinom() function where n is the number of samples, size is the number of trials and prob is the vector of true probabilities.

Important

Note that this simulates aggregated data (number of successes instead of 0-1 values). This is equivalent to simulating the binary variable, but the model fit will be faster. If you don’t have trial-level predictors (e.g., reaction times for each trial) you can use aggregated data, otherwise you must use binary data.
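For completeness, a minimal sketch of the trial-level (0/1) alternative mentioned above, assuming the exdata object just created (trial_data, y, and fit_bin are just illustrative names; the resulting model is equivalent to the aggregated one, only slower to fit):

# sketch of the binary (trial-level) alternative: one row per trial
trial_data <- exdata[rep(1:nrow(exdata), exdata$nt), ]
trial_data$y <- rbinom(nrow(trial_data), size = 1, prob = trial_data$p)
fit_bin <- glmer(y ~ comparison * breed + (1 | id),
                 data = trial_data,
                 family = binomial(link = "logit"))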

exdata$nc <- rbinom(n = nrow(exdata), size = exdata$nt, prob = exdata$p) # number of successes
exdata$nf <- exdata$nt - exdata$nc # number of fails

head(exdata)
# A tibble: 6 × 7
     id breed   comparison    nt     p    nc    nf
  <int> <fct>   <fct>      <dbl> <dbl> <int> <dbl>
1     1 ross308 2vs3           4 0.670     3     1
2     1 ross308 3vs4           4 0.615     1     3
3     1 ross308 5vs10          4 0.723     1     3
4     1 ross308 3vs6           4 0.748     3     1
5     1 ross308 6vs9           4 0.661     3     1
6     2 ross308 2vs3           4 0.549     3     1

Model fitting with lme4::glmer()

Finally we can fit the model using lme4::glmer(). Note that we write cbind(nc, nf) ~ ., which is the way to specify a logistic regression with aggregated (binomial) data.

fit <- glmer(cbind(nc, nf) ~ comparison * breed + (1|id), 
             data = exdata, 
             family = binomial(link = "logit"), glmerControl(optimizer = "bobyqa"))
summary(fit)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: cbind(nc, nf) ~ comparison * breed + (1 | id)
   Data: exdata
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
   838.7    897.9   -403.3    806.7      284 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.6953 -0.6482  0.0063  0.7809  2.1319 

Random effects:
 Groups Name        Variance Std.Dev.
 id     (Intercept) 0.2677   0.5174  
Number of obs: 300, groups:  id, 60

Fixed effects:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)         0.62662    0.09331   6.715 1.88e-11 ***
comparison1        -0.60895    0.12228  -4.980 6.36e-07 ***
comparison2         0.10345    0.12890   0.803   0.4222    
comparison3        -0.14698    0.12473  -1.178   0.2386    
comparison4         0.08456    0.12723   0.665   0.5063    
breed1             -0.33338    0.12974  -2.570   0.0102 *  
breed2             -0.14933    0.13043  -1.145   0.2523    
comparison1:breed1  0.15108    0.17153   0.881   0.3784    
comparison2:breed1  0.09383    0.17785   0.528   0.5978    
comparison3:breed1  0.01301    0.17344   0.075   0.9402    
comparison4:breed1  0.11272    0.17664   0.638   0.5234    
comparison1:breed2  0.02702    0.17133   0.158   0.8747    
comparison2:breed2 -0.26182    0.17682  -1.481   0.1387    
comparison3:breed2 -0.01138    0.17377  -0.065   0.9478    
comparison4:breed2 -0.07948    0.17641  -0.451   0.6523    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We have a lot of parameters, so the best approach is always to plot the effects:

eff <- data.frame(ggeffect(fit, terms = ~ breed * comparison))
ggplot(eff, aes(x = group, y = predicted, color = x)) +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high),
                    position = position_dodge(width = 0.3)) +
    ggtitle("breed x comparison") + 
    theme(axis.title.x = element_blank(),
          legend.title = element_blank()) +
    ylab("Probability")

A little bit of variability

Let’s repeat the simulation, say, 3 times and see the results:

res <- vector(mode = "list", 3)
for(i in 1:length(res)){
    delta <- rnorm(N, 0, sb0)[exdata$id] # random effect
    lp_fixed <- X %*% B
    exdata$p <- c(plogis(lp_fixed + delta)) # true probability for each chick/condition
    exdata$nc <- rbinom(n = nrow(exdata), size = exdata$nt, prob = exdata$p) # number of successes
    exdata$nf <- exdata$nt - exdata$nc # number of fails
    fit <- glmer(cbind(nc, nf) ~ comparison * breed + (1|id), 
             data = exdata, 
             family = binomial(link = "logit"), glmerControl(optimizer = "bobyqa"))
    res[[i]] <- data.frame(ggeffect(fit, terms = ~ breed * comparison))
}

bind_rows(res, .id = "nsim") |> 
    ggplot(aes(x = group, y = predicted, color = x)) +
    geom_point(position = position_jitter(width = 0.1), size = 3)

Interaction and post-hoc

The interaction can be assessed with a likelihood ratio test comparing the models with and without the interaction, or more quickly with Wald chi-square tests. Here we use car::Anova(), which by default returns Type II Wald chi-square tests:

car::Anova(fit)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: cbind(nc, nf)
                   Chisq Df Pr(>Chisq)    
comparison       38.5628  4  8.576e-08 ***
breed            14.0347  2  0.0008962 ***
comparison:breed  9.4396  8  0.3065798    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
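If we want an actual likelihood ratio test, a minimal sketch is to refit the model without the interaction and compare the two nested models with anova() (fit_add is just an illustrative name):

fit_add <- update(fit, . ~ . - comparison:breed)  # refit without the interaction
anova(fit_add, fit)                               # likelihood ratio test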

Then we can examine the actual contrasts of the interaction using emmeans().

In this way we see the differences between breeds within each condition:

emmeans(fit, pairwise ~ breed|comparison)$contrast
comparison = 3vs4:
 contrast                         estimate    SE  df z.ratio p.value
 ross308 - gallina_padovana         -0.433 0.369 Inf  -1.173  0.4693
 ross308 - gallo_bankiva            -0.110 0.369 Inf  -0.297  0.9525
 gallina_padovana - gallo_bankiva    0.324 0.370 Inf   0.876  0.6556

comparison = 6vs9:
 contrast                         estimate    SE  df z.ratio p.value
 ross308 - gallina_padovana         -0.776 0.376 Inf  -2.065  0.0973
 ross308 - gallo_bankiva            -0.965 0.382 Inf  -2.527  0.0309
 gallina_padovana - gallo_bankiva   -0.188 0.389 Inf  -0.484  0.8788

comparison = 2vs3:
 contrast                         estimate    SE  df z.ratio p.value
 ross308 - gallina_padovana         -0.737 0.379 Inf  -1.944  0.1266
 ross308 - gallo_bankiva            -0.503 0.374 Inf  -1.342  0.3716
 gallina_padovana - gallo_bankiva    0.235 0.385 Inf   0.610  0.8149

comparison = 5vs10:
 contrast                         estimate    SE  df z.ratio p.value
 ross308 - gallina_padovana         -0.746 0.381 Inf  -1.956  0.1233
 ross308 - gallo_bankiva            -0.881 0.386 Inf  -2.281  0.0585
 gallina_padovana - gallo_bankiva   -0.135 0.398 Inf  -0.340  0.9382

comparison = 3vs6:
 contrast                         estimate    SE  df z.ratio p.value
 ross308 - gallina_padovana         -1.098 0.419 Inf  -2.620  0.0239
 ross308 - gallo_bankiva            -1.644 0.464 Inf  -3.542  0.0012
 gallina_padovana - gallo_bankiva   -0.546 0.498 Inf  -1.096  0.5164

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 

In this way we see the differences between conditions within each breed:

emmeans(fit, pairwise ~ comparison|breed)$contrast
breed = ross308:
 contrast     estimate    SE  df z.ratio p.value
 3vs4 - 6vs9   -0.1068 0.327 Inf  -0.327  0.9975
 3vs4 - 2vs3   -0.2669 0.327 Inf  -0.816  0.9258
 3vs4 - 5vs10  -0.3205 0.327 Inf  -0.979  0.8648
 3vs4 - 3vs6   -0.7052 0.332 Inf  -2.121  0.2109
 6vs9 - 2vs3   -0.1601 0.327 Inf  -0.490  0.9883
 6vs9 - 5vs10  -0.2137 0.327 Inf  -0.653  0.9660
 6vs9 - 3vs6   -0.5984 0.332 Inf  -1.802  0.3722
 2vs3 - 5vs10  -0.0536 0.327 Inf  -0.164  0.9998
 2vs3 - 3vs6   -0.4383 0.332 Inf  -1.320  0.6789
 5vs10 - 3vs6  -0.3847 0.332 Inf  -1.158  0.7755

breed = gallina_padovana:
 contrast     estimate    SE  df z.ratio p.value
 3vs4 - 6vs9   -0.4500 0.336 Inf  -1.338  0.6674
 3vs4 - 2vs3   -0.5707 0.340 Inf  -1.680  0.4464
 3vs4 - 5vs10  -0.6329 0.342 Inf  -1.852  0.3435
 3vs4 - 3vs6   -1.3703 0.380 Inf  -3.611  0.0028
 6vs9 - 2vs3   -0.1207 0.347 Inf  -0.348  0.9969
 6vs9 - 5vs10  -0.1830 0.349 Inf  -0.525  0.9849
 6vs9 - 3vs6   -0.9203 0.386 Inf  -2.387  0.1187
 2vs3 - 5vs10  -0.0622 0.352 Inf  -0.177  0.9998
 2vs3 - 3vs6   -0.7996 0.388 Inf  -2.059  0.2381
 5vs10 - 3vs6  -0.7374 0.390 Inf  -1.891  0.3222

breed = gallo_bankiva:
 contrast     estimate    SE  df z.ratio p.value
 3vs4 - 6vs9   -0.9620 0.343 Inf  -2.808  0.0400
 3vs4 - 2vs3   -0.6599 0.334 Inf  -1.975  0.2785
 3vs4 - 5vs10  -1.0919 0.347 Inf  -3.143  0.0144
 3vs4 - 3vs6   -2.2400 0.429 Inf  -5.218  <.0001
 6vs9 - 2vs3    0.3021 0.348 Inf   0.869  0.9083
 6vs9 - 5vs10  -0.1299 0.360 Inf  -0.361  0.9964
 6vs9 - 3vs6   -1.2779 0.438 Inf  -2.917  0.0292
 2vs3 - 5vs10  -0.4320 0.352 Inf  -1.226  0.7360
 2vs3 - 3vs6   -1.5800 0.432 Inf  -3.654  0.0024
 5vs10 - 3vs6  -1.1480 0.442 Inf  -2.600  0.0704

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 5 estimates 

We have a lot of tests, but at the same time we have a clear hypothesis that “ross < gallina < gallo”, thus we can test this hypothesis directly:

emm <- emmeans(fit, pairwise ~ breed|comparison)$contrast
contr <- list(
    "padovana_vs_ross" = c(-1, 1, 0),
    "gallo_vs_padovana" = c(0, -1, 1)
)
emmeans::contrast(emm, method = contr)
comparison = 3vs4:
 contrast          estimate    SE  df z.ratio p.value
 padovana_vs_ross     0.324 0.370 Inf   0.876  0.3811
 gallo_vs_padovana    0.433 0.369 Inf   1.173  0.2408

comparison = 6vs9:
 contrast          estimate    SE  df z.ratio p.value
 padovana_vs_ross    -0.188 0.389 Inf  -0.484  0.6283
 gallo_vs_padovana    0.776 0.376 Inf   2.065  0.0390

comparison = 2vs3:
 contrast          estimate    SE  df z.ratio p.value
 padovana_vs_ross     0.235 0.385 Inf   0.610  0.5422
 gallo_vs_padovana    0.737 0.379 Inf   1.944  0.0519

comparison = 5vs10:
 contrast          estimate    SE  df z.ratio p.value
 padovana_vs_ross    -0.135 0.398 Inf  -0.340  0.7338
 gallo_vs_padovana    0.746 0.381 Inf   1.956  0.0505

comparison = 3vs6:
 contrast          estimate    SE  df z.ratio p.value
 padovana_vs_ross    -0.546 0.498 Inf  -1.096  0.2730
 gallo_vs_padovana    1.098 0.419 Inf   2.620  0.0088

Results are given on the log odds ratio (not the response) scale. 

Another option could be to first compare the domesticated vs the non-domesticated breeds and then compare the two non-domesticated breeds:

contr <- list("domesticated_vs_nondomesticated" = c(1, -0.5, -0.5),
              "gallo_vs_padovana" = c(0, -1, 1))
emmeans::contrast(emm, method = contr)
comparison = 3vs4:
 contrast                        estimate    SE  df z.ratio p.value
 domesticated_vs_nondomesticated  -0.5404 0.489 Inf  -1.105  0.2692
 gallo_vs_padovana                 0.4333 0.369 Inf   1.173  0.2408

comparison = 6vs9:
 contrast                        estimate    SE  df z.ratio p.value
 domesticated_vs_nondomesticated  -0.1999 0.510 Inf  -0.392  0.6950
 gallo_vs_padovana                 0.7765 0.376 Inf   2.065  0.0390

comparison = 2vs3:
 contrast                        estimate    SE  df z.ratio p.value
 domesticated_vs_nondomesticated  -0.6031 0.510 Inf  -1.183  0.2368
 gallo_vs_padovana                 0.7371 0.379 Inf   1.944  0.0519

comparison = 5vs10:
 contrast                        estimate    SE  df z.ratio p.value
 domesticated_vs_nondomesticated  -0.2377 0.521 Inf  -0.456  0.6484
 gallo_vs_padovana                 0.7458 0.381 Inf   1.956  0.0505

comparison = 3vs6:
 contrast                        estimate    SE  df z.ratio p.value
 domesticated_vs_nondomesticated  -0.0033 0.629 Inf  -0.005  0.9958
 gallo_vs_padovana                 1.0984 0.419 Inf   2.620  0.0088

Results are given on the log odds ratio (not the response) scale. 
Why is the contr object defined like that?

See https://vasishth.github.io/bayescogsci/book/ch-contr.html for an example

\[ \mu_1 - \frac{\mu_2 + \mu_3}{2} \]

\[ 1\mu_1 - (0.5\mu_2 + 0.5\mu_3) \]

\[ \mathbf{+1}\mu_1 \mathbf{-0.5}\mu_2 \mathbf{-0.5}\mu_3 \]
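A quick numerical check of the weights, using the hypothesized 2vs3 probabilities as example means:

mu <- c(0.56, 0.67, 0.725)        # example means (2vs3 row of the table)
sum(c(1, -0.5, -0.5) * mu)        # mu1 - (mu2 + mu3) / 2
mu[1] - (mu[2] + mu[3]) / 2       # same value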

Simulation

To actually compute the power we just need to repeat the simulation several times, changing some parameters. In our case we only vary the standard deviation of the by-chick random intercept, because the number of trials and chicks is fixed.
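Conceptually, for each row of the simulation grid the power is just the proportion of significant results across iterations. A minimal sketch of the inner loop, assuming the objects defined above (exdata, X, B, N, sb0) and far fewer iterations than in the actual simulation:

nsim <- 100                         # small number, just for illustration
p_int <- numeric(nsim)              # p-value of the interaction at each iteration
for (i in 1:nsim) {
    delta <- rnorm(N, 0, sb0)[exdata$id]
    exdata$p <- c(plogis(X %*% B + delta))
    exdata$nc <- rbinom(nrow(exdata), exdata$nt, exdata$p)
    exdata$nf <- exdata$nt - exdata$nc
    fit <- glmer(cbind(nc, nf) ~ comparison * breed + (1 | id),
                 data = exdata, family = binomial(link = "logit"))
    p_int[i] <- car::Anova(fit)["comparison:breed", "Pr(>Chisq)"]
}
mean(p_int <= 0.05)                 # estimated power for the interaction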

We used the following simulation grid:

simres |> 
    mutate(nsim = map_dbl(data, nrow)) |> 
    group_by(ns, nt, sb0) |> 
    summarise(nsim = sum(nsim)/2) |> 
    mtab()
ns nt sb0 nsim
20 4 0.1 5000
20 4 0.5 5000
20 4 1.0 5000
20 4 1.5 5000

For each iteration, we also fitted the model excluding the 3vs4 condition to see what happens. In fact, that condition could reduce or increase the power.

The script simulation.R (see below) contains the simulation code. Note that the code was run on a cluster, in parallel.

Simulation code
# Setup -------------------------------------------------------------------

rm(list = ls())
set.seed(2024)

# Packages ----------------------------------------------------------------

library(tidyverse)
library(emmeans)
library(lme4)
library(car)
library(furrr)

# Functions ---------------------------------------------------------------

odds <- function(p) p / (1 - p)

odds_ratio <- function(pn, pd){
    # pn = probability at the numerator
    # pd = probability at the denominator
    odds(pn) / odds(pd)
}

get_Z_matrix <- function(formula, data){
    data$y <- 1 # mock y
    mock_glmer <- lFormula(formula, data)
    Z <- t(as.matrix(mock_glmer$reTrms$Zt))
    return(Z)
}

sglmer <- purrr::possibly(glmer, NA) # safe version of glmer: returns NA if the fit fails
fit_glmer <- function(data){
    suppressMessages(sglmer(cbind(nc, nf) ~ comparison * breed + (1|id), 
                            data = data, 
                            family = binomial(link = "logit"),
                            nAGQ = 0,
                            glmerControl(optimizer = "bobyqa")))
}

get_lrt <- function(fit){
    out <- data.frame(Anova(fit))
    names(out) <- c("chi", "df", "p")
    out$param <- rownames(out)
    rownames(out) <- NULL
    out
}

lor_to_d <- function(lor){
    (lor * sqrt(3)) / pi
}

get_main_effect_breed <- function(x){
    emm_source <- suppressMessages(
        data.frame(emmeans(x, pairwise ~ breed, adjust = "none")$contrast)
    )
    emm_source$or <- exp(emm_source$estimate)
    emm_source$d <- lor_to_d(emm_source$estimate)
    emm_source
}

get_emm_proportions <- function(x){
    emm <- emmeans(x, ~ breed * comparison, type = "response")
    data.frame(emm)
}

get_results <- function(x){
    if(isGLMM(x)){
        # lrt
        lrt <- get_lrt(x)
        # breed contrasts
        mf_breed <- get_main_effect_breed(x)
        # all proportions
        props <- get_emm_proportions(x)
        issue <- if(!is.null(x@optinfo$conv$lme4)) x@optinfo$conv$lme4$messages else NA_character_
    }else{
        lrt <- mf_breed <- props <- issue <- NA
    }
    res <- tibble(lrt = list(lrt), 
                  mf_breed = list(mf_breed), 
                  props = list(props),
                  issue = issue)
    return(res)
}

do_sim <- function(data, lp_fixed, N, sb0, nsim = 1){
    res <- vector(mode = "list", length = nsim)
    res_no3vs4 <- vector(mode = "list", length = nsim)
    for(i in 1:length(res)){
        b0i <- rnorm(N, 0, sb0)
        data$p <- c(plogis(lp_fixed + b0i[data$id]))
        data$nc <- rbinom(nrow(data), data$nt, data$p)
        data$nf <- data$nt - data$nc
        # creating dataset without the 3vs4 condition
        data_no3vs4 <- data[data$comparison != "3vs4", ]
        data_no3vs4$comparison <- droplevels(data_no3vs4$comparison)
        fit <- fit_glmer(data)
        fit_no3vs4 <- fit_glmer(data_no3vs4)
        res[[i]] <- get_results(fit)
        res_no3vs4[[i]] <- get_results(fit_no3vs4)
    }
    res <- do.call(rbind, res)
    res_no3vs4 <- do.call(rbind, res_no3vs4)
    res$model <- "full"
    res_no3vs4$model <- "no3vs4"
    bind_rows(res, res_no3vs4)
}


# Parameters --------------------------------------------------------------

# this is the table with the assumed accuracy for each breed and condition

hyp_table <- data.frame(
    stringsAsFactors = FALSE,
    comparison = c("2vs3", "3vs4", "5vs10", "3vs6", "6vs9"),
    hybro = c(0.61, 0.5, 0.67, 0.7, 0.8),
    ross308 = c(0.56, 0.5, 0.62, 0.65, 0.55),
    gallina_padovana = c(0.67, 0.5, 0.73, 0.82, 0.66),
    gallo_bankiva = c(0.725, 0.5, 0.785, 0.875, 0.705)
)

# calculating true effect sizes

hyp_table$padovana_vs_ross <- with(hyp_table, odds_ratio(gallina_padovana, ross308))
hyp_table$gallo_vs_padovana <- with(hyp_table, odds_ratio(gallo_bankiva, gallina_padovana))

sim <- hyp_table %>% 
    select(comparison, ross308, gallina_padovana, gallo_bankiva) %>% 
    pivot_longer(2:4, names_to = "breed", values_to = "p")

sim$breed <- factor(sim$breed, levels = c("ross308", "gallina_padovana", "gallo_bankiva"))
# sum-to-zero contrast coding for the breed factor
contrasts(sim$breed) <- contr.sum(length(unique(sim$breed)))

# from the plot I see that the 3vs4 condition is assumed to be at the chance level
# we can code the contrasts as 3vs4 being the 0 and everything is related to this

comp_levels <- c("3vs4", "2vs3", "6vs9", "5vs10", "3vs6")

sim$comparison <- factor(sim$comparison, levels = comp_levels)
contrasts(sim$comparison) <- contr.sum(length(unique(sim$comparison)))

# getting parameters from linear predictors and model matrix

lp <- qlogis(sim$p) # probabilities in logit scale
Xt <- model.matrix(~ comparison * breed, data = sim)
# inverse matrix operation to find the vector of true parameters from X and Y
B <- solve(t(Xt) %*% Xt) %*% t(Xt) %*% lp

# simulation parameters

ns <- 20 # number of chicks for each breed
nt <- 4  # number of trials per condition
N <- ns * length(levels(sim$breed))
nsim <- 5e3 # number of simulations
sb0 <- c(0.1, 0.5, 1, 1.5)

# creating dataset

between <- data.frame(
    id = 1:N,
    breed = rep(unique(sim$breed), each = ns)
)
simdata <- expand_grid(between, comparison = unique(sim$comparison))
simdata$nt <- nt
simdata$breed <- factor(simdata$breed, levels = levels(sim$breed))
simdata$comparison <- factor(simdata$comparison, levels = levels(sim$comparison))
contrasts(simdata$breed) <- contr.sum(3)
contrasts(simdata$comparison) <- contr.sum(5)

# getting the model-matrix X

X <- model.matrix(~ comparison * breed, data = simdata)

# getting the random-effects matrix Z

Z <- get_Z_matrix(y ~ comparison * breed + (1|id), data = simdata)

# getting the fixed part so we don't need to calculate within each simulation iteration

lp_fixed <- X %*% B

# simulation conditions

# splitting simulations in blocks to parallelize

ncores <- 40
simgrid <- expand_grid(ns, nt, sb0)
simgrid$N <- simgrid$ns * length(unique(sim$breed))
simblocks <- ncores/nrow(simgrid)
simgrid <- expand_grid(simgrid, sblock = 1:simblocks)
simgrid$nsim <- nsim/simblocks

# simulation

plan(multicore(workers = 40))

res <- future_pmap(
    simgrid,
    ~ do_sim(data = simdata, lp_fixed = lp_fixed, N = ..4, sb0 = ..3, nsim =..6), 
    .options = furrr_options(seed = TRUE)
)

simgrid$res <- res
    
# saving

filename <- sprintf("%s_simres.rds", Sys.Date())
saveRDS(simgrid, file.path("objects", filename))
saveRDS(sim, file.path("objects", "true_probs.rds"))

First, let’s see the percentage of singular fits (variance estimated as 0) and non-convergence:

convergence_issue <- function(x){
    ifelse(grepl("singular", x), "singular", ifelse(grepl("converge", x), "convergence", "noissue"))
}

simres |> 
    unnest(data) |> 
    mutate(which_issue = convergence_issue(issue)) |> 
    group_by(ns, nt, sb0, model) |> 
    count(which_issue) |> 
    mutate(p = n / 5000) |> 
    select(-n) |> 
    pivot_wider(names_from = which_issue, values_from = p) |> 
    mutate(singular = ifelse(is.na(singular), 0, singular)) |> 
    mtab()
ns nt sb0 model noissue singular
20 4 0.1 full 0.491 0.509
20 4 0.1 no3vs4 0.471 0.529
20 4 0.5 full 0.999 0.001
20 4 0.5 no3vs4 0.994 0.006
20 4 1.0 full 1.000 0.000
20 4 1.0 no3vs4 1.000 0.000
20 4 1.5 full 1.000 0.000
20 4 1.5 no3vs4 1.000 0.000

Only with the lowest sb0 do we have a lot of singular fits (given the low sample size and number of trials). There are no convergence issues.

Power for main effects and interactions

lrt_power <- simres |> 
    unnest(data) |> 
    unnest(lrt) |> 
    group_by(ns, nt, sb0, param, model) |> 
    summarise(power = mean(p <= 0.05)) |>
    arrange(model)
simres |> 
    unnest(data) |> 
    unnest(lrt) |> 
    group_by(ns, nt, sb0, param, model) |> 
    summarise(power = mean(p <= 0.05)) |> 
    ggplot(aes(x = factor(sb0), y = power, color = model, group = model)) +
    facet_grid(~param) +
    geom_point(size = 3) +
    geom_line() +
    xlab("Standard Deviation Intercepts (logit)") +
    theme_minimal(15) +
    theme(legend.position = "bottom")

Power for post-hoc breed contrasts

Let’s see now the power for the post-hoc of the breed main effect:

mf_breed <- simres |> 
    unnest(data) |> 
    unnest(mf_breed) |> 
    group_by(ns, nt, sb0, contrast, model) |> 
    summarise(power = mean(p.value <= 0.05))
mf_breed |> 
    ggplot(aes(x = factor(sb0), y = power, color = model)) +
    facet_wrap(~contrast, ncol = 1) +
    geom_point(size = 3) +
    geom_line(aes(group = model))