Estimate power when testing prevalence against a threshold

Estimates power when conducting a clustered prevalence survey and comparing prevalence against a set threshold. Power is estimated empirically via repeated simulation. Returns the power estimate, along with the lower and upper limits of the 95% confidence interval on this estimate.

get_power_threshold(
  N,
  prevalence = 0.1,
  ICC = 0.05,
  prev_thresh = 0.05,
  rejection_threshold = 0.95,
  ICC_infer = NULL,
  prior_prev_shape1 = 1,
  prior_prev_shape2 = 1,
  prior_ICC_shape1 = 1,
  prior_ICC_shape2 = 9,
  n_intervals = 20,
  round_digits = 2,
  reps = 100,
  use_cpp = TRUE,
  silent = FALSE
)

Arguments

N: vector of the number of samples obtained from each cluster.
prevalence: assumed true prevalence of pfhrp2/3 deletions as a proportion between 0 and 1. If a vector of two values is given here then prevalence is drawn uniformly from between these limits independently for each simulation. This allows power to be calculated for a composite hypothesis.
ICC: assumed true intra-cluster correlation (ICC) as a value between 0 and 1.
prev_thresh: the threshold prevalence that we are testing against (5% by default).
rejection_threshold: the posterior probability of being above the prevalence threshold needs to be greater than rejection_threshold in order to reject the null hypothesis.
ICC_infer: the value of the ICC assumed in the inference step. If we plan on estimating the ICC from our data, i.e. running get_prevalence(ICC = NULL) (the default), then we should also set ICC_infer = NULL here (the default). However, if we plan on running get_prevalence() with ICC set to a known value then we should pass that same value here as ICC_infer.
prior_prev_shape1, prior_prev_shape2, prior_ICC_shape1, prior_ICC_shape2: parameters that dictate the shape of the Beta priors on prevalence and the ICC. See the Wikipedia page on the Beta distribution for more detail. The default values of these parameters were chosen based on an analysis of historical pfhrp2/3 studies, although this does not guarantee that they will be suitable in all settings.
n_intervals: the number of intervals used in the adaptive quadrature method. Increasing this value gives a more accurate representation of the true posterior, but comes at the cost of reduced speed.
round_digits: the number of digits after the decimal point that are used when reporting estimates. This is to simplify results and to avoid giving the false impression of extreme precision.
reps: number of times to repeat simulation per parameter combination.
use_cpp: if TRUE (the default) then use an Rcpp implementation of the adaptive quadrature approach that is much faster than the base R method.
silent: if TRUE then suppress all console output.

Details

Estimates power using the following approach:

1. Simulate data via the function rbbinom_reparam() using known values (e.g. a known "true" prevalence and intra-cluster correlation).
2. Analyse the simulated data using get_prevalence() to determine the posterior probability that prevalence is above prev_thresh.
3. If this probability is above rejection_threshold then reject the null hypothesis, and count this as one correct conclusion.
4. Repeat steps 1-3 many times. Count the number of simulations for which the correct conclusion is reached, and divide by the total number of simulations. This gives an estimate of empirical power, along with lower and upper 95% binomial CIs via the method of Clopper and Pearson (1934).
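
The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation. sim_counts() and prob_above() are hypothetical helpers introduced here. The simulation uses a standard beta-binomial parameterisation in terms of prevalence and ICC, which is assumed to be similar in spirit to rbbinom_reparam(). The analysis step is a simple grid approximation with the ICC fixed at its true value and a flat Beta(1, 1) prior on prevalence, standing in for get_prevalence().

```r
# Hypothetical helper: draw one positive count per cluster from a
# beta-binomial with mean p and intra-cluster correlation rho.
sim_counts <- function(N, p, rho) {
  a <- p * (1 - rho) / rho
  b <- (1 - p) * (1 - rho) / rho
  rbinom(length(N), size = N, prob = rbeta(length(N), a, b))
}

# Hypothetical stand-in for the analysis step: posterior probability that
# prevalence exceeds thresh, computed on a grid with the ICC fixed at rho
# and a flat prior on prevalence.
prob_above <- function(k, N, rho, thresh,
                       grid = seq(0.0005, 0.9995, by = 0.001)) {
  loglik <- vapply(grid, function(p) {
    a <- p * (1 - rho) / rho
    b <- (1 - p) * (1 - rho) / rho
    sum(lchoose(N, k) + lbeta(k + a, N - k + b) - lbeta(a, b))
  }, numeric(1))
  post <- exp(loglik - max(loglik))  # unnormalised posterior on the grid
  sum(post[grid > thresh]) / sum(post)
}

set.seed(1)
N <- c(120, 90, 150)
reps <- 100

# Steps 1-3, repeated reps times: simulate, analyse, decide.
reject <- replicate(reps, {
  k <- sim_counts(N, p = 0.15, rho = 0.1)
  prob_above(k, N, rho = 0.1, thresh = 0.05) > 0.95
})

# Step 4: empirical power with Clopper-Pearson limits, as percentages.
n_reject <- sum(reject)
ci <- binom.test(n_reject, reps)$conf.int
result <- c(power = 100 * n_reject / reps,
            lower = 100 * ci[1],
            upper = 100 * ci[2])
print(round(result, 2))
```

Because the stand-in analysis treats the ICC as known and uses a flat prior, its numbers should not be expected to match those returned by get_power_threshold().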

Note that this function can be run even when prevalence is less than prev_thresh, although in this case what is returned is not the power. Power is defined as the probability of correctly rejecting the null hypothesis, whereas here we would be incorrectly rejecting the null. Therefore, what we obtain in this case is an estimate of the false positive rate.
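
As a hedged illustration of this case, the call below sets the true prevalence (3%) below prev_thresh (5%), so the value reported in the power column would be read as an estimated false positive rate. The package name DRpower is an assumption, as it is not stated on this page. The call is guarded so that the snippet does nothing if that package is not installed.

```r
# Assumption: get_power_threshold() is exported by a package named DRpower.
fpr <- NULL
if (requireNamespace("DRpower", quietly = TRUE)) {
  # True prevalence is below the threshold, so every rejection of the null
  # hypothesis is an incorrect one.
  fpr <- DRpower::get_power_threshold(
    N = c(120, 90, 150),
    prevalence = 0.03,
    ICC = 0.1,
    prev_thresh = 0.05,
    reps = 1e2,
    silent = TRUE
  )
  print(fpr)
}
```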

References

Clopper, C.J. and Pearson, E.S., 1934. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404–413. doi: 10.2307/2331986.

Examples

get_power_threshold(N = c(120, 90, 150), prevalence = 0.15, ICC = 0.1, reps = 1e2)
#>   prev_thresh power lower upper
#> 1        0.05    84 75.32 90.57