Designing a study with multiple end-points

Premise

Here, we will use DRpower to help us design a prospective malaria molecular surveillance (MMS) study. We are interested in the following questions:

  1. What is the prevalence of sulfadoxine/pyrimethamine (SP) mutations?
  2. Are there any pfkelch13 mutations in our region?
  3. What is the prevalence of pfhrp2/3 gene deletions, and is this prevalence above 5% (the WHO threshold at which a nationwide change in RDTs is recommended)?

Based on initial budget calculations, we are aiming to collect 100 samples from each of 3 sites in each of 10 provinces. We want to know what sort of power we can expect with these numbers for each of the end-points above.
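It can be convenient to keep these design numbers in one place for the calculations that follow. A minimal sketch (the variable names here are purely illustrative and not part of DRpower):

# planned design (names are illustrative)
n_per_site <- 100    # samples per site
n_sites    <- 3      # sites per province
n_prov     <- 10     # provinces
n_per_site * n_sites * n_prov    # 3000 samples planned in total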

SP mutation prevalence - margin of error (MoE) calculation

Our plan is to pool results over the 3 sites within each province to arrive at an estimate of the overall prevalence of SP mutations at the province level. When pooling results it is important to take account of intra-cluster correlation (ICC) in our confidence intervals (CIs). Failure to do so will lead to unrealistically tight intervals. DRpower provides several functions for calculating the expected margin of error (MoE) given assumptions about sample size, the ICC, and the known prevalence of mutations.
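To get a sense of why the ICC matters so much, a standard back-of-the-envelope calculation (not part of DRpower) is the design effect, 1 + (m - 1)*ICC, where m is the number of samples per cluster. A quick sketch, assuming clusters of 100 samples and an ICC of 0.05:

# design effect under cluster sampling (standard approximation)
m <- 100                     # samples per cluster
ICC <- 0.05                  # assumed intra-cluster correlation
deff <- 1 + (m - 1) * ICC    # design effect = 5.95
300 / deff                   # 300 pooled samples behave like ~50 independent samples

This is why CIs that ignore the ICC can be misleadingly tight.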

First, let’s load the package and set the random seed so our results are reproducible:

library(DRpower)
set.seed(1)

We will use the get_margin() function to tell us the MoE we can expect if the prevalence of a mutation is 10% at the province level. We set the parameters to define 3 clusters of 100 samples. We assume an ICC value of 0.05:

# get MoE using simple calculation
get_margin(N = 100, n_clust = 3, prevalence = 0.1, ICC = 0.05)
    lower     upper 
 1.719297 18.280703 

This output tells us that we expect our CI to range from 1.7% at the lower end to 18.3% at the upper end when estimating a variant at 10% prevalence. In other words, we expect a MoE of around ±8.3%. This is precise enough for most purposes.

Our worst case MoE will occur when the prevalence of the marker is 50%. We can explore this here:

# get worst case MoE
get_margin(N = 100, n_clust = 3, prevalence = 0.5, ICC = 0.05)
   lower    upper 
36.19883 63.80117 

In the worst case our MoE goes up to ±14%, which is still reasonably precise.
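The value we assume for the ICC has a strong influence on these margins. As a quick sensitivity check, we could repeat the worst-case calculation over a few ICC values (the grid below is an illustrative assumption, not a set of estimates):

# worst-case CI bounds over a range of assumed ICC values
ICC_range <- c(0.01, 0.05, 0.1, 0.2)
sapply(ICC_range, function(x) {
  get_margin(N = 100, n_clust = 3, prevalence = 0.5, ICC = x)
})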

However, this approach assumes that we know the level of intra-cluster correlation exactly. It also uses a normal approximation when producing CIs, which breaks down when the prevalence is very low or very high. A more cautious approach is to estimate prevalence using the Bayesian model within DRpower. If we plan to use this approach, then we should use the get_margin_Bayesian() function to estimate the MoE:

# get expected MoE via the Bayesian approach
get_margin_Bayesian(N = rep(100, 3), prevalence = 0.1, reps = 1e2)
      estimate    CI_2.5   CI_97.5
lower   3.8705  3.391827  4.349173
upper  25.1018 23.891353 26.312247

We should look at the first column (estimate) of this data.frame when comparing to the previous values. We now find that our credible interval (CrI) is expected to range from 3.9% to 25.1%. This is quite different from what we found under the simpler get_margin() method: the interval is tighter at the lower end but considerably wider at the upper end. Note that, unlike the previous function, these are estimated values, hence they are presented along with a 95% CI. We can make this estimate more precise if needed by increasing reps.
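For example, a sketch of the same call with ten times as many repetitions (slower to run; the value of reps here is purely illustrative):

# re-run with more repetitions for a more precise estimate of the expected MoE
get_margin_Bayesian(N = rep(100, 3), prevalence = 0.1, reps = 1e3)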

We can explore our worst case MoE using the same method:

# get worst case MoE via the Bayesian approach
get_margin_Bayesian(N = rep(100, 3), prevalence = 0.5, reps = 1e2)
      estimate   CI_2.5  CI_97.5
lower  33.4947 31.92120 35.06820
upper  65.5736 64.12608 67.02112

In the worst case we can expect our CrI to range from 33.5% to 65.6%, a MoE of around ±16%. This is slightly higher than before, but still precise enough for most purposes.

pfkelch13 mutations - presence/absence analysis

When it comes to pfkelch13, we are interested in knowing whether there are any mutants in the province, rather than estimating the prevalence of mutants. A single positive sample would prove that there are mutants in the province; however, if the prevalence is very low we may get unlucky and miss them. We can use the get_power_presence() function to explore our probability of detecting at least one positive sample:

# get power when prevalence is 1%
get_power_presence(N = rep(100, 3), prevalence = 0.01, ICC = 0.05)
[1] 65.38844

We can see that if pfkelch13 mutants were at 1% prevalence at the province level then we would have a 65% chance of detecting one in any of our 3 sites. This is quite low power - we normally aim for at least 80%. Compare that to the result if we assume mutants are at 2% in the province:

# get power when prevalence is 2%
get_power_presence(N = rep(100, 3), prevalence = 0.02, ICC = 0.05)
[1] 88.08013

We now have close to 90% power. So, we can say we are powered to detect rare variants down to around 2% prevalence, but if they are less common than this we risk missing them. Whether or not this is an acceptable level of sensitivity is a programmatic decision.
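To help with that decision, we could tabulate power over a range of assumed prevalences using the same function (the grid of values below is illustrative):

# power to detect at least one mutant over a range of assumed prevalences
prev_range <- seq(0.005, 0.03, by = 0.005)
sapply(prev_range, function(p) {
  get_power_presence(N = rep(100, 3), prevalence = p, ICC = 0.05)
})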

pfhrp2/3 deletions - comparison against 5% threshold

We plan to follow the WHO Master Protocol, switching RDTs at a national level if any province has a prevalence of pfhrp2/3 deletions significantly over 5%. We can use the get_power_threshold() function to calculate our power under this model. We assume here that the prevalence of deletions at the province level is 10%:

# estimate power under WHO approach
get_power_threshold(N = rep(100, 3), prevalence = 0.1, prev_thresh = 0.05, ICC = 0.05, reps = 1e3)
  prev_thresh power lower upper
1        0.05  59.9 56.79 62.95

We only have around 60% power under this design. We would obtain a higher power if we assumed the prevalence of pfhrp2/3 deletions was higher than 10%, but we would need to be able to justify this assumption with evidence - for example results of a pilot study, or results from neighbouring countries.
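If such evidence were available, one way to use it would be to tabulate power over a few alternative assumed prevalences with the same function (slow to run, since power is estimated by simulation; the prevalence values below are illustrative):

# power against the 5% threshold under different assumed prevalences
prev_range <- c(0.08, 0.1, 0.12, 0.15)
sapply(prev_range, function(p) {
  get_power_threshold(N = rep(100, 3), prevalence = p, prev_thresh = 0.05,
                      ICC = 0.05, reps = 1e3)$power
})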

In our case, we will instead abandon the threshold comparison approach in favour of a two-step design. Our new plan will be to look for the presence of any pfhrp2/3 deletions at the province level, and only if they are detected will we perform a follow-up study in that province. The nice thing about this is that we have already performed this power calculation, as it becomes a presence/absence analysis exactly like the one for pfkelch13. We therefore know that we are powered to detect deletions down to around 2% prevalence.

For our follow-up study, we can explore what would happen if we doubled the number of sites in a province:

# estimate power with 6 sites of 100
get_power_threshold(N = rep(100, 6), prevalence = 0.1, prev_thresh = 0.05, ICC = 0.05, reps = 1e3)
  prev_thresh power lower upper
1        0.05  78.4 75.72 80.91

We find that power is around 78%, which is acceptable. We will therefore go ahead with this two-stage design.
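If doubling the number of sites turns out to be logistically difficult, alternative follow-up designs could be compared in the same way. For example, a hypothetical design with 4 larger sites of 150 samples (the same total of 600 samples):

# a hypothetical alternative follow-up design: 4 sites of 150 samples
get_power_threshold(N = rep(150, 4), prevalence = 0.1, prev_thresh = 0.05, ICC = 0.05, reps = 1e3)

In general we would expect designs that spread the same total sample size over more sites to fare somewhat better when the ICC is greater than zero, but this is worth checking explicitly for any design under consideration.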

Summary

We have used DRpower to explore the precision and power we can expect for a study design consisting of 100 samples in each of 3 sites within a province. This has revealed that:

  1. When estimating the prevalence of SP mutations we will have good precision at the province level, with an expected MoE of around ±15%, rising to around ±16% in the worst case of 50% prevalence.
  2. For pfkelch13 mutations, we are powered to detect rare variants down to around 2%. Any lower than this and we run the risk of missing them.
  3. For pfhrp2/3 deletions, we do not have adequate power under the WHO Master Protocol approach. We have therefore modified our study design to a two-stage process that focuses first on detection, and then on comparison against the 5% threshold. In provinces where deletions are identified we will run a second phase with at least 6 sites of 100 samples.