This function defines settings for estimate_R It takes a list of named items as input, set defaults where arguments are missing, and return a list of settings.
make_config(
...,
incid = NULL,
method = c("non_parametric_si", "parametric_si", "uncertain_si", "si_from_data",
"si_from_sample")
)
Acceptable arguments for ... are:
Vector of positive integers giving the starting times of each
window over which the reproduction number will be estimated. These must be in
ascending order, and so that for all i
, t_start[i]<=t_end[i]
.
t_start[1] should be strictly after the first day with non null incidence.
Vector of positive integers giving the ending times of each
window over which the reproduction number will be estimated. These must be
in ascending order, and so that for all i
,
t_start[i]<=t_end[i]
.
For method "uncertain_si" and "si_from_data"; positive integer giving the size of the sample of SI distributions to be drawn (see details).
For methods "uncertain_si", "si_from_data" and "si_from_sample"; positive integer giving the size of the sample drawn from the posterior distribution of R for each serial interval distribution considered (see details).
For method "parametric_si" and "uncertain_si" ; positive real giving the mean serial interval (method "parametric_si") or the average mean serial interval (method "uncertain_si", see details).
For method "parametric_si" and "uncertain_si" ; non negative real giving the standard deviation of the serial interval (method "parametric_si") or the average standard deviation of the serial interval (method "uncertain_si", see details).
For method "uncertain_si" ; standard deviation of the distribution from which mean serial intervals are drawn (see details).
For method "uncertain_si" ; lower bound of the distribution from which mean serial intervals are drawn (see details).
For method "uncertain_si" ; upper bound of the distribution from which mean serial intervals are drawn (see details).
For method "uncertain_si" ; standard deviation of the distribution from which standard deviations of the serial interval are drawn (see details).
For method "uncertain_si" ; lower bound of the distribution from which standard deviations of the serial interval are drawn (see details).
For method "uncertain_si" ; upper bound of the distribution from which standard deviations of the serial interval are drawn (see details).
For method "non_parametric_si" ; vector of probabilities
giving the discrete distribution of the serial interval, starting with
si_distr[1]
(probability that the serial interval is zero), which
should be zero.
For method "si_from_data" ; the parametric distribution to use when estimating the serial interval from data on dates of symptoms of pairs of infector/infected individuals (see details). Should be one of "G" (Gamma), "W" (Weibull), "L" (Lognormal), "off1G" (Gamma shifted by 1), "off1W" (Weibull shifted by 1), or "off1L" (Lognormal shifted by 1).
An object of class estimate_R_mcmc_control
, as
returned by function make_mcmc_control
.
An optional integer used as the seed for the random number
generator at the start of the function (then potentially reset within the
MCMC for method si_from_data
); useful to get reproducible results.
A positive number giving the mean of the common prior distribution for all reproduction numbers (see details).
A positive number giving the standard deviation of the common prior distribution for all reproduction numbers (see details).
A positive number giving the aimed posterior coefficient of variation (see details).
As in functionestimate_R
.
As in functionestimate_R
.
An object of class estimate_R_config
with components
t_start, t_end, n1, n2, mean_si, std_si,
std_mean_si, min_mean_si, max_mean_si, std_std_si, min_std_si, max_std_si,
si_distr, si_parametric_distr, mcmc_control, seed, mean_prior, std_prior,
cv_posterior, which can be used as an argument of function estimate_R
.
Analytical estimates of the reproduction number for an epidemic over
predefined time windows can be obtained using function estimate_R
,
for a given discrete distribution of the serial interval. make_config
allows to generate a configuration specifying the way the estimation will
be performed.
The more incident cases are observed over a time window, the smallest the
posterior coefficient of variation (CV, ratio of standard deviation over
mean) of the reproduction number.
An aimed CV can be specified in the argument cv_posterior
(default is 0.3
), and a warning will be produced if the incidence
within one of the time windows considered is too low to get this CV.
The methods vary in the way the serial interval distribution is specified.
In short there are five methods to specify the serial interval distribution (see below for details on each method). In the first two methods, a unique serial interval distribution is considered, whereas in the last three, a range of serial interval distributions are integrated over:
In method "non_parametric_si" the user specifies the discrete distribution of the serial interval
In method "parametric_si" the user specifies the mean and sd of the serial interval
In method "uncertain_si" the mean and sd of the serial interval are each drawn from truncated normal distributions, with parameters specified by the user
In method "si_from_data", the serial interval distribution is directly estimated, using MCMC, from interval censored exposure data, with data provided by the user together with a choice of parametric distribution for the serial interval
In method "si_from_sample", the user directly provides the sample of serial interval distribution to use for estimation of R. This can be a useful alternative to the previous method, where the MCMC estimation of the serial interval distribution could be run once, and the same estimated SI distribution then used in estimate_R in different contexts, e.g. with different time windows, hence avoiding to rerun the MCMC everytime estimate_R is called.
———————– method "non_parametric_si"
——————-
The discrete distribution of the serial interval is directly specified in the
argument si_distr
.
———————– method "parametric_si"
———————–
The mean and standard deviation of the continuous distribution of the serial
interval are given in the arguments mean_si
and std_si
.
The discrete distribution of the serial interval is derived automatically
using discr_si
.
———————– method "uncertain_si"
———————–
Method "uncertain_si"
allows accounting for uncertainty on the serial
interval distribution as described in Cori et al. AJE 2013.
We allow the mean \(\mu\) and standard deviation \(\sigma\) of the serial
interval to vary according to truncated normal distributions.
We sample n1
pairs of mean and standard deviations,
\((\mu^{(1)},\sigma^{(1)}),...,(\mu^{(n_2)},\sigma^{(n_2)})\), by first
sampling the mean \(\mu^{(k)}\)
from its truncated normal distribution (with mean mean_si
, standard
deviation std_mean_si
, minimum min_mean_si
and maximum
max_mean_si
),
and then sampling the standard deviation \(\sigma^{(k)}\) from its
truncated normal distribution
(with mean std_si
, standard deviation std_std_si
, minimum
min_std_si
and maximum max_std_si
), but imposing that
\(\sigma^{(k)}<\mu^{(k)}\).
This constraint ensures that the Gamma probability density function of the
serial interval is null at \(t=0\).
Warnings are produced when the truncated normal distributions are not
symmetric around the mean.
For each pair \((\mu^{(k)},\sigma^{(k)})\), we then draw a sample of size
n2
in the posterior distribution of the reproduction number over each
time window, conditionally on this serial interval distribution.
After pooling, a sample of size \(\code{n1}\times\code{n2}\) of the joint
posterior distribution of the reproduction number over each time window is
obtained.
The posterior mean, standard deviation, and 0.025, 0.05, 0.25, 0.5, 0.75,
0.95, 0.975 quantiles of the reproduction number for each time window are
obtained from this sample.
———————– method "si_from_data"
———————–
Method "si_from_data"
allows accounting for uncertainty on the serial
interval distribution.
Unlike method "uncertain_si", where we arbitrarily vary the mean and std of
the SI in truncated normal distributions,
here, the scope of serial interval distributions considered is directly
informed by data
on the (potentially censored) dates of symptoms of pairs of infector/infected
individuals.
This data, specified in argument si_data
, should be a dataframe with 5
columns:
EL: the lower bound of the symptom onset date of the infector (given as an integer)
ER: the upper bound of the symptom onset date of the infector (given as an integer). Should be such that ER>=EL. If the dates are known exactly use ER = EL
SL: the lower bound of the symptom onset date of the infected individual (given as an integer)
SR: the upper bound of the symptom onset date of the infected individual (given as an integer). Should be such that SR>=SL. If the dates are known exactly use SR = SL
type (optional): can have entries 0, 1, or 2, corresponding to doubly interval-censored, single interval-censored or exact observations, respectively, see Reich et al. Statist. Med. 2009. If not specified, this will be automatically computed from the dates
Assuming a given parametric distribution for the serial interval distribution
(specified in si_parametric_distr),
the posterior distribution of the serial interval is estimated directly from
these data using MCMC methods implemented in the package
coarsedatatools
.
The argument mcmc_control
is a list of characteristics which control
the MCMC.
The MCMC is run for a total number of iterations of
mcmc_control$burnin + n1*mcmc_control$thin
;
but the output is only recorded after the burnin, and only 1 in every
mcmc_control$thin
iterations,
so that the posterior sample size is n1
.
For each element in the posterior sample of serial interval distribution,
we then draw a sample of size n2
in the posterior distribution of the
reproduction number over each time window,
conditionally on this serial interval distribution.
After pooling, a sample of size \(\code{n1}\times\code{n2}\) of the joint
posterior distribution of
the reproduction number over each time window is obtained.
The posterior mean, standard deviation, and 0.025, 0.05, 0.25, 0.5, 0.75,
0.95, 0.975 quantiles of the reproduction number for each time window are
obtained from this sample.
———————– method "si_from_sample"
———————-
Method "si_from_sample"
also allows accounting for uncertainty on the
serial interval distribution.
Unlike methods "uncertain_si" and "si_from_data", the user directly provides
(in argument si_sample
) a sample of serial interval distribution to be
explored.
if (FALSE) { # \dontrun{
## Note the following examples use an MCMC routine
## to estimate the serial interval distribution from data,
## so they may take a few minutes to run
## load data on rotavirus
data("MockRotavirus")
## estimate the reproduction number (method "si_from_data")
## we are not specifying the time windows, so by defaults this will estimate
## R on sliding weekly windows
incid <- MockRotavirus$incidence
method <- "si_from_data"
config <- make_config(incid = incid,
method = method,
list(si_parametric_distr = "G",
mcmc_control = make_mcmc_control(burnin = 1000,
thin = 10, seed = 1),
n1 = 500,
n2 = 50,
seed = 2))
R_si_from_data <- estimate_R(incid,
method = method,
si_data = MockRotavirus$si_data,
config = config)
plot(R_si_from_data)
## you can also create the config straight within the estimate_R call,
## in that case incid and method are automatically used from the estimate_R
## arguments:
R_si_from_data <- estimate_R(incid,
method = method,
si_data = MockRotavirus$si_data,
config = make_config(
list(si_parametric_distr = "G",
mcmc_control = make_mcmc_control(burnin = 1000,
thin = 10, seed = 1),
n1 = 500,
n2 = 50,
seed = 2)))
plot(R_si_from_data)
} # }