Worked Example 3 - Generating Epidemiological Data Sets
BWorkedExampleCEpiDatasetGenerate.Rmd
This worked example demonstrates how to use the package to generate a set of epidemiological data (serological and/or annual reported severe/fatal cases) for specified years and regions (and age ranges, in the case of serological data).
This can be set up to either:
- Match a set of observed data supplied in appropriate formats
- Create a hypothetical set of data with no observed counterpart by matching to “dummy” data that present the dates, regions and/or age ranges in the appropriate formats
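As a rough sketch of the second option, dummy templates can be assembled by hand in base R. The column names used here (region, year, age_min, age_max, samples) and the region labels are assumptions made for illustration only; compare them against names(sero_template) and names(case_template) from the example files loaded later in this example before relying on them.

```r
# Hypothetical regions and years to generate data for
regions <- c("Region_A", "Region_B")
years <- 2000:2002

# Dummy serological template: one row per region, year and age range.
# Column names are assumed for illustration, not taken from the package.
sero_dummy <- expand.grid(region = regions, year = years,
                          age_min = c(0, 15), stringsAsFactors = FALSE)
sero_dummy$age_max <- ifelse(sero_dummy$age_min == 0, 14, 100)
sero_dummy$samples <- 100  # placeholder number of samples per row

# Dummy case template: one row per region and year
case_dummy <- expand.grid(region = regions, year = years,
                          stringsAsFactors = FALSE)

head(sero_dummy)
head(case_dummy)
```

The values in such a template only define which dates, regions and age ranges the generated data should cover; they are not observed data.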
We first load the input data set, the regional environmental covariate values, the covariate coefficients used to calculate spillover force of infection (FOI) and R0 values, and the observed serological and annual case data (here generated by the model itself):
library(YEP)
input_data <- readRDS(file = paste(path.package("YEP"),
                                   "/exdata/input_data_example.Rds", sep = ""))
enviro_data <- read.csv(file = paste(path.package("YEP"),
                                     "/exdata/enviro_data_example.csv", sep = ""),
                        header = TRUE)
enviro_coeffs <- read.csv(file = paste(path.package("YEP"),
                                       "/exdata/enviro_coeffs_example.csv",
                                       sep = ""), header = TRUE)
# Seroprevalence data for comparison, by region, year & age group, in format
# no. samples/no. positives
sero_template <- read.csv(file = paste(path.package("YEP"),
                                       "/exdata/sero_template_example.csv",
                                       sep = ""), header = TRUE)
# Annual reported case/death data for comparison, by region and year, in format
# no. cases/no. deaths
case_template <- read.csv(file = paste(path.package("YEP"),
                                       "/exdata/case_template_example.csv",
                                       sep = ""), header = TRUE)
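If one of these files cannot be found, the calls above stop with an error. A slightly more defensive way to locate an example file is sketched below using base R's system.file(), which returns an empty string when the package or file is missing. The helper name example_path is hypothetical and not part of the package.

```r
# Hypothetical helper: path to a file in a package's exdata folder,
# or "" if the package or the file cannot be found
example_path <- function(file_name, package = "YEP") {
  system.file("exdata", file_name, package = package)
}

path <- example_path("enviro_data_example.csv")
if (nzchar(path)) {
  # File found: read it and inspect its structure
  enviro_data_check <- read.csv(file = path, header = TRUE)
  str(enviro_data_check)
} else {
  message("Example file not found; check that the YEP package is installed")
}
```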
We calculate values of spillover FOI and R0 as described in Guide 2 - Calculating Parameters From Environmental Data:
n_regions <- nrow(enviro_data)
# One spillover FOI value and one R0 value per region
FOI_values <- R0_values <- rep(NA, n_regions)
for (n_region in seq_len(n_regions)) {
  # Column 1 of enviro_data is skipped; the remaining columns hold the covariate values
  FOI_R0_values <- param_calc_enviro(enviro_coeffs = enviro_coeffs[1, ],
                                     enviro_covar_values = enviro_data[n_region, c(2:ncol(enviro_data))])
  FOI_values[n_region] <- FOI_R0_values$FOI
  R0_values[n_region] <- FOI_R0_values$R0
}
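Guide 2 gives the actual calculation. Purely to illustrate the idea of turning covariate values into per-region parameters through coefficients, the sketch below assumes that each parameter is a weighted sum of a region's covariate values. The covariate names and all numbers are hypothetical, and this is not the package's param_calc_enviro().

```r
# Hypothetical covariate values: one row per region, one column per covariate
covar_values <- matrix(c(1, 2,
                         3, 4),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(c("Region_A", "Region_B"),
                                       c("covar_1", "covar_2")))

# Hypothetical coefficients, one per covariate, for each parameter
coeffs_FOI <- c(covar_1 = 1e-7, covar_2 = 2e-7)
coeffs_R0 <- c(covar_1 = 0.5, covar_2 = 0.25)

# Assumed form: parameter = sum over covariates of coefficient * covariate value
FOI_sketch <- drop(covar_values %*% coeffs_FOI)
R0_sketch <- drop(covar_values %*% coeffs_R0)

FOI_sketch
R0_sketch
```

Writing the calculation as a matrix product handles all regions at once, which is one reason a weighted-sum form is convenient when many regions are involved.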
We set the non-region-specific parameters for the dataset generation:
vaccine_efficacy <- 1.0 # Vaccine efficacy
p_severe_inf <- 0.12 # Probability of an infection causing severe symptoms
p_death_severe_inf <- 0.39 # Probability of an infection with severe symptoms causing death
p_rep_severe <- 0.1 # Probability of reporting of an infection with severe (non-fatal) symptoms
p_rep_death <- 0.2 # Probability of reporting of a fatal infection
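mode_start <- 1 # Mode used to set the initial conditions (see the Generate_Dataset() documentation)
start_SEIRV <- NULL # Starting SEIRV values; NULL here, so none are supplied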
dt <- 1.0 # Time increment in days
n_reps <- 1 # Number of stochastic repetitions
# True/false flag indicating whether to run the model in deterministic mode
# (so that binomial calculations give average instead of randomized output)
deterministic <- FALSE
# Parallel-processing mode for running on multiple processors simultaneously;
# here set to "none" so that parallel processing is not used
mode_parallel <- "none"
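To see how the four probabilities above combine, and what the deterministic flag changes, the following sketch works through the arithmetic for a hypothetical number of infections. It follows the parameter definitions given in the comments (reported cases counted as reported severe non-fatal infections plus reported deaths) and is an illustration under that assumption, not the package's internal code, which is described in Guide 4.

```r
p_severe_inf <- 0.12
p_death_severe_inf <- 0.39
p_rep_severe <- 0.1
p_rep_death <- 0.2
n_infections <- 10000  # hypothetical number of infections in one region and year

# Deterministic version: each binomial draw is replaced by its average
severe_det <- n_infections * p_severe_inf                          # 1200
deaths_det <- severe_det * p_death_severe_inf                      # 468
rep_deaths_det <- deaths_det * p_rep_death                         # 93.6
rep_nonfatal_det <- (severe_det - deaths_det) * p_rep_severe       # 73.2
rep_cases_det <- rep_deaths_det + rep_nonfatal_det                 # 166.8

# Stochastic version: the same chain using binomial draws
set.seed(1)
severe_sto <- rbinom(1, n_infections, p_severe_inf)
deaths_sto <- rbinom(1, severe_sto, p_death_severe_inf)
rep_deaths_sto <- rbinom(1, deaths_sto, p_rep_death)
rep_nonfatal_sto <- rbinom(1, severe_sto - deaths_sto, p_rep_severe)
rep_cases_sto <- rep_deaths_sto + rep_nonfatal_sto

c(deterministic = rep_cases_det, stochastic = rep_cases_sto)
```

The deterministic values are generally non-integer averages, whereas the stochastic values are whole counts that vary from run to run around those averages.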
We run the Generate_Dataset() function to produce the dataset. This uses the approaches described in Guide 4 - Generating Epidemiological Data From Model Output to generate serological and/or case data.
set.seed(1)
dataset1 <- Generate_Dataset(input_data, FOI_values, R0_values, sero_template, case_template,
                             vaccine_efficacy, p_severe_inf, p_death_severe_inf, p_rep_severe,
                             p_rep_death, mode_start, start_SEIRV, dt, n_reps, deterministic,
                             mode_parallel, cluster = NULL)
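Because the model is run here in stochastic mode, set.seed(1) is called first so that repeated runs give the same output. The small base-R sketch below shows the effect on binomial draws; the function name draw_counts and the numbers used are hypothetical.

```r
# Hypothetical helper: five binomial draws made after fixing the random seed
draw_counts <- function(seed) {
  set.seed(seed)
  rbinom(n = 5, size = 1000, prob = 0.12)
}

first_run <- draw_counts(1)
second_run <- draw_counts(1)

# Same seed, same draws
identical(first_run, second_run)  # TRUE
```

Calling Generate_Dataset() again without resetting the seed would be expected to give different stochastic output.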
[TBA]