mcstate
AKA, the particle filter
Assuming a given \(\theta\), at each time step \(t\), BSSMC:
Allow to explores efficiently the state space by progressively integrating the data points
Produces a MC approximation of \(p(Y_{1:T}|\theta)\) the marginal likelihood
mcstate
is an R package based on the SSM paradigm which aims to provide inference and filtering tools for dust modelsOur requirements
(for PMCMC you also need parameters to infer, …later)
N <- S + I + R
p_SI <- 1 - exp(-(beta) * I / N)
p_IR <- 1 - exp(-(gamma))
n_IR <- rbinom(I, p_IR * dt)
n_SI <- rbinom(S, p_SI * dt)
update(time) <- (step + 1) * dt
update(S) <- S - n_SI
update(I) <- I + n_SI - n_IR
update(R) <- R + n_IR
update(cases_cumul) <- cases_cumul + n_SI
update(cases_inc) <- if (step %% freq == 0) n_SI else cases_inc + n_SI
initial(time) <- 0
initial(S) <- 1000
initial(R) <- 0
initial(I) <- I0
initial(cases_cumul) <- 0
initial(cases_inc) <- 0
beta <- user(0.2)
gamma <- user(0.1)
I0 <- user(10)
freq <- user(4)
dt <- 1.0 / freq
* installing *source* package ‘sira5d4bd1a’ ...
** using staged installation
** libs
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I'/home/rfitzjoh/lib/R/library/cpp11/include' -g -Wall -Wextra -pedantic -Wmaybe-uninitialized -Wno-unused-parameter -Wno-cast-function-type -Wno-missing-field-initializers -O2 -I/home/rfitzjoh/lib/R/library/dust/include -DHAVE_INLINE -fopenmp -fpic -g -O2 -fdebug-prefix-map=/build/r-base-a3XuZ5/r-base-4.2.2.20221110=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -c cpp11.cpp -o cpp11.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I'/home/rfitzjoh/lib/R/library/cpp11/include' -g -Wall -Wextra -pedantic -Wmaybe-uninitialized -Wno-unused-parameter -Wno-cast-function-type -Wno-missing-field-initializers -O2 -I/home/rfitzjoh/lib/R/library/dust/include -DHAVE_INLINE -fopenmp -fpic -g -O2 -fdebug-prefix-map=/build/r-base-a3XuZ5/r-base-4.2.2.20221110=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -c dust.cpp -o dust.o
g++ -std=gnu++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o sira5d4bd1a.so cpp11.o dust.o -fopenmp -L/usr/lib/R/lib -lR
installing to /tmp/Rtmp0uMqpV/devtools_install_2537dc153befd2/00LOCK-file2537dc5e692547/00new/sira5d4bd1a/libs
** checking absolute paths in shared objects and dynamic libraries
* DONE (sira5d4bd1a)
pars <- list(beta = 0.25, gamma = 0.1)
mod <- sir$new(pars, 0, 20)
y <- mod$simulate(c(0, data$time_end))
i <- mod$info()$index[["time"]]
j <- mod$info()$index[["cases_inc"]]
matplot(y[i, 1, ], t(y[j, , ]), type = "l", col = "#00000055", lty = 1, las = 1,
xlab = "Day", ylab = "Cases")
points(cases ~ day, incidence, col = "red", pch = 19)
This is the important bit, and something that is a trick to write well.
incidence.csv
- daily case informationsir.R
- a simple SIR model with incidenceindex.R
- an index functioncompare.R
- a compare functionfilter <- mcstate::particle_filter$new(data, model = sir, n_particles = 10,
compare = compare, index = index)
sort(replicate(10, filter$run(pars)))
[1] -267.1850 -263.6292 -262.8457 -261.7034 -259.9648 -259.5186 -258.7828
[8] -258.0611 -254.4378 -251.7591
filter <- mcstate::particle_filter$new(data, model = sir, n_particles = 1000,
compare = compare, index = index)
sort(replicate(10, filter$run(pars)))
[1] -255.4523 -255.2030 -254.2859 -254.2029 -254.1420 -254.1420 -253.9215
[8] -253.8245 -253.6417 -253.6241
First, run the filter while saving history (off by default)
(this will improve in future, feedback very welcome)
Convert “MCMC parameters” into “model parameters”
You will want closures in complex models:
control <- mcstate::pmcmc_control(
n_steps = 500,
progress = TRUE)
samples <- mcstate::pmcmc(mcmc_pars, filter, control = control)
samples
<mcstate_pmcmc> (500 samples)
pars: 500 x 2 matrix of parameters
beta, gamma
probabilities: 500 x 3 matrix of log-probabilities
log_prior, log_likelihood, log_posterior
state: 6 x 500 matrix of final states
trajectories: (not included)
restart: (not included)
oh.
This is useful because most of your chain is not interesting due to the autocorrelation.
Arguments to mcstate::pmcmc_control
n_chains
: number of separate chains to runn_threads_total
: total number of threads to usen_workers
: number of separate threads to split your chains overuse_parallel_seed
: helps with reproducibilityYou can also run different chains on different cluster nodes - but talk to us about this.
vcv <- matrix(c(0.00057, 0.00052, 0.00052, 0.00057), 2, 2)
mcmc_pars <- mcstate::pmcmc_parameters$new(priors, vcv, transform)
control <- mcstate::pmcmc_control(
n_steps = 500,
n_chains = 4,
n_threads_total = 12,
n_workers = 4,
save_state = TRUE,
save_trajectories = TRUE,
progress = TRUE)
samples <- mcstate::pmcmc(mcmc_pars, filter, control = control)
plot(samples$probabilities[, "log_posterior"], type = "s")
A nice PMCMC introduction written for the epidemiologist [Endo, A., van Leeuwen, E. & Baguelin, M. Introduction to particle Markov-chain Monte Carlo for disease dynamics modellers. Epidemics 29, 100363 (2019).] (https://www.sciencedirect.com/science/article/pii/S1755436519300301?via%3Dihub)
A tutorial about SMC Doucet, A. & Johansen, A. M. A Tutorial on Particle filtering and smoothing: Fiteen years later. Oxford Handb. nonlinear Filter. 656–705 (2011). doi:10.1.1.157.772
The reference paper on PMCMC Andrieu, C., Doucet, A. & Holenstein, R. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B (Statistical Methodol. 72, 269–342 (2010).
A software oriented paper introducing odin, dust and mcstate R. G. FitzJohn et al. Reproducible parallel inference and simulation of stochastic state space models using odin, dust, and mcstate. Wellcome Open Res. 2021 5288 5, 288 (2021).