Create an adaptive Metropolis-Hastings sampler, which will tune its variance covariance matrix (vs the simple random walk sampler monty_sampler_random_walk).
Usage
monty_sampler_adaptive(
initial_vcv,
initial_vcv_weight = 1000,
initial_scaling = 1,
initial_scaling_weight = NULL,
min_scaling = 0,
scaling_increment = NULL,
log_scaling_update = TRUE,
acceptance_target = 0.234,
forget_rate = 0.2,
forget_end = Inf,
adapt_end = Inf,
pre_diminish = 0,
boundaries = "reflect"
)
Arguments
- initial_vcv
An initial variance covariance matrix; we'll start using this in the proposal, which will gradually become more weighted towards the empirical covariance matrix calculated from the chain.
- initial_vcv_weight
Weight of the initial variance-covariance matrix used to build the proposal of the random-walk. Higher values translate into higher confidence of the initial variance-covariance matrix and means that update from additional samples will be slower.
- initial_scaling
The initial scaling of the variance covariance matrix to be used to generate the multivariate normal proposal for the random-walk Metropolis-Hastings algorithm. To generate the proposal matrix, the weighted variance covariance matrix is multiplied by the scaling parameter squared times 2.38^2 / n_pars (where n_pars is the number of fitted parameters). Thus, in a Gaussian target parameter space, the optimal scaling will be around 1.
- initial_scaling_weight
The initial weight used in the scaling update. The scaling weight will increase after the first
pre_diminish
iterations, and as the scaling weight increases the adaptation of the scaling diminishes. IfNULL
(the default) the value is 5 / (acceptance_target * (1 - acceptance_target)).- min_scaling
The minimum scaling of the variance covariance matrix to be used to generate the multivariate normal proposal for the random-walk Metropolis-Hastings algorithm.
- scaling_increment
The scaling increment which is added or subtracted to the scaling factor of the variance-covariance after each adaptive step. If
NULL
(the default) then an optimal value will be calculated.- log_scaling_update
Logical, whether or not changes to the scaling parameter are made on the log-scale.
- acceptance_target
The target for the fraction of proposals that should be accepted (optimally) for the adaptive part of the chain.
- forget_rate
The rate of forgetting early parameter sets from the empirical variance-covariance matrix in the MCMC chains. For example,
forget_rate = 0.2
(the default) means that once in every 5th iterations we remove the earliest parameter set included, so would remove the 1st parameter set on the 5th update, the 2nd on the 10th update, and so on. Settingforget_rate = 0
means early parameter sets are never forgotten.- forget_end
The final iteration at which early parameter sets can be forgotten. Setting
forget_rate = Inf
(the default) means that the forgetting mechanism continues throughout the chains. Forgetting early parameter sets becomes less useful once the chains have settled into the posterior mode, so this parameter might be set as an estimate of how long that would take.- adapt_end
The final iteration at which we can adapt the multivariate normal proposal. Thereafter the empirical variance-covariance matrix, its scaling and its weight remain fixed. This allows the adaptation to be switched off at a certain point to help ensure convergence of the chain.
- pre_diminish
The number of updates before adaptation of the scaling parameter starts to diminish. Setting
pre_diminish = 0
means there is diminishing adaptation of the scaling parameter from the offset, whilepre_diminish = Inf
would mean there is never diminishing adaptation. Diminishing adaptation should help the scaling parameter to converge better, but while the chains find the location and scale of the posterior mode it might be useful to explore with it switched off.- boundaries
Control the behaviour of proposals that are outside the model domain. The supported options are:
"reflect" (the default): we reflect proposed parameters that lie outside the domain back into the domain (as many times as needed)
"reject": we do not evaluate the density function, and return
-Inf
for its density instead."ignore": evaluate the point anyway, even if it lies outside the domain.
The initial point selected will lie within the domain, as this is enforced by monty_sample.
Value
A monty_sampler
object, which can be used with
monty_sample
Details
Efficient exploration of the parameter space during an MCMC might be difficult when the target distribution is of high dimensionality, especially if the target probability distribution present a high degree of correlation. Adaptive schemes are used to "learn" on the fly the correlation structure by updating the proposal distribution by recalculating the empirical variance-covariance matrix and rescale it at each adaptive step of the MCMC.
Our implementation of an adaptive MCMC algorithm is based on an adaptation of the "accelerated shaping" algorithm in Spencer (2021). The algorithm is based on a random-walk Metropolis-Hastings algorithm where the proposal is a multi-variate Normal distribution centred on the current point.
Spencer SEF (2021) Accelerating adaptation in the adaptive Metropolis–Hastings random walk algorithm. Australian & New Zealand Journal of Statistics 63:468-484.
Examples
m <- monty_example("gaussian", matrix(c(1, 0.5, 0.5, 2), 2, 2))
vcv <- diag(2) * 0.1
# Sampling with a random walk
s_rw <- monty_sampler_random_walk(vcv)
res_rw <- monty_sample(m, s_rw, 1000)
s_adapt <- monty_sampler_adaptive(vcv)
res_adapt <- monty_sample(m, s_adapt, 1000)
plot(drop(res_adapt$density), type = "l", col = 4)
lines(drop(res_rw$density), type = "l", col = 2)
# Estimated vcv from the sampler at the end of the simulation
s_adapt$details[[1]]$vcv
#> NULL
coda::effectiveSize(coda::as.mcmc.list(res_rw))
#> a b
#> 27.12272 19.04789
coda::effectiveSize(coda::as.mcmc.list(res_adapt))
#> a b
#> 127.0028 106.9170