Parallel Tempering Sampler — monty_sampler_parallel_tempering

Create a "parallel tempering" sampler, which runs multiple chains at once to improve mixing, and which can take advantage of vectorisation or parallelisation if your underlying model supports it. It currently uses a random walk sampler (monty_sampler_random_walk) as the underlying sampler, but this will be made configurable in a future version.

Usage

monty_sampler_parallel_tempering(n_rungs, vcv, base = NULL)

Arguments

n_rungs: The number of extra chains to run; must be at least 1.
vcv: The variance covariance matrix for the random walk sampler.
base: An optional base model. This must be provided if your model cannot be automatically decomposed into prior + likelihood using monty_model_split, or if you are not working in a Bayesian context and want to use an alternative, easy-to-sample-from reference distribution.
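
To see why a base (reference) distribution is needed, it can help to write out the family of tempered densities that a parallel tempering scheme interpolates through. The block below is the usual textbook form, given as a sketch rather than a statement of how this package works internally. The symbols are notation introduced here: \pi_0 is the base, \pi the target, L the likelihood and \beta the inverse temperature of a rung.

```latex
\begin{align}
  \pi_\beta(x) &\propto \pi_0(x)^{1-\beta}\,\pi(x)^{\beta},
    \qquad 0 \le \beta \le 1, \\
  % Bayesian case: the target is prior times likelihood and the base is the prior
  \pi(x) \propto \pi_0(x)\,L(x)
    \;&\Longrightarrow\;
  \pi_\beta(x) \propto \pi_0(x)\,L(x)^{\beta}.
\end{align}
```

At \beta = 0 the chain samples the base and at \beta = 1 it samples the target, which is why the model must either split into prior and likelihood or come with an explicit base.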

Value

A monty_sampler object, which can be used with monty_sample.
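
A minimal sketch of constructing the sampler, using only the signature shown under Usage. The two-parameter vcv is a made-up example, and the call is guarded so that the snippet still runs where monty is not installed. Building the model itself is outside this sketch, so the monty_sample call is left as a comment with a placeholder `model`.

```r
# Proposal covariance for a hypothetical two-parameter model.
vcv <- diag(c(0.1, 0.2))

# A usable variance covariance matrix is symmetric and positive definite.
stopifnot(isSymmetric(vcv),
          all(eigen(vcv, symmetric = TRUE)$values > 0))

n_rungs <- 10
# The target chain plus the extra rungs are all advanced at every step.
n_chains_total <- n_rungs + 1

if (requireNamespace("monty", quietly = TRUE)) {
  sampler <- monty::monty_sampler_parallel_tempering(n_rungs = n_rungs,
                                                     vcv = vcv)
  # `model` is a placeholder for a monty model defined elsewhere:
  # samples <- monty::monty_sample(model, sampler, n_steps = 1000)
}
```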

Details

We implement the sampler based on https://doi.org/10.1111/rssb.12464

Efficiency of the sampler

A parallel tempering sampler runs a series of chains at the same time, so it does much more work than a simpler sampler. If you run with n_rungs = 10 you are running 11 chains in total (the target chain plus 10 extra), so each step does 11 times as much work as the underlying base sampler, and you want to make sure that this is paid back somewhere. There are a few places where this efficiency may come from:

Your model is parallelisable. If your underlying model can run very efficiently in parallel then it may not take much longer in "wall time" to run the extra copies of the calculations. In this case, you'll still be using much more CPU time but will be able to take advantage of extra cores to get more effective sampling if the parallel tempering sampler mixes better than the underlying sampler.
Your model is vectorised. If your model is implemented in R and vectorises the density calculations then it will generally not take much longer to compute many densities at once than a single one.
Your density is multimodal. If your density has distinct peaks, then most samplers will struggle to explore it well, and even with a non-parallelised, non-vectorised model the parallel tempering sampler will explore the space more efficiently. In the limit, a conventional sampler may only ever explore a single peak in a model with many such peaks, and so will never mix properly.
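
The multimodal and vectorised cases can be illustrated with a small parallel tempering loop written from scratch in base R. This is an illustrative sketch, not monty's implementation. The function run_pt, the evenly spaced temperature ladder, the alternating even/odd swap schedule and the toy bimodal target are all assumptions made for the example.

```r
set.seed(1)

# Toy bimodal likelihood: equal mixture of N(-8, 1) and N(8, 1).
# Vectorised in x and computed on the log scale to avoid underflow.
log_likelihood <- function(x) {
  a <- dnorm(x, mean = -8, log = TRUE)
  b <- dnorm(x, mean = 8, log = TRUE)
  m <- pmax(a, b)
  log(0.5) + m + log(exp(a - m) + exp(b - m))
}

# Easy-to-sample-from reference, playing the role of the prior or base.
log_base <- function(x) dnorm(x, mean = 0, sd = 10, log = TRUE)

# Tempered log density; vectorised over both x and beta.
log_tempered <- function(x, beta) log_base(x) + beta * log_likelihood(x)

# Hypothetical helper: n_rungs = 0 gives a plain random walk on the target.
run_pt <- function(n_steps, n_rungs, sd_proposal = 1) {
  beta <- seq(1, 0, length.out = n_rungs + 1) # beta[1] = 1 is the target
  n <- length(beta)
  x <- rep(-8, n)                             # every chain starts in the left mode
  out <- numeric(n_steps)
  for (i in seq_len(n_steps)) {
    # Local move: one random walk Metropolis step per chain, with all
    # densities evaluated in a single vectorised call.
    x_new <- x + rnorm(n, sd = sd_proposal)
    accept <- log(runif(n)) <
      log_tempered(x_new, beta) - log_tempered(x, beta)
    x[accept] <- x_new[accept]

    # Swap move: propose exchanging neighbouring chains, alternating
    # between odd-indexed and even-indexed pairs on successive steps.
    j <- seq_len(n - 1)
    j <- j[j %% 2 == i %% 2]
    ll <- log_likelihood(x)
    swap <- log(runif(length(j))) <
      (beta[j] - beta[j + 1]) * (ll[j + 1] - ll[j])
    j <- j[swap]
    x[c(j, j + 1)] <- x[c(j + 1, j)]

    out[i] <- x[1]                            # record the target chain only
  }
  out
}

n_steps <- 5000
rw <- run_pt(n_steps, n_rungs = 0)   # 1 chain
pt <- run_pt(n_steps, n_rungs = 10)  # 11 chains, so 11 times the density work

mean(rw > 0)  # fraction of random walk samples in the right-hand mode
mean(pt > 0)  # the same fraction for the tempered run
```

With modes this far apart, the plain random walk should remain in the mode it started in, while the tempered run should visit both. Because each step evaluates all rungs in one vectorised call, the extra rungs cost much less than 11 separate scalar evaluations would.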