Control for the pmcmc. This function constructs a list of options
and does some basic validation to ensure that the options will
work well together. Do not manually change the values in this
object. Do not refer to any argument except n_steps by position,
as the order of the arguments may change in future.
pmcmc_control(
n_steps,
n_chains = 1L,
n_threads_total = NULL,
n_workers = 1L,
rerun_every = Inf,
rerun_random = FALSE,
use_parallel_seed = FALSE,
save_state = TRUE,
save_restart = NULL,
save_trajectories = FALSE,
progress = FALSE,
nested_step_ratio = 1,
nested_update_both = FALSE,
filter_early_exit = FALSE,
restart_match = FALSE,
n_burnin = NULL,
n_steps_retain = NULL,
adaptive_proposal = NULL,
path = NULL
)
Number of MCMC steps to run. This is the only required argument.
Optional integer, indicating the number of chains to run. If more
than one then we run a series of chains and merge them with
pmcmc_combine(). Chains are run in series, with the same filter, if
n_workers is 1, or run in parallel otherwise.
The total number of threads (i.e., cores) to use. If n_workers is
greater than 1 then these threads will be divided evenly across
your workers at first and so n_threads_total must be a whole-number
multiple of n_workers. If n_chains is not a clean multiple of
n_workers we will try to allocate the leftover threads evenly
across the last wave of chains. This value must be provided if
n_workers is given, but is optional otherwise - if given it
overrides the value in the particle filter.
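The arithmetic behind this allocation can be sketched in base R. This is a rough illustration under stated assumptions, not mcstate's internal code: the helper name allocate_threads and its return fields are hypothetical, and it assumes chains are processed in strict "waves" of n_workers chains.

```r
# Illustrative only: a hypothetical helper, not part of mcstate.
allocate_threads <- function(n_chains, n_workers, n_threads_total) {
  # The constraints described above
  stopifnot(n_workers <= n_chains,
            n_threads_total %% n_workers == 0)
  # Assume chains are processed in waves of up to n_workers at a time
  n_waves <- ceiling(n_chains / n_workers)
  # Number of chains in the last wave (may be fewer than n_workers)
  n_last <- n_chains - (n_waves - 1) * n_workers
  list(
    threads_per_worker = n_threads_total / n_workers,
    n_waves = n_waves,
    chains_in_last_wave = n_last,
    # Threads shared among the chains still running in the last wave
    threads_per_chain_last_wave = n_threads_total %/% n_last
  )
}

# 8 chains over 4 workers: two full waves, 4 threads per chain throughout
alloc <- allocate_threads(n_chains = 8, n_workers = 4, n_threads_total = 16)

# 6 chains over 4 workers: the last wave has only 2 chains, so each
# could in principle use more threads
alloc_uneven <- allocate_threads(n_chains = 6, n_workers = 4,
                                 n_threads_total = 16)
```

Choosing n_workers as a divisor of n_chains avoids the uneven last wave entirely.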
Number of "worker" processes to use to run chains in parallel. This
must be at most n_chains and is recommended to be a divisor of
n_chains. If n_workers is 1, then chains are run in series (i.e.,
one chain after the other). See the parallel vignette
(vignette("parallelisation", package = "mcstate")) for more details
about this approach.
Optional integer giving the frequency at which we should rerun the
particle filter on the current "accepted" state. The default for
this (Inf) will never rerun this point, but if you set it to 100,
then every 100 steps we run the particle filter on both the
proposed and previously accepted point before doing the comparison.
This may help "unstick" chains, at the cost of some bias in the
results.
Logical, controlling the behaviour of rerunning (when rerun_every
is finite). The default value of FALSE will rerun the filter
deterministically at a fixed number of iterations (given by
rerun_every). If TRUE, then we stochastically rerun at each step
with probability 1 / rerun_every. This gives the same expected
number of MCMC steps between reruns but a different pattern.
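The difference between the two rerun patterns can be sketched in base R. This is a sketch only: it assumes the deterministic schedule fires on steps that are multiples of rerun_every, which may not match mcstate's exact implementation, and the variable names are illustrative.

```r
set.seed(1)
n_steps <- 1000
rerun_every <- 100

# rerun_random = FALSE: rerun at fixed intervals (assumed here to be
# every step whose index is a multiple of rerun_every)
rerun_fixed <- seq_len(n_steps) %% rerun_every == 0

# rerun_random = TRUE: each step reruns independently with
# probability 1 / rerun_every
rerun_stochastic <- runif(n_steps) < 1 / rerun_every

# The fixed schedule gives exactly n_steps / rerun_every reruns; the
# stochastic schedule gives that number only in expectation
n_fixed <- sum(rerun_fixed)
n_stochastic <- sum(rerun_stochastic)
```

With rerun_every = Inf the fixed rule never fires, because no step index is a multiple of Inf, and the stochastic rule has probability 0.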
Logical, indicating if seeds should be configured in the same way
as when running workers in parallel (with n_workers > 1). Set this
to TRUE to ensure reproducibility if you use this option sometimes
(but not always). This option only has an effect if n_workers is 1.
Logical, indicating if the state should be saved at the end of the
simulation. If TRUE, then a single randomly selected particle's
state will be collected at the end of each MCMC step. This is the
full state (i.e., unaffected by any index used in the particle
filter) so that the process may be restarted from this point for
projections. If save_trajectories is TRUE the same particle will be
selected for each. The default is TRUE, which will cause
n_state * n_steps values to be output alongside your results. Set
this argument to FALSE to save space, or use pmcmc_thin() after
running the MCMC.
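To get a feel for the n_state * n_steps cost, a back-of-envelope calculation in base R may help. The model size below is hypothetical, and the estimate assumes values are stored as doubles (8 bytes each) and ignores any object overhead.

```r
n_state <- 5000    # hypothetical number of state variables in the model
n_steps <- 10000   # number of MCMC steps

# One double per state variable per step, 8 bytes per double
state_bytes <- n_state * n_steps * 8
state_mib <- state_bytes / 1024^2
```

For these illustrative sizes the saved state alone is several hundred MiB per chain, which is the kind of cost that save_state = FALSE or thinning is meant to avoid.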
An integer vector of time points to save restart information for;
this is in addition to save_state (which saves the final model
state) and saves the full model state. It will use the same
trajectory as save_state and save_trajectories. Note that if you
use this option you will end up with lots of model states and will
need to process them in order to actually restart the pmcmc or the
particle filter from this state. The integers correspond to the
time variable in your filter (see particle_filter for more
information).
Logical, indicating if the particle trajectories should be saved
during the simulation. If TRUE, then a single randomly selected
particle's trajectory will be collected at the end of each MCMC
step. This is the filtered state (i.e., using the state component
of index provided to the particle filter). If save_state is TRUE
the same particle will be selected for each.
Logical, indicating if a progress bar should be displayed, using
progress::progress_bar.
Either integer or 1/integer, which specifies the ratio of
fixed:varied steps in a nested pMCMC. For example 3 would run 3
steps proposing fixed parameters only and then 1 step proposing
varied parameters only; whereas 1/3 would run 3 varied steps for
every 1 fixed step. The default value of 1 runs an equal number of
iterations updating the fixed and varied parameters. Sensible
choices of this parameter may depend on the true ratio of
fixed:varied parameters or on desired run-time, for example
updating fixed parameters is quicker so more varied steps could be
more efficient.
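The alternation implied by a given ratio can be sketched in base R. The helper nested_schedule is hypothetical and only illustrates the pattern described above; in particular, which kind of step comes first within each cycle is an assumption.

```r
# Illustrative only: a hypothetical helper, not part of mcstate.
nested_schedule <- function(n_steps, nested_step_ratio) {
  if (nested_step_ratio >= 1) {
    # e.g. ratio 3: three fixed steps, then one varied step
    pattern <- c(rep("fixed", nested_step_ratio), "varied")
  } else {
    # e.g. ratio 1/3: three varied steps, then one fixed step
    pattern <- c(rep("varied", round(1 / nested_step_ratio)), "fixed")
  }
  # Repeat the cycle until n_steps steps have been scheduled
  rep_len(pattern, n_steps)
}

schedule_3 <- nested_schedule(8, 3)
schedule_third <- nested_schedule(8, 1 / 3)
schedule_1 <- nested_schedule(8, 1)
```

Tabulating the result, for example with table(schedule_3), shows the fixed:varied balance for a given run length.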
If FALSE (default) then the sampler alternates between proposing
fixed and varied parameter updates according to the ratio in
nested_step_ratio. If TRUE then it proposes fixed and varied
parameters simultaneously and collectively accepts/rejects them; in
this case nested_step_ratio is ignored.
Logical, indicating if we should allow the particle filter to exit early for points that will not be accepted. Only use this if your log-likelihood never increases between steps. This will be the case where your likelihood calculation is a sum of discrete normalised probability distributions, but may not be for continuous distributions!
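The reasoning behind the early-exit condition can be sketched in base R. If every per-time-point contribution to the log-likelihood is at most 0 (as for discrete normalised probabilities), the running total can only decrease, so once it falls below the level needed for acceptance the remaining work cannot rescue the point. The function below is a hypothetical illustration, not the filter's actual code; returning -Inf on early exit is an assumption.

```r
# Illustrative only: a hypothetical stand-in for a particle filter
# that accumulates log-likelihood contributions over time points.
filter_with_early_exit <- function(ll_increments, min_log_likelihood) {
  ll <- 0
  for (i in seq_along(ll_increments)) {
    ll <- ll + ll_increments[[i]]
    # Safe only if later increments are never positive
    if (ll < min_log_likelihood) {
      return(list(log_likelihood = -Inf, n_evaluated = i))
    }
  }
  list(log_likelihood = ll, n_evaluated = length(ll_increments))
}

# Log-probabilities of discrete outcomes: all <= 0
increments <- log(c(0.5, 0.1, 0.2, 0.4))

# Threshold crossed part-way through: the filter stops early
res_exit <- filter_with_early_exit(increments, min_log_likelihood = -4)

# Threshold never crossed: the full log-likelihood is returned
res_full <- filter_with_early_exit(increments, min_log_likelihood = -10)
```

With a continuous density an increment can be positive (a density can exceed 1), so a total that has dropped below the threshold could still recover; that is why the option is unsafe there.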
Logical, indicating whether the restart state saved from the particle filter should match the trajectory saved. Otherwise, the restart state will be randomly drawn from the states of the particle filter after filtering to the restart time point.
Optionally, the number of points to discard as burnin. This happens separately to the burnin in pmcmc_thin or pmcmc_sample. See Details.
Optionally, the number of samples to retain from the
n_steps - n_burnin steps. See Details.
Optionally, control over an adaptive proposal
(adaptive_proposal_control). Alternatively FALSE to disable, TRUE
to enable defaults. This is only valid for single-population
deterministic models.
Optional path to save partial pmcmc results in, when using workers.
If not given (or NULL) then a temporary directory is used.
A pmcmc_control object, which should not be modified once created.
pMCMC is slow and you will want to parallelise it if you possibly
can. There are two ways of doing this which are discussed in some
detail in vignette("parallelisation", package = "mcstate").
Generally it may be preferable to thin the chains after generation using pmcmc_thin or pmcmc_sample. However, waiting that long can create memory consumption issues because the trajectories can be very large. To avoid this, you can thin the chains at generation - this will avoid creating large trajectory arrays, but will discard some information irretrievably.
If either of the options n_burnin or n_steps_retain is provided,
then we will subsample the chain at generation.
If n_burnin is provided, then the first n_burnin (of n_steps)
samples are discarded. This must be at most n_steps.
If n_steps_retain is provided, then we evenly sample out of the
remaining samples. The algorithm will try to generate a sensible
set here, and will always include the last sample of n_steps but
may not always include the first post-burnin sample. An error will
be thrown if a suitable sampling is not possible (e.g., if
n_steps_retain is larger than n_steps - n_burnin).
If either of n_burnin or n_steps_retain is provided, the resulting
samples object will include the full set of parameters and
probabilities sampled, along with an index showing how they relate
to the filtered samples.
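One way to picture the subsampling is as an evenly spaced index built backwards from the final step. The base R sketch below is an assumption about how such an index could be constructed, not mcstate's actual algorithm; the helper name subsample_index is hypothetical, and the spacing variable is named after the n_steps_every field that appears in the printed control object.

```r
# Illustrative only: a hypothetical helper, not part of mcstate.
subsample_index <- function(n_steps, n_burnin = 0, n_steps_retain = NULL) {
  n_available <- n_steps - n_burnin
  if (is.null(n_steps_retain)) {
    n_steps_retain <- n_available
  }
  # This sketch needs at least one retained sample and cannot retain
  # more samples than remain after burnin
  stopifnot(n_steps_retain >= 1, n_steps_retain <= n_available)
  # Spacing between retained samples
  n_steps_every <- floor(n_available / n_steps_retain)
  # Work backwards from the last step so that it is always retained
  rev(seq(n_steps, by = -n_steps_every, length.out = n_steps_retain))
}

# Discard 100 steps, keep 100 of the remaining 900 (every 9th step)
idx <- subsample_index(n_steps = 1000, n_burnin = 100, n_steps_retain = 100)

# No thinning requested: every step is kept
idx_all <- subsample_index(n_steps = 1000)
```

In this sketch the retained steps run from 109 to 1000, so the last step is included but the first post-burnin step (101) is not, matching the behaviour described above.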
mcstate::pmcmc_control(1000)
#> $n_steps
#> [1] 1000
#>
#> $n_chains
#> [1] 1
#>
#> $n_workers
#> [1] 1
#>
#> $n_threads_total
#> NULL
#>
#> $rerun_every
#> [1] Inf
#>
#> $rerun_random
#> [1] FALSE
#>
#> $use_parallel_seed
#> [1] FALSE
#>
#> $save_state
#> [1] TRUE
#>
#> $save_restart
#> NULL
#>
#> $save_trajectories
#> [1] FALSE
#>
#> $progress
#> [1] FALSE
#>
#> $progress_simple
#> [1] FALSE
#>
#> $path
#> NULL
#>
#> $adaptive_proposal
#> NULL
#>
#> $filter_early_exit
#> [1] FALSE
#>
#> $restart_match
#> [1] FALSE
#>
#> $nested_update_both
#> [1] FALSE
#>
#> $nested_step_ratio
#> [1] 1
#>
#> $n_burnin
#> [1] 0
#>
#> $n_steps_retain
#> [1] 1000
#>
#> $n_steps_every
#> [1] 1
#>
#> attr(,"class")
#> [1] "pmcmc_control"
# Suppose we have a fairly large node with 16 cores and we want to
# run 8 chains. We can use all cores for a single chain and run
# the chains sequentially like this:
mcstate::pmcmc_control(1000, n_chains = 8, n_threads_total = 16)
#> $n_steps
#> [1] 1000
#>
#> $n_chains
#> [1] 8
#>
#> $n_workers
#> [1] 1
#>
#> $n_threads_total
#> [1] 16
#>
#> $rerun_every
#> [1] Inf
#>
#> $rerun_random
#> [1] FALSE
#>
#> $use_parallel_seed
#> [1] FALSE
#>
#> $save_state
#> [1] TRUE
#>
#> $save_restart
#> NULL
#>
#> $save_trajectories
#> [1] FALSE
#>
#> $progress
#> [1] FALSE
#>
#> $progress_simple
#> [1] FALSE
#>
#> $path
#> NULL
#>
#> $adaptive_proposal
#> NULL
#>
#> $filter_early_exit
#> [1] FALSE
#>
#> $restart_match
#> [1] FALSE
#>
#> $nested_update_both
#> [1] FALSE
#>
#> $nested_step_ratio
#> [1] 1
#>
#> $n_burnin
#> [1] 0
#>
#> $n_steps_retain
#> [1] 1000
#>
#> $n_steps_every
#> [1] 1
#>
#> attr(,"class")
#> [1] "pmcmc_control"
# However, on some platforms (e.g., Windows) this may only realise
# a 50% total CPU use, in which case you might benefit from
# splitting these chains over different worker processes (2-4
# workers is likely the largest useful number).
mcstate::pmcmc_control(1000, n_chains = 8, n_threads_total = 16,
                       n_workers = 4)
#> $n_steps
#> [1] 1000
#>
#> $n_chains
#> [1] 8
#>
#> $n_workers
#> [1] 4
#>
#> $n_threads_total
#> [1] 16
#>
#> $rerun_every
#> [1] Inf
#>
#> $rerun_random
#> [1] FALSE
#>
#> $use_parallel_seed
#> [1] FALSE
#>
#> $save_state
#> [1] TRUE
#>
#> $save_restart
#> NULL
#>
#> $save_trajectories
#> [1] FALSE
#>
#> $progress
#> [1] FALSE
#>
#> $progress_simple
#> [1] FALSE
#>
#> $path
#> NULL
#>
#> $adaptive_proposal
#> NULL
#>
#> $filter_early_exit
#> [1] FALSE
#>
#> $restart_match
#> [1] FALSE
#>
#> $nested_update_both
#> [1] FALSE
#>
#> $nested_step_ratio
#> [1] 1
#>
#> $n_burnin
#> [1] 0
#>
#> $n_steps_retain
#> [1] 1000
#>
#> $n_steps_every
#> [1] 1
#>
#> attr(,"class")
#> [1] "pmcmc_control"