Create a dust model from a C++ input file. This function will
compile the dust support around your model and return an object
that can be used to work with the model (see the Details below,
and dust_generator
).
dust(
filename,
quiet = FALSE,
workdir = NULL,
gpu = FALSE,
real_type = NULL,
linking_to = NULL,
cpp_std = NULL,
compiler_options = NULL,
optimisation_level = NULL,
skip_cache = FALSE
)
The path to a single C++ file
Logical, indicating if compilation messages from
pkgbuild
should be displayed. Error messages will be
displayed on compilation failure regardless of the value used.
Optional working directory to use. If NULL
uses a temporary directory. By using a different directory of
your choosing you can see the generated code.
Logical, indicating if we should generate GPU
code. This requires a considerable amount of additional software
installed (CUDA toolkit and drivers) as well as a
CUDA-compatible GPU. If TRUE
, then we call
dust_cuda_options with no arguments. Alternatively, call
that function and pass the value here (e.g, gpu = dust::dust_cuda_options(debug = TRUE)
). Note that due to the
use of the __syncwarp()
primitive this may require a GPU with
compute version 70 or higher.
Optionally, a string indicating a substitute type to
swap in for your model's real_type
declaration. If given, then we
replace the string using real_type = (double|float)
with the
given type. This is primarily intended to be used as gpu = TRUE, real_type = "float"
in order to create model for the GPU
that will use 32 bit floats
(rather than 64 bit doubles, which
are much slower). For CPU models decreasing precision of your
real type will typically just decrease precision for no
additional performance.
Optionally, a character vector of additional
packages to add to the DESCRIPTION
's LinkingTo
field. Use
this when your model pulls in C++ code that is packaged within
another package's header-only library.
The C++ standard to use, if you need to set one
explicitly. See the section "Using C++ code" in "Writing R
extensions" for the details of this, and how it interacts with
the R version currently being used. For R 4.0.0 and above, C++11
will be used; as dust depends on at least this version of R you
will never need to specify a version this low. Sensible options
are C++14
, C++17
, etc, depending on the features you need
and what your compiler supports.
A character vector of additional options
to pass through to the C++ compiler. These will be passed
through without any shell quoting or validation, so check the
generated commands and outputs carefully in case of error. Note
that R will apply these before anything in your personal
Makevars
.
A shorthand way of specifying common
compiler options that control optimisation level. By default
(NULL
) no options are generated from this, and the
optimisation level will depend on your user Makevars
file.
Valid options are none
which disables optimisation (-O0
),
which will be faster to compile but much slower, standard
which enables standard level of optimisation (-O2
), useful if
your Makevars/pkgload configuration is disabling optimisation,
or max
(-O3
and --fast-math
) which enables some
slower-to-compile and potentially unsafe
optimisations. These options are applied after
compiler_options
and may override options provided there.
Note that as for compiler_options
, R will apply these before
anything in your personal Makevars
Logical, indicating if the cache of previously
compiled models should be skipped. If TRUE
then your model will
not be looked for in the cache, nor will it be added to the
cache after compilation.
A dust_generator
object based on your source files
Your input dust model must satisfy a few requirements.
Define some class that implements your model (below model
is
assumed to be the class name)
That class must define a type internal_type
(so
model::internal_type
) that contains its internal data that the
model may change during execution (i.e., that is not shared
between particles). If no such data is needed, you can do
using internal_type = dust::no_internal;
to indicate this.
We also need a type shared_type
that contains constant internal
data is shared between particles (e.g., dimensions, arrays that
are read but not written). If no such data is needed, you can do
using share_type = dust::no_shared;
to indicate this.
That class must also include a type alias that describes the
model's floating point type, real_type
. Most models can include
using real_type = double;
in their public section.
The class must also include a type alias that describes the model's
data type. If your model does not support data, then write
using data_type = dust::no_data;
, which disables the
compare_data
and set_data
methods. Otherwise see
vignette("data")
for more information.
The class must have a constructor that accepts const dust::pars_type<model>& pars
for your type model
. This will have
elements shared
and internal
which you can assign into your
model if needed.
The model must have a method size()
returning size_t
which
returns the size of the system. This size may depend on values
in your initialisation object but is constant within a model
run.
The model must have a method initial
(which may not be
const
), taking a time step number (size_t
) and returning a
std::vector<real_type>
of initial state for the model.
The model must have a method update
taking arguments:
size_t time
: the time step number
const double * state
: the state at the beginning of the
time step
dust::rng_state_type<real_type>& rng_state
: the dust random number
generator state - this must be a reference, as it will be modified
as random numbers are drawn
double *state_next
: the end state of the model
(to be written to by your function)
Your update
function is the core here and should update the
state of the system - you are expected to update all values of
state
on return.
It is very important that none of the functions in the class use the R API in any way as these functions will be called in parallel.
You must also provide a data/parameter-wrangling function for
producing an object of type dust::pars_type<model>
from an R list. We
use cpp11 for this. Your function will look like:
namespace dust {
template <>
dust::pars_type<model> dust_pars<model>(cpp11::list pars) {
// ...
return dust::pars_type<model>(shared, internal);
}
}
With the body interacting with pars
to create an object of type
model::shared_type
and model::internal_type
before returning the
dust::pars_type
object. This function will be called in serial
and may use anything in the cpp11 API. All elements of the
returned object must be standard C/C++ (e.g., STL) types and
not cpp11/R types. If your model uses only shared or internal,
you may use the single-argument constructor overload to
dust::pars_type
which is equivalent to using dust::no_shared
or
dust::no_internal
for the missing argument.
Your model may provided a template specialisation
dust::dust_info<model>()
returning a cpp11::sexp
for
returning arbitrary information back to the R session:
namespace dust {
template <>
cpp11::sexp dust_info<model>(const dust::pars_type<sir>& pars) {
return cpp11::wrap(...);
}
}
What you do with this is up to you. If not present then the
info()
method on the created object will return NULL
.
Potential use cases for this are to return information about
variable ordering, or any processing done while accepting the
pars object used to create the pars fed into the particles.
You can optionally use C++ pseudo-attributes to configure the generated code. Currently we support two attributes:
[[dust::class(classname)]]
will tell dust the name of your
target C++ class (in this example classname
). You will need to
use this if your file uses more than a single class, as
otherwise will try to detect this using extremely simple
heuristics.
[[dust::name(modelname)]]
will tell dust the name to use for
the class in R code. For technical reasons this must be
alphanumeric characters only (sorry, no underscore) and must not
start with a number. If not included then the C++ type name will
be used (either specified with [[dust::class()]]
or detected).
Your model should only throw exceptions as a last resort. One such
last resort exists already if rbinom
is given invalid inputs
to prevent an infinite loop. If an error is thrown, all
particles will complete their current run, and then the error
will be rethrown - this is required by our parallel processing
design. Once this happens though the state of the system is
"inconsistent" as it contains particles that have run for
different lengths of time. You can extract the state of the
system at the point of failure (which may help with debugging)
but you will be unable to continue running the object until
either you reset it (with $update_state()
). An error will be
thrown otherwise.
Things are worse on a GPU; if an error is thrown by the RNG code
(happens in rbinom
when given impossible inputs such as
negative sizes, probabilities less than zero or greater than 1)
then we currently use CUDA's __trap()
function which will
require a process restart to be able to use anything that uses
the GPU again, covering all methods in the class. However, this
is preferable to the infinite loop that would otherwise be
caused.
dust_generator
for a description of the class
of created objects, and dust_example()
for some
pre-built examples. If you want to just generate the code and
load it yourself with pkgload::load_all
or some other means,
see dust_generate
)
# dust includes a couple of very simple examples
filename <- system.file("examples/walk.cpp", package = "dust")
# This model implements a random walk with a parameter coming from
# R representing the standard deviation of the walk
writeLines(readLines(filename))
#> class walk {
#> public:
#> using real_type = double;
#> using data_type = dust::no_data;
#> using internal_type = dust::no_internal;
#> using rng_state_type = dust::random::generator<real_type>;
#>
#> struct shared_type {
#> real_type sd;
#> bool random_initial;
#> };
#>
#> walk(const dust::pars_type<walk>& pars) : shared(pars.shared) {
#> }
#>
#> size_t size() const {
#> return 1;
#> }
#>
#> std::vector<real_type> initial(size_t time, rng_state_type& rng_state) {
#> std::vector<real_type> ret = {0};
#> if (shared->random_initial) {
#> ret[0] = dust::random::normal<real_type>(rng_state, 0, shared->sd);
#> }
#> return ret;
#> }
#>
#> void update(size_t time, const real_type * state, rng_state_type& rng_state,
#> real_type * state_next) {
#> state_next[0] = state[0] +
#> dust::random::normal<real_type>(rng_state, 0, shared->sd);
#> }
#>
#> private:
#> dust::shared_ptr<walk> shared;
#> };
#>
#> namespace dust {
#>
#> template <>
#> dust::pars_type<walk> dust_pars<walk>(cpp11::list pars) {
#> walk::real_type sd = cpp11::as_cpp<walk::real_type>(pars["sd"]);
#> const bool random_initial = pars["random_initial"] == R_NilValue ? false :
#> cpp11::as_cpp<bool>(pars["random_initial"]);
#> return dust::pars_type<walk>(walk::shared_type{sd, random_initial});
#> }
#>
#> }
# The model can be compiled and loaded with dust::dust(filename)
# but it's faster in this example to use the prebuilt version in
# the package
model <- dust::dust_example("walk")
# Print the object and you can see the methods that it provides
model
#> <dust> object generator
#> Public:
#> initialize: function (pars, time, n_particles, n_threads = 1L, seed = NULL,
#> name: function ()
#> param: function ()
#> run: function (time_end)
#> simulate: function (time_end)
#> run_adjoint: function ()
#> set_index: function (index)
#> index: function ()
#> ode_control: function ()
#> ode_statistics: function ()
#> n_threads: function ()
#> n_state: function ()
#> n_particles: function ()
#> n_particles_each: function ()
#> shape: function ()
#> update_state: function (pars = NULL, state = NULL, time = NULL, set_initial_state = NULL,
#> state: function (index = NULL)
#> time: function ()
#> set_stochastic_schedule: function (time)
#> reorder: function (index)
#> resample: function (weights)
#> info: function ()
#> pars: function ()
#> rng_state: function (first_only = FALSE, last_only = FALSE)
#> set_rng_state: function (rng_state)
#> has_openmp: function ()
#> has_gpu_support: function (fake_gpu = FALSE)
#> has_compare: function ()
#> real_size: function ()
#> time_type: function ()
#> rng_algorithm: function ()
#> uses_gpu: function (fake_gpu = FALSE)
#> n_pars: function ()
#> set_n_threads: function (n_threads)
#> set_data: function (data, shared = FALSE)
#> compare_data: function ()
#> filter: function (time_end = NULL, save_trajectories = FALSE, time_snapshot = NULL,
#> gpu_info: function ()
#> Private:
#> pars_: NULL
#> pars_multi_: NULL
#> index_: NULL
#> info_: NULL
#> n_threads_: NULL
#> n_particles_: NULL
#> n_particles_each_: NULL
#> shape_: NULL
#> ptr_: NULL
#> gpu_config_: NULL
#> ode_control_: NULL
#> methods_: NULL
#> param_: NULL
#> reload_: NULL
#> Parent env: <environment: namespace:dust>
#> Locked objects: TRUE
#> Locked class: FALSE
#> Portable: TRUE
# Create a model with standard deviation of 1, initial time step zero
# and 30 particles
obj <- model$new(list(sd = 1), 0, 30)
obj
#> <dust>
#> Public:
#> compare_data: function ()
#> filter: function (time_end = NULL, save_trajectories = FALSE, time_snapshot = NULL,
#> gpu_info: function ()
#> has_compare: function ()
#> has_gpu_support: function (fake_gpu = FALSE)
#> has_openmp: function ()
#> index: function ()
#> info: function ()
#> initialize: function (pars, time, n_particles, n_threads = 1L, seed = NULL,
#> n_pars: function ()
#> n_particles: function ()
#> n_particles_each: function ()
#> n_state: function ()
#> n_threads: function ()
#> name: function ()
#> ode_control: function ()
#> ode_statistics: function ()
#> param: function ()
#> pars: function ()
#> real_size: function ()
#> reorder: function (index)
#> resample: function (weights)
#> rng_algorithm: function ()
#> rng_state: function (first_only = FALSE, last_only = FALSE)
#> run: function (time_end)
#> run_adjoint: function ()
#> set_data: function (data, shared = FALSE)
#> set_index: function (index)
#> set_n_threads: function (n_threads)
#> set_rng_state: function (rng_state)
#> set_stochastic_schedule: function (time)
#> shape: function ()
#> simulate: function (time_end)
#> state: function (index = NULL)
#> time: function ()
#> time_type: function ()
#> update_state: function (pars = NULL, state = NULL, time = NULL, set_initial_state = NULL,
#> uses_gpu: function (fake_gpu = FALSE)
#> Private:
#> gpu_config_: NULL
#> index_: NULL
#> info_: NULL
#> methods_: list
#> n_particles_: 30
#> n_particles_each_: 30
#> n_threads_: 1
#> ode_control_: NULL
#> param_: NULL
#> pars_: list
#> pars_multi_: FALSE
#> ptr_: externalptr
#> reload_: NULL
#> shape_: 30
# Curent state is all zero
obj$state()
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [,27] [,28] [,29] [,30]
#> [1,] 0 0 0 0
# Current time is also zero
obj$time()
#> [1] 0
# Run the model up to time step 100
obj$run(100)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1.562305 7.288367 5.84155 0.08509074 -7.07042 10.60426 -2.057119 -9.614804
#> [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
#> [1,] -8.294164 9.07455 19.80548 1.293768 -13.24191 -1.231157 8.591563 5.342956
#> [,17] [,18] [,19] [,20] [,21] [,22] [,23]
#> [1,] -9.857717 -8.946853 18.32985 7.39139 -1.159903 -15.64287 -15.16489
#> [,24] [,25] [,26] [,27] [,28] [,29] [,30]
#> [1,] 3.454845 -2.186743 7.231447 14.71851 -6.099696 -0.4988606 -0.6123435
# Reorder/resample the particles:
obj$reorder(sample(30, replace = TRUE))
# See the state again
obj$state()
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] -15.16489 1.293768 -7.07042 10.60426 8.591563 8.591563 0.08509074 10.60426
#> [,9] [,10] [,11] [,12] [,13] [,14] [,15]
#> [1,] -8.294164 -7.07042 -7.07042 3.454845 8.591563 -13.24191 7.288367
#> [,16] [,17] [,18] [,19] [,20] [,21] [,22]
#> [1,] -0.6123435 -15.16489 19.80548 -0.6123435 19.80548 -7.07042 -1.159903
#> [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30]
#> [1,] 1.293768 -0.4988606 7.288367 10.60426 0.08509074 5.342956 7.288367 5.84155