The dust random number generators are suitable for use from other packages and we provide a few helpers in both R and C++ to make this easier.

For illustration purposes we will assume that you want to estimate pi
using rejection sampling. The basic idea of this algorithm is simple;
sample two U(0, 1) points `x`

and `y`

and test if
they lie in the unit circle (i.e. `sqrt(x^2 + x^2) < 1`

)
giving the ratio of the area of a unit circle to a square. Multiplying
the fraction of the points that we accept by 4 yields our estimate of
pi.

First, an example that uses R’s API for example (note that while the R API is C, we’re using the cpp11 package interface here so that the following examples are similar):

```
#include <cpp11.hpp>
#include <R_ext/Random.h>
[[cpp11::register]]
double pi_r(int n) {
int tot = 0;
();
GetRNGstatefor (int i = 0; i < n; ++i) {
const double u1 = unif_rand();
const double u2 = unif_rand();
if (u1 * u1 + u2 * u2 < 1) {
++;
tot}
}
();
PutRNGstatereturn tot / static_cast<double>(n) * 4.0;
}
```

With cpp11 we can load this with `cpp11::cpp_source`

`cpp11::cpp_source("rng_pi_r.cpp")`

and then run it with

```
pi_r(1e6)
#> [1] 3.139504
```

The key bits within the code above are that we:

- included random number support from R via the
`R_ext/Random.h`

header - initialised the state of the global random number stream within our
C++ context with
`GetRNGstate`

before drawing any random numbers - drew some possible large number of numbers using
`unif_rand`

- restored the random number state back to R.

Failure to run the `GetRNGstate`

/
`PutRNGstate`

will result in the stream not behaving
properly. This is explained in detail in the “Writing R Extensions”
manual.

The implementation in dust will look very similar to above, but the
way that we cope with the random number stream will look quite
different. With the R version above we are looking after R’s global
stream (stored in the variable `.Random.seed`

) and making
sure that it is fetched and set on entry and exit to the function.

One of the design ideas in dust is that there is no single global source of random numbers, so we need to create a source that our function can use. If we were to use the simulation function multiple times we would want the stream to pick up where it left off last time, so the act of calling the function should update the seed as a “side effect”.

The way we expose this for use within other packages is that the user (either package developer or user of the package) creates a “pointer” to some random number state. Passing that state into a C++ function will allow use of the random functions within dust, and will update the state correctly (see the following section for details).

First we create a pointer object:

```
rng <- dust:::dust_rng_pointer$new()
rng
#> <dust_rng_pointer>
#> Public:
#> algorithm: xoshiro256plus
#> initialize: function (seed = NULL, n_streams = 1L, long_jump = 0L, algorithm = "xoshiro256plus")
#> is_current: function ()
#> n_streams: 1
#> state: function ()
#> sync: function ()
#> Private:
#> is_current_: TRUE
#> ptr_: externalptr
#> state_: 2f d6 21 97 9b 4f 43 00 59 87 46 42 d6 ff e5 e4 e8 de d7 ...
```

Unlike the `dust::dust_rng`

object there are no real
useful methods on this object and from the R side we’ll treat it as a
black box. Importantly the `rng`

object knows which algorithm
it has been created to use

```
rng$algorithm
#> [1] "xoshiro256plus"
```

The default will be suitable for most purposes.

We can rewrite the pi approximation function as:

```
#include <cpp11.hpp>
#include <dust/r/random.hpp>
[[cpp11::linking_to(dust)]]
[[cpp11::register]]
double pi_dust(int n, cpp11::sexp ptr) {
auto rng =
::random::r::rng_pointer_get<dust::random::xoshiro256plus>(ptr);
dustauto& state = rng->state(0);
int tot = 0;
for (int i = 0; i < n; ++i) {
const double u1 = dust::random::random_real<double>(state);
const double u2 = dust::random::random_real<double>(state);
if (u1 * u1 + u2 * u2 < 1) {
++;
tot}
}
return tot / static_cast<double>(n) * 4.0;
}
```

This snippet looks much the same as above:

- We’ve added
`[[cpp::linking_to(dust)]]`

and included the dust random interface (`dust/r/random.hpp`

) - The first line of the function safely creates a pointer to the
random state data. The template argument here
(
`<dust::random::xoshiro256plus>`

) refers to the rng algorithm and matches`rng$algorithm`

- The second line extracts a reference to the first (C++ indexing
starting at 0) random number stream - this pair of lines is roughly
equivalent to
`GetRNGstate()`

except that that the random numbers do not come from some global source - After that the listing proceeds as before proceeds as before, except
there is no equivalent to
`PutRNGstate()`

because the pointer object takes care of this automatically.

`cpp11::cpp_source("rng_pi_dust.cpp")`

and then run it with

```
pi_dust(1e6, rng)
#> [1] 3.14206
```

The C++ interface is described in more detail in the online documentation

Part of the point of dust’s random number generators is that they create independent streams of random numbers that can be safely used in parallel.

```
#include <cpp11.hpp>
#include <dust/r/random.hpp>
#ifdef _OPENMP
#include <omp.h>
#endif
[[cpp11::linking_to(dust)]]
[[cpp11::register]]
double pi_dust_parallel(int n, cpp11::sexp ptr, int n_threads) {
auto rng =
::random::r::rng_pointer_get<dust::random::xoshiro256plus>(ptr);
dustconst auto n_streams = rng->size();
int tot = 0;
#ifdef _OPENMP
#pragma omp parallel for schedule(static) num_threads(n_threads) \
reduction(+:tot)
#endif
for (size_t i = 0; i < n_streams; ++i) {
auto& state = rng->state(0);
int tot_i = 0;
for (int i = 0; i < n; ++i) {
const double u1 = dust::random::random_real<double>(state);
const double u2 = dust::random::random_real<double>(state);
if (u1 * u1 + u2 * u2 < 1) {
++;
tot_i}
}
+= tot_i;
tot }
return tot / static_cast<double>(n * n_streams) * 4.0;
}
```

`cpp11::cpp_source("rng_pi_parallel.cpp")`

Here we’ve made a number of decisions about how to split the problem up subject to a few constraints about using OpenMP together with R:

- Generally speaking we want the answer to be independent of the
number of threads used to run it, as this will vary in different
sessions. As such avoid the temptation to do a loop over the threads;
here we instead iterate over
*streams*with the idea that there will be one or more streams used per threads. If we ran a single thread we’d get the same answer as if we ran one thread per stream. - Each thread is going to do it’s own loop of length
`n`

so we need to divide by`n * n_streams`

at the end as that’s many attempts we have made. - We use OpenMP’s
`reduction`

clause to safely accumulate the different subtotals (the`tot_i`

values) into one`tot`

value. - In order to compile gracefully on machines that do not have OpenMP
support both the
`#include <omp.h>`

line and the`#pragma omp`

line are wrapped in guards that test for`_OPENMP`

(see “Writing R Extensions”). - We let the generator tell us how many streams it has
(
`n_streams = rng->size()`

) but we could as easily specify an ideal number of streams as an argument here and then test that the generator has*at least that many*by adding an argument to the call to`rng_pointer_get`

(e.g., if we wanted`m`

streams the call would be`rng_pointer_get<type>(ptr, m)`

)

```
rng <- dust:::dust_rng_pointer$new(n_streams = 20)
pi_dust_parallel(1e6, rng, 4)
#> [1] 3.140963
```

Unfortunately `cpp11::cpp_source`

does not support using OpenMP so in the example above the code will
run in serial and we can’t see if parallelisation will help.

In order to compile with support, we need to build a little package
and set up an appropriate `Makevars`

file

The package is fairly minimal:

```
#> .
#> ├── DESCRIPTION
#> ├── NAMESPACE
#> └── src
#> ├── Makevars
#> └── code.cpp
```

We have an extremely minimal `DESCRIPTION`

, which contains
line `LinkingTo: cpp11, dust`

from which R will arrange
compiler flags to find both packages’ headers:

```
Package: piparallel
LinkingTo: cpp11, dust
Version: 0.0.1
SystemRequirements: C++11
```

The `NAMESPACE`

loads the dynamic library

```
useDynLib("piparallel", .registration = TRUE)
exportPattern("^[[:alpha:]]+")
```

The `src/Makevars`

file contains important flags to pick
up OpenMP support:

```
PKG_CXXFLAGS=-DHAVE_INLINE $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS=$(SHLIB_OPENMP_CXXFLAGS)
```

And `src/code.cpp`

contains the file above but without the
`[[cpp11::linking_to(dust)]]`

line:

```
#include <cpp11.hpp>
#include <dust/r/random.hpp>
#ifdef _OPENMP
#include <omp.h>
#endif
[[cpp11::register]]
double pi_dust_parallel(int n, cpp11::sexp ptr, int n_threads) {
auto rng =
::random::r::rng_pointer_get<dust::random::xoshiro256plus>(ptr);
dustconst auto n_streams = rng->size();
int tot = 0;
#ifdef _OPENMP
#pragma omp parallel for schedule(static) num_threads(n_threads) \
reduction(+:tot)
#endif
for (size_t i = 0; i < n_streams; ++i) {
auto& state = rng->state(0);
int tot_i = 0;
for (int i = 0; i < n; ++i) {
const double u1 = dust::random::random_real<double>(state);
const double u2 = dust::random::random_real<double>(state);
if (u1 * u1 + u2 * u2 < 1) {
++;
tot_i}
}
+= tot_i;
tot }
return tot / static_cast<double>(n * n_streams) * 4.0;
}
```

After compiling and installing the package,
`pi_dust_parallel`

will be available

Now we have a parallel version we can see a speed-up as we add threads:

```
rng <- dust:::dust_rng_pointer$new(n_streams = 20)
bench::mark(
pi_dust_parallel(1e6, rng, 1),
pi_dust_parallel(1e6, rng, 2),
pi_dust_parallel(1e6, rng, 3),
pi_dust_parallel(1e6, rng, 4),
check = FALSE)
#> # A tibble: 4 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 pi_dust_parallel(1e+06, rng, 1) 46.4ms 47.2ms 21.2 0B 0
#> 2 pi_dust_parallel(1e+06, rng, 2) 23.1ms 23.8ms 42.1 0B 0
#> 3 pi_dust_parallel(1e+06, rng, 3) 16.1ms 16.2ms 60.6 0B 0
#> 4 pi_dust_parallel(1e+06, rng, 4) 11.6ms 12.5ms 73.6 0B 0
```

This section aims to de-mystify the pointer objects a little. The
dust random number state is a series of integers (by default 64-bit
unsigned integers) that are updated each time a state is drawn (see
`vignette("rng.Rmd")`

). We expose this state to R as a vector
of “raw” values (literally a series of bytes of data).

```
rng <- dust::dust_rng$new(seed = 1)
rng$state()
#> [1] c1 5c 02 89 ec 2d 0a 91 1e 61 39 74 08 ab 41 5e c3 86 8d 6d f4 02 8a b1 56
#> [26] 49 ee d9 dd 95 81 e2
```

When numbers are drawn from the stream, the state is modified as a side-effect:

```
rng$random_real(20)
#> [1] 0.451351392 0.073534657 0.218129406 0.200139527 0.017172743 0.323288520
#> [7] 0.338823899 0.862898581 0.005702738 0.006453255 0.844823829 0.222676329
#> [13] 0.665130023 0.283016916 0.502155564 0.290045309 0.055841255 0.083163285
#> [19] 0.391729373 0.744045249
rng$state()
#> [1] 2c ed 49 d0 8f 7e 97 2c 2a 5e 9e a1 78 85 47 80 06 d5 05 38 a3 5b 08 12 59
#> [26] 12 03 5c b3 ea 87 c8
```

The same happens with our `dust_rng_pointer`

objects used
above:

```
ptr <- dust::dust_rng_pointer$new(seed = 1)
ptr$state()
#> [1] c1 5c 02 89 ec 2d 0a 91 1e 61 39 74 08 ab 41 5e c3 86 8d 6d f4 02 8a b1 56
#> [26] 49 ee d9 dd 95 81 e2
```

Note that `ptr`

starts with the same state here as
`rng`

did because we started from the same seed. When we draw
20 numbers from the stream (by drawing 10 pairs of numbers with our
pi-estimation algorithm), we will advance the state

```
pi_dust(10, ptr)
#> [1] 4
ptr$state()
#> [1] 2c ed 49 d0 8f 7e 97 2c 2a 5e 9e a1 78 85 47 80 06 d5 05 38 a3 5b 08 12 59
#> [26] 12 03 5c b3 ea 87 c8
```

Note that the state here now matches the value returned by
`rng`

.

Normally nobody needs to know this - treat the pointer as an object that you pass to functions and ignore the details.