odin functions

odin2 supports many functions that you’d expect to see for constructing dynamical models. These include most common mathematical operations and some that are quite obscure. The support for stochastic models and comparison to data comes from monty.

Basic operations

+ – Plus: Both infix (a + b) and prefix (+a) versions supported (e.g., 1 + 2 → 3)
- – Minus: Both infix (a - b) and prefix (-a) versions supported (e.g., 10 - 1 → 9)
* – Multiply: Multiply two numbers together (e.g., 2 * 6 → 12)
/ – Divide: Divide two numbers (e.g., 12 / 6 → 2)
^ – Power: Raise the first number to the power of the second. Either number may be a floating point number (e.g., 2.3 ^ 1.2 → 2.716898)
( – Parenthesis: Group expressions together (e.g., (1 + 5) * 2 → 12)

Conditionals

You can only use conditionals with if as an inline expression, for example

a <- if (9 > 10) 1 else 2

would result in a being assigned as 2 (this works in R normally too!).

Because general flow control is not supported, you cannot write:

if (9 > 10) {
  a <- 1
} else {
  a <- 2
}

Operators

A number of logical-returning operators exist, primarily to support the if statement; all the usual comparison operators exist (though not vectorised | or &).

> – Greater than (e.g., 1 > 2 → FALSE)
< – Less than (e.g., 1 < 2 → TRUE)
>= – Greater than or equal to (e.g., 1 >= 2 → FALSE)
<= – Less than or equal to (e.g., 1 <= 2 → TRUE)
== – Is exactly equal to (e.g., 1 == 1 → TRUE)
!= – Is not exactly equal to (e.g., 1 != 2 → TRUE)
&& – Boolean AND (e.g., (1 == 1) && (2 > 1) → TRUE)
|| – Boolean OR (e.g., (1 == 1) && (2 > 1) → TRUE)

Be wary of strict equality with == or != as numbers may be floating point numbers, which have some surprising properties for the uninitiated, for example

sqrt(3)^2 == 3

## [1] FALSE

Mathematical functions

%% – Modulo: Finds the remainder after division of one number by another (e.g., 123 %% 100 → 23)
%/% – Integer divide: Different to floating point division, effectively the full number of times one number divides into another (e.g., 20 %/% 7 → 2)
abs – Absolute value (e.g., abs(-1) → 1)
sign – Sign function: Returns the sign of its argument as either -1, 0 or 1, which may be useful for multiplying by another argument (e.g., sign(-100) → -1)
round – Round a number (e.g., round(1.23) → 1; round(1.23, 1) → 1.2)
floor – Floor of a number: Largest integer not greater than the provided number (e.g., floor(6.5) → 6)
ceiling – Ceiling of a number: Smallest integer not less than the provided number (e.g., ceiling(6.5) → 7)
trunc – Truncate a number: Round a number towards zero
max – Maximum: Returns maximum of two numbers (e.g., max(2, 6) → 6)
min – Minimum (e.g., min(2, 6) → 2)
exp – Exponential function (e.g., exp(1) → 2.718282)
expm1 – Computes exp(x) - 1 accurately for small |x| (e.g., exp(1) → 1.718282)
log – Logarithmic function in base e (e.g., log(1) → 0)
log2 – Logarithmic function in base 2 (e.g., log2(1024) → 10)
log10 – Logarithmic function in base 10 (e.g., log10(1000) → 3)
log1p – Computes log(x + 1) accurately for small |x| (e.g., log1p(1) → 0.6931472)
sqrt – Square root function (e.g., sqrt(4) → 2)
beta – Beta function (e.g., beta(3, 5) → 0.00952381)
lbeta – Log beta function (e.g., lbeta(3, 5) → -4.65396)
choose – Binomial coefficients (e.g., choose(60, 3) → 34220)
lchoose – Log binomial coefficients (e.g., choose(60, 3) → 10.44057)
gamma – Gamma function (e.g., gamma(10) → 362880)
lgamma – Log gamma function (e.g., lgamma(10) → 12.80183)

The exact behaviour for %% and %/% for floating point numbers and negative numbers is complicated - please see ?Arithmetic. The rules for operators in odin try to follow those in R as closely as possible.

All the usual trig functions are also available:

cos – Cosine function
sin – Sine function
tan – Tangent function
acos – Arc-cosine function
asin – Arc-sin function
atan – Arc-tangent function
atan2 – Two-argument arc-tangent function
cosh – Hyperbolic cosine function
sinh – Hyperbolic sine function
tanh – Hyperbolic tangent function
acosh – Hyperbolic arc-cosine function
asinh – Hyperbolic arc-sine function
atanh – Hyperbolic arc-tangent function

Arrays

Use of arrays implies a “for-loop” in the generated code. For example, you might write a vectorised version of the logistic map as:

update(y[]) <- r[i] * y[i] * (1 - y[i])

which will expand to code equivalent to

for (i in 1:length(y)) {
  y_next[i] <- r[i] * y[i] * (1 - y[i])
}

The loop extent here (over the entire range of y) is determined by the left hand side expression (y[]). This enables use of i on the right hand side to index as loop progresses.

The indices on the right hand side can be i, j, k, l, i5, i6, i7 or i8 (odin supports arrays up to 8 dimensions: do let us know if you need more for some reason).

Arrays can have more than one dimension, for example the expression:

ay[, ] <- a[i, j] * y[j]

involves loops over two dimensions because we loop over the whole extent of ay which is a matrix. This is roughly equivalent to:

for (i in 1:nrow(ay)) {
  for (j in 1:ncol(ay)) {
     ay[i, j] <- a[i, j] * y[j]
  }
}

Note here that y is accessed using j, even though it is only a vector. This is because the loop extents are generated by the left hand side.

Array size

Every array variable requires a dim() assignment. For example, in the above examples we might have:

dim(y) <- 10
dim(ay) <- c(nr, nc)

where y is defined to be a 1-dimensional array of length 10 and ay is a matrix (2-dimensional array) with nr rows and nc columns. The extents of arrays must be determined at the first system initialisation, and this is checked during parse.

Special functions for arrays

We provide several functions for retrieving dimensions from an array

length – Length: get the full length of an array. For a single dimensional array this is obvious, for a multidimensional array it is the product over all dimensions.
nrow – Number of rows: number of rows in a matrix or number of elements in the first dimension of a multidimensional array
ncol – Number of columns: number of columns in a matrix or number of elements in the second dimension of a multidimensional array

We do not currently offer a function for accessing the size of higher dimensions, please let us know if this is an issue (see vignette("migrating"))

Frequently, you will want to take a sum over an array, or part of an array, using sum. To sum over all elements of an array, use sum() with the name of the array you would like to sum over:

dim(x) <- 10
x_tot <- sum(x)

If m is a matrix you can compute the sums over the second column by writing:

m1_tot <- sum(m[, 2])

This partial sum approach is frequently used within implicit loops:

m_col_totals[] <- sum([, i])

You can use this approach to compute a matrix-vector product $\mathbf(Ax)$ :

ax_tmp[, ] <- a[i, j] * x[j]
ax[] <- sum(a[, i])

Distribution functions

We support distribution functions in two places:

First, for discrete-time models we support sampling from a distribution at each time step. For example:

a <- Normal(0, 1)

will assign a to a draw from the standard normal distribution. You cannot use these functions from continuous time models. You cannot sample from stochastic functions in a continuous time (ODE) model.

Second, for comparison to data, for example:

a ~ Normal(0, 1)

will add a log likelihood term looking up the log density of the data element a against a standard normal distribution. This form can be used in both discrete-time and continuous-time models. For more information, see vignette("fitting").

Some distributions have several versions; these are distinguished by the arguments to the functions. For example:

a <- Gamma(2, 0.1)
a <- Gamma(shape = 2, rate = 0.1)

draw from a Gamma distribution with a shape of 2 and a rate of 0.1, while

a <- Gamma(2, scale = 10)
a <- Gamma(shape = 2, scale = 10)

draw from a Gamma distribution with a shape of 2 and a scale of 10.

The currently supported distributions are (alphabetically):

Beta – the Beta distribution with parameters a and b (vs rbeta’s shape1 and shape2)
Binomial – the binomial distribution with parameters size and prob
Exponential – the exponential distribution with two forms:
- rate (default); this is the same parameterisation as rexp
- mean which is the inverse of rate. NOTE: we may change this to scale soon
Gamma – the gamma distribution with two forms:
- shape, rate (default)
- shape, scale
Hypergeometric – the hypergeometric distribution with parameters m (number of white balls), n (number of black balls), and k (number of samples), and we return the number of white balls. We may support alternative parametrisations of this distribution in future (this version is the same parametrisation as rhyper)
Normal – the normal distribution with parameters mean, sd
Poisson – the Poisson distribution with parameter lambda (the mean)
Uniform – the uniform distribution with parameters min and max

In the future, we plan support for additional distributions, please let us know if we are missing any that you need. The support for these functions comes from monty and we will link here to the docs in that package once they exist for additional details.

Semantics of random number draws

Stochastic functions are called for each element in an array they are assigned to, at each time. So here:

x[] <- Normal(0, 1)

x will be filled with each element having a different draw from a standard normal. In contrast, in:

a <- Normal(0, 1)
x[] <- a

x will be a vector where every element is the same, the result of a single draw from a standard normal.

Rich FitzJohn

2024-09-18