Quick exact-marginal-likelihood estimate of the kernel hyperparameters
Estimates `(length_scale, periodic_scale, long_term_scale)` plus a noise-to-signal ratio by maximising the exact GP marginal likelihood of a plug-in latent field, using the Kronecker eigendecomposition so the full `(n * nt)`-square covariance is never formed. Fast and deterministic: no MCMC, no iterative solver.
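The Kronecker eigendecomposition idea can be illustrated with a minimal sketch. It assumes a separable covariance `Kt %x% Ks` plus a nugget times the identity, with the data arranged as an `n` by `nt` matrix; the helper `kron_loglik()` and the toy squared-exponential kernels are hypothetical and are not the package's internal code. Only the `n`-square and `nt`-square factors are eigendecomposed, and the result is checked against a dense evaluation on a small example.

```r
# Hypothetical helper for illustration only (not the package's internals).
# Y: n x nt data matrix; Ks: n x n spatial kernel; Kt: nt x nt temporal kernel.
kron_loglik <- function(Y, Ks, Kt, noise) {
  es <- eigen(Ks, symmetric = TRUE)
  et <- eigen(Kt, symmetric = TRUE)
  # Eigenvalues of Kt %x% Ks + noise * I, laid out as an n x nt matrix.
  lam <- outer(es$values, et$values) + noise
  # Rotate the data into the joint eigenbasis: t(Us) %*% Y %*% Ut.
  A <- crossprod(es$vectors, Y) %*% et$vectors
  -0.5 * (sum(log(lam)) + sum(A^2 / lam) + length(Y) * log(2 * pi))
}

set.seed(1)
n <- 4
nt <- 5
sites <- matrix(runif(n * 2), n, 2)
times <- seq_len(nt)
Ks <- exp(-as.matrix(dist(sites))^2 / 2)  # toy spatial kernel (assumption)
Kt <- exp(-as.matrix(dist(times))^2 / 2)  # toy temporal kernel (assumption)
Y <- matrix(rnorm(n * nt), n, nt)

# Dense reference: forms the full (n * nt)-square covariance explicitly.
K <- kronecker(Kt, Ks) + 0.1 * diag(n * nt)
y <- as.vector(Y)
dense <- -0.5 * (as.numeric(determinant(K)$modulus) +
                   sum(y * solve(K, y)) + length(y) * log(2 * pi))

fast <- kron_loglik(Y, Ks, Kt, noise = 0.1)
isTRUE(all.equal(fast, dense))
```

In this sketch the two eigendecompositions cost on the order of `n^3 + nt^3` operations, whereas the dense reference factorises an `(n * nt)`-square matrix.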
Usage
infer_kernel_params(
  obs_data,
  coordinates,
  nt,
  period,
  value = "y_obs",
  standardise = TRUE,
  priors = default_kernel_priors(),
  start = c(length_scale = 1, periodic_scale = 1, long_term_scale = 100, nugget_ratio = 0.1),
  n_sites = NULL
)
Arguments
- obs_data
Data frame with `id` (site), `t` (time) and the count column named by `value`. `t` is a numeric time index whose *differences* encode real elapsed time, so gaps and uneven spacing between time points are modelled as genuine time distances (use e.g. weeks or days since a reference). [gp_predict()] must be given the same `t` encoding.
- coordinates
Site coordinates (data frame with `lon`, `lat`), ordered to match `sort(unique(obs_data$id))`.
- nt
Number of time points.
- period
Period of the seasonal cycle, in the same units as `t`.
- value
Name of the count column (default `"y_obs"`).
- standardise
Logical; standardise the plug-in field per site (default `TRUE`).
- priors
Log-normal priors, see [default_kernel_priors()].
- start
Named numeric vector of length 4 giving starting values on the natural scale (`length_scale`, `periodic_scale`, `long_term_scale`, `nugget_ratio`).
- n_sites
Optional integer. If supplied and smaller than the number of sites, the hyperparameters are estimated from a random subsample of this many sites. The kernel hyperparameters are shared, population-level quantities, so a representative site subsample estimates the same length scales at a fraction of the \(O(n^3)\) cost – useful for very large site counts. Default `NULL` uses all sites. The subsample is drawn from the current RNG state, so set a seed beforehand (e.g. [set.seed()]) for a reproducible estimate. Note: this subsamples *sites* only, not time points (the temporal kernel needs the full series to resolve the periodic and long-term scales).
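Examples
A minimal sketch of preparing the inputs described above, using simulated counts. It assumes `t` is measured in weeks since a reference date (so `period = 52` stands for an annual cycle) and that `nt` is the number of distinct time points; both are illustrative assumptions. The call itself is guarded so the block runs even when the package is not attached, and the structure of the returned object is not shown because it is not documented here.

```r
set.seed(42)
n_sites_total <- 6
# Uneven spacing: the gaps (2 -> 4, 5 -> 8) are treated as real elapsed time.
weeks <- c(0, 1, 2, 4, 5, 8)

obs_data <- expand.grid(id = seq_len(n_sites_total), t = weeks)
obs_data$y_obs <- rpois(nrow(obs_data), lambda = 5)  # simulated counts

# One row per site, ordered to match sort(unique(obs_data$id)).
coordinates <- data.frame(
  lon = runif(n_sites_total, -1, 1),
  lat = runif(n_sites_total, 50, 52)
)

nt <- length(unique(obs_data$t))  # assumption: nt counts distinct time points

if (exists("infer_kernel_params", mode = "function")) {
  set.seed(1)  # makes the random site subsample reproducible
  fit <- infer_kernel_params(
    obs_data, coordinates,
    nt = nt, period = 52,
    n_sites = 4  # estimate from 4 of the 6 sites
  )
}
```

Any later call to [gp_predict()] would need `t` on the same weeks-since-reference scale.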