class: inverse, middle, left, my-title-slide, title-slide .title[ # Time series modeling with Bayesian Dynamic Generalized Additive Models ] .author[ ### Nicholas Clark ] .institute[ ### School of Veterinary Science, University of Queensland, Australia ] .date[ ### Wednesday 13th December, 2023 ] --- class: middle center class: animated fadeIn <body><div id="pan"></div></body> --- class: animated fadeIn background-image: url('./resources/smooth_only.gif') ## GAMs use splines `\((f(x))\)` ... --- background-image: url('./resources/basis_weights.gif') ## ...made of weighted basis functions --- background-image: url('./resources/smooth_to_data.gif') ## Penalize `\(f"(x)\)` to learn weights --- class: full-size ## Easy to fit in
<img align="center" width="306" height="464" src="resources/gam_book.jpg" style="float: right; margin: 10px;"> `$$\mathbb{E}(\boldsymbol{Y_t}|\boldsymbol{X_t})=g^{-1}(\alpha + \sum_{j=1}^{J}f(x_{jt}))$$` <br/> Where: - `\(g^{-1}\)` is the inverse of the link function - `\({\alpha}\)` is the intercept - `\(f(x)\)` are potentially nonlinear functions of the `\(J\)` predictors --- background-image: url('./ASC_talk_slidedeck_files/figure-html/complexity-1.svg') --- ## What's the catch? <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-1-1.svg" style="display: block; margin: auto;" /> --- ## A spline of `time` ```r library(mgcv) *model <- gam(y ~ s(time, k = 20, bs = 'bs', m = 2), data = data, family = gaussian()) ``` A B-spline (`bs = 'bs'`) with `m = 2` sets the penalty on the second derivative --- ## Hindcasts
<img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-3-1.svg" style="display: block; margin: auto;" /> ``` ## No non-missing values in test_observations; cannot calculate forecast score ``` --- ## Basis functions ⇨ local knowledge <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-4-1.svg" style="display: block; margin: auto;" /> --- ## Basis functions ⇨ local knowledge <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-5-1.svg" style="display: block; margin: auto;" /> --- ## Forecasts
<img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" /> --- ## We need *global* knowledge <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" /> --- ## We need *global* knowledge <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-8-1.svg" style="display: block; margin: auto;" /> --- ## Dynamic GAMs `$$\mathbb{E}(\boldsymbol{Y_t}|\boldsymbol{X_t})=g^{-1}(\alpha + \sum_{j=1}^{J}f(x_{jt}) + z_t)$$` <br/> Where: - `\(g^{-1}\)` is the inverse of the link function - `\({\alpha}\)` is the intercept - `\(f(x)\)` are potentially nonlinear functions of the `\(J\)` predictors - `\(z_t\)` is a .emphasize[*latent dynamic process*] --- ## Modelling with the [`mvgam` 📦](https://github.com/nicholasjclark/mvgam/tree/master) Bayesian framework to fit Dynamic GLMs and Dynamic GAMs - Hierarchical intercepts, slopes and smooths - Latent dynamic processes - State-Space models with measurement error Built off the [`mgcv` 📦](https://cran.r-project.org/web/packages/mgcv/index.html) to construct penalized smoothing splines Familiar
formula interface Uni- or multivariate series from a range of response distributions Uses [Stan](https://mc-stan.org/) for ADVI, Laplace or full Hamiltonian Monte Carlo --- class: middle center ### We can fit models that include random effects, nonlinear effects, time-varying effects and complex multidimensional smooth functions. All these effects can operate .emphasize[*on both process and observation models*] <br> ### Can incorporate unobserved temporal dynamics; no need to regress the outcome on past values or resort to transformations <br> ### What kinds of dynamic processes are available in the `mvgam` 📦? --- ## Piecewise linear... <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" /> --- ## ...or logistic with upper saturation <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" /> --- ## RW or ARMA(p = 1-3, q = 0-1) <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-11-1.svg" style="display: block; margin: auto;" /> --- ## Gaussian Process... <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-12-1.svg" style="display: block; margin: auto;" /> --- ## ...where length scale ⇨ *memory* <img src="ASC_talk_slidedeck_files/figure-html/unnamed-chunk-13-1.svg" style="display: block; margin: auto;" /> --- background-image: url('./resources/VAR.svg') background-size: contain ## VAR1 ⇨ Granger causality --- background-image: url('./resources/df_with_series.gif') ## Factors ⇨ induced correlations --- ## Example of the interface ```r model <- mvgam( formula = y ~ s(series, bs = 're') + s(x0, series, bs = 're') + x1 + gp(x2) + te(x3, x4, bs = c('cr', 'tp')), data = data, family = poisson(), trend_model = AR(p = 1, ma = TRUE, cor = TRUE), algorithm = 'sampling', burnin = 500, samples = 500, chains = 4, parallel = TRUE) ``` --- ## Produce all `Stan` code and objects ```r code(model) ``` .small[ ``` ## // Stan model code generated by package mvgam ## functions { ## /* Spectral density function of a Gaussian process ## * with squared exponential covariance kernel ## * Args: ## * l_gp: numeric eigenvalues of an SPD GP ## * alpha_gp: marginal SD parameter ## * rho_gp: length-scale parameter ## * Returns: ## * numeric values of the GP function evaluated at l_gp ## */ ## vector spd_cov_exp_quad(data vector l_gp, real alpha_gp, real rho_gp) { ## int NB = size(l_gp); ## vector[NB] out; ## real constant = square(alpha_gp) * (sqrt(2 * pi()) * rho_gp); ## real neg_half_lscale2 = -0.5 * square(rho_gp); ## for (m in 1 : NB) { ## out[m] = constant * exp(neg_half_lscale2 * square(l_gp[m])); ## } ## return out; ## } ## } ## data { ## int<lower=0> total_obs; // total number of observations ## int<lower=0> n; // number of timepoints per series ## int<lower=0> n_sp; // number of smoothing parameters ## int<lower=0> n_series; // number of series ## int<lower=0> num_basis; // total number of basis coefficients ## vector[num_basis] zero; // prior locations for basis coefficients ## matrix[total_obs, num_basis] X; // mgcv GAM design matrix ## array[n, n_series] int<lower=0> ytimes; // time-ordered matrix (which col in X belongs to each [time, series] observation?) ## int<lower=1> k_gp_x2_; // basis functions for approximate gp ## vector[k_gp_x2_] l_gp_x2_; // approximate gp eigenvalues ## array[10] int b_idx_gp_x2_; // gp basis coefficient indices ## matrix[24, 72] S2; // mgcv smooth penalty matrix S2 ## int<lower=0> n_nonmissing; // number of nonmissing observations ## array[n_nonmissing] int<lower=0> flat_ys; // flattened nonmissing observations ## matrix[n_nonmissing, num_basis] flat_xs; // X values for nonmissing observations ## array[n_nonmissing] int<lower=0> obs_ind; // indices of nonmissing observations ## } ## transformed data { ## vector[n_series] trend_zeros = rep_vector(0.0, n_series); ## } ## parameters { ## // raw basis coefficients ## vector[num_basis] b_raw; ## ## // gp term sd parameters ## real<lower=0> alpha_gp_x2_; ## ## // gp term length scale parameters ## real<lower=0> rho_gp_x2_; ## ## // gp term latent variables ## vector[k_gp_x2_] z_gp_x2_; ## ## // random effect variances ## vector<lower=0>[2] sigma_raw; ## ## // random effect means ## vector[2] mu_raw; ## ## // latent trend AR1 terms ## vector<lower=-1.5, upper=1.5>[n_series] ar1; ## ## // latent trend variance parameters ## vector<lower=0>[n_series] sigma; ## cholesky_factor_corr[n_series] L_Omega; ## ## // ma coefficients ## matrix<lower=-1, upper=1>[n_series, n_series] theta; ## ## // dynamic error parameters ## array[n] vector[n_series] error; ## ## // smoothing parameters ## vector<lower=0>[n_sp] lambda; ## } ## transformed parameters { ## array[n] vector[n_series] trend_raw; ## matrix[n, n_series] trend; ## array[n] vector[n_series] epsilon; ## ## // LKJ form of covariance matrix ## matrix[n_series, n_series] L_Sigma; ## ## // computed error covariance matrix ## cov_matrix[n_series] Sigma; ## ## // basis coefficients ## vector[num_basis] b; ## b[1 : 36] = b_raw[1 : 36]; ## b[37 : 40] = mu_raw[1] + b_raw[37 : 40] * sigma_raw[1]; ## b[41 : 44] = mu_raw[2] + b_raw[41 : 44] * sigma_raw[2]; ## ## // derived latent states ## trend_raw[1] = error[1]; ## epsilon[1] = error[1]; ## for (i in 2 : n) { ## // lagged error ma process ## epsilon[i] = theta * error[i - 1]; ## ## // full ARMA process ## trend_raw[i] = ar1 .* trend_raw[i - 1] + epsilon[i] + error[i]; ## } ## b[b_idx_gp_x2_] = sqrt(spd_cov_exp_quad(l_gp_x2_, alpha_gp_x2_, rho_gp_x2_)) ## .* z_gp_x2_; ## L_Sigma = diag_pre_multiply(sigma, L_Omega); ## Sigma = multiply_lower_tri_self_transpose(L_Sigma); ## for (i in 1 : n) { ## trend[i, 1 : n_series] = to_row_vector(trend_raw[i]); ## } ## } ## model { ## // prior for random effect population variances ## sigma_raw ~ student_t(3, 0, 2.5); ## ## // prior for random effect population means ## mu_raw ~ std_normal(); ## ## // prior for (Intercept)... ## b_raw[1] ~ student_t(3, 0.7, 2.5); ## ## // prior for x1B... ## b_raw[2] ~ student_t(3, 0, 2); ## ## // prior for gp(x2)... ## z_gp_x2_ ~ std_normal(); ## alpha_gp_x2_ ~ student_t(3, 0, 3); ## rho_gp_x2_ ~ inv_gamma(1.494197, 0.056607); ## b_raw[b_idx_gp_x2_] ~ std_normal(); ## ## // prior for te(x3,x4)... ## b_raw[13 : 36] ~ multi_normal_prec(zero[13 : 36], ## S2[1 : 24, 1 : 24] * lambda[3] ## + S2[1 : 24, 25 : 48] * lambda[4] ## + S2[1 : 24, 49 : 72] * lambda[5]); ## ## // prior (non-centred) for s(series)... ## b_raw[37 : 40] ~ std_normal(); ## ## // prior (non-centred) for s(x0,series)... ## b_raw[41 : 44] ~ std_normal(); ## ## // priors for AR parameters ## ar1 ~ std_normal(); ## ## // priors for smoothing parameters ## lambda ~ normal(5, 30); ## ## // priors for latent trend variance parameters ## sigma ~ student_t(3, 0, 2.5); ## ## // contemporaneous errors ## L_Omega ~ lkj_corr_cholesky(2); ## for (i in 1 : n) { ## error[i] ~ multi_normal_cholesky(trend_zeros, L_Sigma); ## } ## ## // ma coefficients ## for (i in 1 : n_series) { ## for (j in 1 : n_series) { ## if (i != j) ## theta[i, j] ~ normal(0, 0.2); ## } ## } ## { ## // likelihood functions ## vector[n_nonmissing] flat_trends; ## flat_trends = to_vector(trend)[obs_ind]; ## flat_ys ~ poisson_log_glm(append_col(flat_xs, flat_trends), 0.0, ## append_row(b, 1.0)); ## } ## } ## generated quantities { ## vector[total_obs] eta; ## matrix[n, n_series] mus; ## vector[n_sp] rho; ## vector[n_series] tau; ## array[n, n_series] int ypred; ## rho = log(lambda); ## for (s in 1 : n_series) { ## tau[s] = pow(sigma[s], -2.0); ## } ## ## // posterior predictions ## eta = X * b; ## for (s in 1 : n_series) { ## mus[1 : n, s] = eta[ytimes[1 : n, s]] + trend[1 : n, s]; ## ypred[1 : n, s] = poisson_log_rng(mus[1 : n, s]); ## } ## } ``` ] --- ## Workflow in `mvgam` 📦 Fit models with splines, GPs, and multivariate dynamic processes to sets of time series; use informative priors for effective regularization Use posterior predictive checks and Randomized Quantile residuals to assess model failures Use `marginaleffects` 📦 to generate interpretable (and reportable) model predictions Produce probabilistic forecasts Evaluate forecasts from using proper scoring rules --- ## More resources Cheatsheet ⇨ [Overview of `mvgam`](https://github.com/nicholasjclark/mvgam/raw/master/misc/mvgam_cheatsheet.pdf) Vignette ⇨ [Introduction to the package](https://nicholasjclark.github.io/mvgam/articles/mvgam_overview.html) Vignette ⇨ [Shared latent process models](https://nicholasjclark.github.io/mvgam/articles/shared_states.html) Vignette ⇨ [Time-varying effects](https://nicholasjclark.github.io/mvgam/articles/time_varying_effects.html) Vignette ⇨ [Multivariate State-Space models](https://nicholasjclark.github.io/mvgam/articles/trend_formulas.html) Motivating publication ⇨ Clark & Wells 2023 [*Methods in Ecology and Evolution*](https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13974) --- ## Relevant links [`mvgam` 📦 website](https://nicholasjclark.github.io/mvgam/) [
nicholasjclark](https://github.com/nicholasjclark) [
slides for this talk](https://github.com/nicholasjclark/ASC_talk) [
personal website](https://researchers.uq.edu.au/researcher/15140)
n.clark@uq.edu.au