Ecological forecasting with dynamic GAMs

.title[
# Ecological forecasting with dynamic GAMs
]
.author[
### Nicholas Clark
]
.institute[
### School of Veterinary Science, University of Queensland
]
.date[
### 1200–1300 Thursday 12th October, 2023
]

---

## About me
<img src="resources/NicholasClark.jpg" style="position:fixed; right:8%; top:9%; width:239px; height:326px; border:none;" />

ARC Discovery Early Career Fellow

The University of Queensland
- School of Veterinary Science 
- Located in Gatton, Australia

Interested in:
- Quantitative ecology
- Molecular genetics
- Multivariate time series modelling

---

class: middle center
### “Because all decision making is based on what will happen in the future, either under the status quo or different decision alternatives, decision making ultimately depends on forecasts”
  
[Dietze et al. 2018](https://ecoforecast.org/about/)

---

background-image: url('./resources/big_data.gif')
background-size: contain
background-color: #F2F2F2

---

### Can we use common time series models in ecology?

---

## *Very* easy to apply in <svg aria-hidden="true" role="img" viewBox="0 0 581 512" style="height:1em;width:1.13em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:steelblue;overflow:visible;position:relative;"><path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"/></svg>

Hyndman’s tools in the [`forecast` 📦](https://pkg.robjhyndman.com/forecast/) are hugely popular and accessible for time series analysis / forecasting 
  
[ETS](https://pkg.robjhyndman.com/forecast/reference/ets.html) handles many types of seasonality and nonlinear trends 
  
[Regression with ARIMA errors](https://pkg.robjhyndman.com/forecast/reference/auto.arima.html) includes additive effects of predictors while capturing trends and seasonality

*Some* of these algorithms can handle missing data

*All* are fast to fit and forecast, but assume observations are *Gaussian*
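
To make the Gaussian assumption concrete, here is a minimal sketch using base R's `arima()` as a stand-in for the `forecast` tools. The low-count series is simulated for illustration only and is not data from this talk.

```r
# Simulated low-count series (hypothetical data)
set.seed(1)
y <- rpois(60, lambda = 0.5)

# AR(1) with Gaussian errors, fitted with base R rather than `forecast`
fit <- arima(y, order = c(1, 0, 0))

# 5-step-ahead forecasts with approximate 95% intervals
fc <- predict(fit, n.ahead = 5)
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

# Gaussian intervals are symmetric and unbounded, so the lower limit
# can fall below zero even though counts cannot
print(round(cbind(mean = fc$pred, lower = lower, upper = upper), 2))
```

Forecast intervals that include negative values are one symptom of applying a Gaussian model to counts.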

---

class: middle center
### But most real-world ecological observations, including time series, *are not Gaussian*

---

<div class="figure" style="text-align: center">
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-1-1.svg" alt="Properties of lunar monthly Desert Pocket Mouse capture time series from a long-term monitoring study in Portal, Arizona, USA"  />
<p class="caption">Properties of lunar monthly Desert Pocket Mouse capture time series from a long-term monitoring study in Portal, Arizona, USA</p>
</div>

---

<div class="figure" style="text-align: center">
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-2-1.svg" alt="Properties of annual American kestrel abundance time series in British Columbia, Canada"  />
<p class="caption">Properties of annual American kestrel abundance time series in British Columbia, Canada</p>
</div>

---

class: middle center
### “If our data contains small counts (0,1,2,...), then we need to use forecasting methods that are more appropriate for a sample space of non-negative integers. 
<br>
### *Such models are beyond the scope of this book*”
  
[Hyndman and Athanasopoulos, Forecasting Principles and Practice](https://otexts.com/fpp3/counts.html)

---
class: black-inverse
.center[.grey[.big[Ok. So now what?]]]
<img src="resources/now_what.gif" style="position:fixed; right:10%; top:20%; width:960px; height:408px; border:none;"/>

---

# Poisson GLM for counts?
A Poisson GLM models the conditional mean with a `\(\log\)` link
<br/>
<br/>
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Poisson}(\lambda_t) \\
\log(\lambda_t) & = \boldsymbol{X}_t \beta \\
& = \color{darkred}{\alpha + \beta_1 \boldsymbol{x}_{1t} + \beta_2 \boldsymbol{x}_{2t} + \cdots + \beta_j \boldsymbol{x}_{jt}}
\end{align*}`

The .emphasize[*linear predictor component can be hugely flexible*], as we will see in a moment
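
As a rough illustration of the model above, the sketch below simulates counts from a log-linear Poisson process and recovers the coefficients with base R's `glm()`. The covariates `x1` and `x2` and all coefficient values are made up for the example.

```r
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)

# True parameters (arbitrary values chosen for illustration)
alpha <- 0.5
beta1 <- 0.8
beta2 <- -0.4

# Linear predictor on the log scale, then Poisson observations
lambda <- exp(alpha + beta1 * x1 + beta2 * x2)
y <- rpois(n, lambda)

# Poisson GLM with a log link
fit <- glm(y ~ x1 + x2, family = poisson(link = "log"))
print(round(coef(fit), 2))

# Predictions on the response scale respect the lower bound of zero
preds <- predict(fit, type = "response")
print(range(preds))
```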

---

### GLMs allow us to build models that respect the bounds and distributions of our observed data
<br>
### They traditionally assume the appropriately transformed mean response depends *linearly* on the predictors
<br>
### But there are many other properties we'd like to model

---

## Properties of ecological series
Temporal autocorrelation

Lagged effects

Non-Gaussian data and missing observations

Measurement error

Time-varying effects

Nonlinearities

Multi-series clustering

---

## Properties of ecological series
.grey[Temporal autocorrelation

Lagged effects

Non-Gaussian data and missing observations

Measurement error

Time-varying effects]

---
class: animated fadeIn
<body><div id="pan"></div></body>

---
background-image: url('./resources/smooth_only.gif')
## GAMs use splines ...

---

background-image: url('resources/basis-functions-1.svg')
## ... made of basis functions

---

background-image: url('resources/basis-functions-weights-1.svg')
## Weighting basis functions ...

---

background-image: url('./resources/basis_weights.gif')
## ... gives a spline `\((f(x))\)`

---

background-image: url('./resources/penalty_spline.gif')
background-size: contain
## Penalize `\(f''(x)\)` to learn weights
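
The basis-function idea can be sketched in a few lines of base R. This is not how `mgcv` or `mvgam` build their smooths internally: it uses `splines::bs()` for the basis and a second-order difference penalty on adjacent weights as a discrete stand-in for a derivative penalty. The data and the two `lambda` values are arbitrary.

```r
library(splines)

set.seed(10)
x <- seq(0, 1, length.out = 100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)

# Basis functions: each column of B is one B-spline basis function
B <- bs(x, df = 15, intercept = TRUE)

# Second-order difference penalty on adjacent weights
D <- diff(diag(ncol(B)), differences = 2)

# Penalized least squares: larger lambda gives a smoother function
fit_spline <- function(lambda) {
  w <- solve(crossprod(B) + lambda * crossprod(D), crossprod(B, y))
  drop(B %*% w)   # the spline f(x) is the weighted sum of basis functions
}

f_wiggly <- fit_spline(lambda = 1e-4)
f_smooth <- fit_spline(lambda = 10)

# Roughness measured as the sum of squared second differences of f(x)
roughness <- function(f) sum(diff(f, differences = 2)^2)
print(c(wiggly = roughness(f_wiggly), smooth = roughness(f_smooth)))
```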

---

class: middle center
### GAMs are just fancy GLMs, where some (or all) of the predictor effects are estimated as (possibly nonlinear) smooth functions
<br>
### But the complexity they can handle is *enormous*

---

## Modelling with the [`mvgam` 📦](https://github.com/nicholasjclark/mvgam/tree/master)

Bayesian framework to fit Dynamic GLMs and Dynamic GAMs
- Hierarchical intercepts, slopes *and smooths*
- Latent dynamic processes
- State Space models with measurement error

Built off the [`mgcv` 📦](https://cran.r-project.org/web/packages/mgcv/index.html) to construct penalized smoothing splines

Convenient and familiar <svg aria-hidden="true" role="img" viewBox="0 0 581 512" style="height:1em;width:1.13em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:steelblue;overflow:visible;position:relative;"><path d="M581 226.6C581 119.1 450.9 32 290.5 32S0 119.1 0 226.6C0 322.4 103.3 402 239.4 418.1V480h99.1v-61.5c24.3-2.7 47.6-7.4 69.4-13.9L448 480h112l-67.4-113.7c54.5-35.4 88.4-84.9 88.4-139.7zm-466.8 14.5c0-73.5 98.9-133 220.8-133s211.9 40.7 211.9 133c0 50.1-26.5 85-70.3 106.4-2.4-1.6-4.7-2.9-6.4-3.7-10.2-5.2-27.8-10.5-27.8-10.5s86.6-6.4 86.6-92.7-90.6-87.9-90.6-87.9h-199V361c-74.1-21.5-125.2-67.1-125.2-119.9zm225.1 38.3v-55.6c57.8 0 87.8-6.8 87.8 27.3 0 36.5-38.2 28.3-87.8 28.3zm-.9 72.5H365c10.8 0 18.9 11.7 24 19.2-16.1 1.9-33 2.8-50.6 2.9v-22.1z"/></svg> formula interface

Uni- or multivariate series from a range of response distributions

Uses [Stan](https://mc-stan.org/) for efficient Hamiltonian Monte Carlo sampling

---

## Example of the interface

```r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') +
    x1 +
    s(x2, bs = 'tp', k = 5) +
    te(x3, x4, bs = c('cr', 'tp')),
  data = data,
  family = poisson(),
  trend_model = 'AR1',
  burnin = 500,
  samples = 500,
  chains = 4,
  parallel = TRUE
  )
```

---
## Typical formula syntax

```r
model <- mvgam(
* formula = y ~
*   s(series, bs = 're') +
*   s(x0, series, bs = 're') +
*   x1 +
*   s(x2, bs = 'tp', k = 5) +
*   te(x3, x4, bs = c('cr', 'tp')),
  data = data,
  family = poisson(),
  trend_model = 'AR1',
  burnin = 500,
  samples = 500,
  chains = 4,
  parallel = TRUE
  )
```

---

## Data and response distribution

```r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
* data = data,
* family = poisson(),
  trend_model = 'AR1',
  burnin = 500,
  samples = 500,
  chains = 4,
  parallel = TRUE
  )
```

---

## <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:darkred;overflow:visible;position:relative;"><path d="M37.6 4.2C28-2.3 15.2-1.1 7 7s-9.4 21-2.8 30.5l112 163.3L16.6 233.2C6.7 236.4 0 245.6 0 256s6.7 19.6 16.6 22.8l103.1 33.4L66.8 412.8c-4.9 9.3-3.2 20.7 4.3 28.1s18.8 9.2 28.1 4.3l100.6-52.9 33.4 103.1c3.2 9.9 12.4 16.6 22.8 16.6s19.6-6.7 22.8-16.6l33.4-103.1 100.6 52.9c9.3 4.9 20.7 3.2 28.1-4.3s9.2-18.8 4.3-28.1L392.3 312.2l103.1-33.4c9.9-3.2 16.6-12.4 16.6-22.8s-6.7-19.6-16.6-22.8L388.9 198.7l25.7-70.4c3.2-8.8 1-18.6-5.6-25.2s-16.4-8.8-25.2-5.6l-70.4 25.7L278.8 16.6C275.6 6.7 266.4 0 256 0s-19.6 6.7-22.8 16.6l-32.3 99.6L37.6 4.2z"/></svg> latent dynamics

```r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(), 
* trend_model = 'AR1',
  burnin = 500,
  samples = 500,
  chains = 4,
  parallel = TRUE
  )
```

---

## Sampler parameters

```r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(), 
  trend_model = 'AR1', 
* burnin = 500,
* samples = 500,
* chains = 4,
* parallel = TRUE
  )
```

---

## Example: simulated data

---

## A spline of `time`

```r
library(mvgam)
model <- mvgam(y ~ 
*                s(time, k = 20, bs = 'bs', m = 2),
                data = data_train,
*               newdata = data_test,
                family = gaussian())
```

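A B-spline (`bs = 'bs'`); per the `mgcv` documentation, a single `m = 2` requests a quadratic basis with a first-derivative penalty, whereas `m = c(3, 2)` gives a cubic basis penalized on the second derivative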

Use the `newdata` argument to generate automatic probabilistic forecasts
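
For readers without Stan installed, the extrapolation behaviour of this smooth can be explored with `mgcv` directly. The sketch below is an assumption-laden stand-in, not the `mvgam` workflow: the series is simulated, the 60/15 split into `data_train` and `data_test` is arbitrary, and `mgcv::gam()` returns point predictions with standard errors rather than full posterior forecasts.

```r
library(mgcv)

set.seed(123)
n <- 75
dat <- data.frame(time = 1:n)
dat$y <- sin(dat$time / 8) + rnorm(n, sd = 0.25)

# Hold out the final 15 time points, mirroring data_train / data_test
data_train <- dat[dat$time <= 60, ]
data_test  <- dat[dat$time > 60, ]

# Same smooth term as in the mvgam call, fitted by mgcv instead
fit <- gam(y ~ s(time, k = 20, bs = "bs", m = 2),
           data = data_train, family = gaussian(), method = "REML")

# Extrapolate beyond the training window
fc <- predict(fit, newdata = data_test, se.fit = TRUE)
print(round(head(cbind(time = data_test$time,
                       fit = fc$fit,
                       se = fc$se.fit)), 2))
```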

---

## Hindcasts <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M464 256A208 208 0 1 0 48 256a208 208 0 1 0 416 0zM0 256a256 256 0 1 1 512 0A256 256 0 1 1 0 256zm177.6 62.1C192.8 334.5 218.8 352 256 352s63.2-17.5 78.4-33.9c9-9.7 24.2-10.4 33.9-1.4s10.4 24.2 1.4 33.9c-22 23.8-60 49.4-113.6 49.4s-91.7-25.5-113.6-49.4c-9-9.7-8.4-24.9 1.4-33.9s24.9-8.4 33.9 1.4zm40-89.3l0 0 0 0-.2-.2c-.2-.2-.4-.5-.7-.9c-.6-.8-1.6-2-2.8-3.4c-2.5-2.8-6-6.6-10.2-10.3c-8.8-7.8-18.8-14-27.7-14s-18.9 6.2-27.7 14c-4.2 3.7-7.7 7.5-10.2 10.3c-1.2 1.4-2.2 2.6-2.8 3.4c-.3 .4-.6 .7-.7 .9l-.2 .2 0 0 0 0 0 0c-2.1 2.8-5.7 3.9-8.9 2.8s-5.5-4.1-5.5-7.6c0-17.9 6.7-35.6 16.6-48.8c9.8-13 23.9-23.2 39.4-23.2s29.6 10.2 39.4 23.2c9.9 13.2 16.6 30.9 16.6 48.8c0 3.4-2.2 6.5-5.5 7.6s-6.9 0-8.9-2.8l0 0 0 0zm160 0l0 0-.2-.2c-.2-.2-.4-.5-.7-.9c-.6-.8-1.6-2-2.8-3.4c-2.5-2.8-6-6.6-10.2-10.3c-8.8-7.8-18.8-14-27.7-14s-18.9 6.2-27.7 14c-4.2 3.7-7.7 7.5-10.2 10.3c-1.2 1.4-2.2 2.6-2.8 3.4c-.3 .4-.6 .7-.7 .9l-.2 .2 0 0 0 0 0 0c-2.1 2.8-5.7 3.9-8.9 2.8s-5.5-4.1-5.5-7.6c0-17.9 6.7-35.6 16.6-48.8c9.8-13 23.9-23.2 39.4-23.2s29.6 10.2 39.4 23.2c9.9 13.2 16.6 30.9 16.6 48.8c0 3.4-2.2 6.5-5.5 7.6s-6.9 0-8.9-2.8l0 0 0 0 0 0z"/></svg>


---

## Extrapolate 2-steps ahead <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M464 256A208 208 0 1 0 48 256a208 208 0 1 0 416 0zM0 256a256 256 0 1 1 512 0A256 256 0 1 1 0 256zm177.6 62.1C192.8 334.5 218.8 352 256 352s63.2-17.5 78.4-33.9c9-9.7 24.2-10.4 33.9-1.4s10.4 24.2 1.4 33.9c-22 23.8-60 49.4-113.6 49.4s-91.7-25.5-113.6-49.4c-9-9.7-8.4-24.9 1.4-33.9s24.9-8.4 33.9 1.4zM144.4 208a32 32 0 1 1 64 0 32 32 0 1 1 -64 0zm192-32a32 32 0 1 1 0 64 32 32 0 1 1 0-64z"/></svg>

---

## 5-steps ahead <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M464 256A208 208 0 1 0 48 256a208 208 0 1 0 416 0zM0 256a256 256 0 1 1 512 0A256 256 0 1 1 0 256zM174.6 384.1c-4.5 12.5-18.2 18.9-30.7 14.4s-18.9-18.2-14.4-30.7C146.9 319.4 198.9 288 256 288s109.1 31.4 126.6 79.9c4.5 12.5-2 26.2-14.4 30.7s-26.2-2-30.7-14.4C328.2 358.5 297.2 336 256 336s-72.2 22.5-81.4 48.1zM144.4 208a32 32 0 1 1 64 0 32 32 0 1 1 -64 0zm192-32a32 32 0 1 1 0 64 32 32 0 1 1 0-64z"/></svg>

---

## 20-steps ahead <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M175.9 448c-35-.1-65.5-22.6-76-54.6C67.6 356.8 48 308.7 48 256C48 141.1 141.1 48 256 48s208 93.1 208 208s-93.1 208-208 208c-28.4 0-55.5-5.7-80.1-16zM0 256a256 256 0 1 0 512 0A256 256 0 1 0 0 256zM128 369c0 26 21.5 47 48 47s48-21 48-47c0-20-28.4-60.4-41.6-77.7c-3.2-4.4-9.6-4.4-12.8 0C156.6 308.6 128 349 128 369zm128-65c-13.3 0-24 10.7-24 24s10.7 24 24 24c30.7 0 58.7 11.5 80 30.6c9.9 8.8 25 8 33.9-1.9s8-25-1.9-33.9C338.3 320.2 299 304 256 304zm47.6-96a32 32 0 1 0 64 0 32 32 0 1 0 -64 0zm-128 32a32 32 0 1 0 0-64 32 32 0 1 0 0 64z"/></svg>

---
## Forecasts <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M400 406.1V288c0-13.3-10.7-24-24-24s-24 10.7-24 24V440.6c-28.7 15-61.4 23.4-96 23.4s-67.3-8.5-96-23.4V288c0-13.3-10.7-24-24-24s-24 10.7-24 24V406.1C72.6 368.2 48 315 48 256C48 141.1 141.1 48 256 48s208 93.1 208 208c0 59-24.6 112.2-64 150.1zM256 512A256 256 0 1 0 256 0a256 256 0 1 0 0 512zM159.6 220c10.6 0 19.9 3.8 25.4 9.7c7.6 8.1 20.2 8.5 28.3 .9s8.5-20.2 .9-28.3C199.7 186.8 179 180 159.6 180s-40.1 6.8-54.6 22.3c-7.6 8.1-7.1 20.7 .9 28.3s20.7 7.1 28.3-.9c5.5-5.8 14.8-9.7 25.4-9.7zm166.6 9.7c5.5-5.8 14.8-9.7 25.4-9.7s19.9 3.8 25.4 9.7c7.6 8.1 20.2 8.5 28.3 .9s8.5-20.2 .9-28.3C391.7 186.8 371 180 351.6 180s-40.1 6.8-54.6 22.3c-7.6 8.1-7.1 20.7 .9 28.3s20.7 7.1 28.3-.9zM208 320v32c0 26.5 21.5 48 48 48s48-21.5 48-48V320c0-26.5-21.5-48-48-48s-48 21.5-48 48z"/></svg>

---

background-image: url('resources/basis-functions-weights-1.svg')
## Basis functions &#8680; local knowledge

---

# Latent dynamic processes

---

## Latent dynamics in `mvgam` 📦
State-Space models where a latent process evolves to capture unobserved temporal dynamics
- RW
- AR (1-3)
- Gaussian Process (squared exponential kernel)
- Multivariate processes

Can estimate effects of predictors (including splines and random effects) in .emphasize[*both the observation and latent process models*]
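
A minimal simulation of one such State-Space model, assuming an AR1 latent process and Poisson observations. All parameter values (`phi`, `sigma`, `alpha`) are arbitrary, and this only generates data; it does not fit anything.

```r
set.seed(2023)
n_time <- 100
phi    <- 0.8   # AR1 coefficient (arbitrary)
sigma  <- 0.3   # process error SD (arbitrary)
alpha  <- 1     # intercept on the log scale (arbitrary)

# Latent process: z_t = phi * z_{t-1} + error_t
z <- numeric(n_time)
z[1] <- rnorm(1, 0, sigma)
for (t in 2:n_time) {
  z[t] <- phi * z[t - 1] + rnorm(1, 0, sigma)
}

# Observation process: counts depend on the latent state via a log link
y <- rpois(n_time, lambda = exp(alpha + z))

# Lag-1 autocorrelation of the latent state and of the observed counts
print(round(c(lag1_latent = acf(z, plot = FALSE)$acf[2],
              lag1_counts = acf(y, plot = FALSE)$acf[2]), 2))
```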
---

# Dynamic Poisson GLM
A dynamic Poisson GLM can use .emphasize[*dynamic latent residuals*]
<br/>
<br/>
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Poisson}(\lambda_t) \\
\log(\lambda_t) & = \alpha + \cdots + z_t \\
\boldsymbol{z} & \sim \text{MVNormal}(0, \Sigma) \\
\Sigma_{t_i, t_j} & = \alpha^2 \exp\left(-0.5 \left(|t_i - t_j| / \rho\right)^2\right)
\end{align*}`

Where: 
- `\(\alpha\)` controls the marginal variability (magnitude) of the function
- `\(\rho\)` controls how correlations decay as a function of time lag
- `\(\Sigma\)` is the covariance matrix, built here from a squared exponential kernel
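
The kernel can be written directly as a small R function. The helper name `sq_exp_kernel` and the values of `alpha` and `rho` are illustrative assumptions, not part of `mvgam`.

```r
# Squared exponential covariance between all pairs of time points
sq_exp_kernel <- function(times, alpha, rho) {
  lags <- abs(outer(times, times, "-"))
  alpha^2 * exp(-0.5 * (lags / rho)^2)
}

times <- 1:50
Sigma <- sq_exp_kernel(times, alpha = 1.5, rho = 5)

# Draw one realisation z ~ MVNormal(0, Sigma); a small jitter on the
# diagonal keeps the Cholesky factorisation numerically stable
set.seed(7)
L <- chol(Sigma + diag(1e-6, length(times)))
z <- drop(crossprod(L, rnorm(length(times))))

# Covariance with time point 1 decays as the lag grows (lags 0, 1, 5, 10)
print(round(Sigma[1, c(1, 2, 6, 11)], 3))
```

Larger values of `rho` make the covariance decay more slowly with lag, giving smoother latent functions.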

---

## Dynamics &#8680; *global knowledge*
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-15-1.svg" style="display: block; margin: auto;" />

---

background-image: url('./resources/pp_image.jpg')
background-size: cover
background-color: #77654E
---

<div class="figure" style="text-align: center">
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-16-1.svg" alt="Properties of Merriam's kangaroo rat relative abundance time series from a long-term monitoring study in Portal, Arizona, USA"  />
<p class="caption">Properties of Merriam's kangaroo rat relative abundance time series from a long-term monitoring study in Portal, Arizona, USA</p>
</div>

---

## Dynamic Beta GAM

```r
mod_beta <- mvgam(relabund ~ 
*                   te(mintemp, ndvi),
*                 trend_model = 'AR3',
                  family = betar(), 
                  data = dm_data)
```

Beta regression using the `mgcv` 📦's `betar` family

AR3 dynamic trend model

Multidimensional [tensor product smooth function for nonlinear covariate interactions (using `te`)](https://fromthebottomoftheheap.net/2015/11/21/climate-change-and-spline-interactions/)
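
A static version of this model can be tried in `mgcv` alone. The sketch below drops the AR3 trend, so it is not equivalent to `mod_beta`, and it uses simulated covariates that only borrow the names `mintemp`, `ndvi` and `relabund`. The coefficients and precision `phi` are arbitrary.

```r
library(mgcv)

set.seed(99)
n <- 200
sim <- data.frame(mintemp = runif(n, -1, 1), ndvi = runif(n, -1, 1))

# Mean on the logit scale includes an interaction between the covariates
mu  <- plogis(0.3 * sim$mintemp - 0.5 * sim$ndvi +
                0.4 * sim$mintemp * sim$ndvi)
phi <- 10   # Beta precision (arbitrary)
sim$relabund <- rbeta(n, mu * phi, (1 - mu) * phi)

# Static Beta GAM: same family and smooth as above, but no dynamic trend
fit <- gam(relabund ~ te(mintemp, ndvi),
           family = betar(), data = sim, method = "REML")

# Fitted means stay inside (0, 1)
print(range(fitted(fit)))

# Number of coefficients belonging to the tensor product smooth
print(sum(grepl("^te\\(", names(coef(fit)))))
```

With the default marginal basis dimensions, a two-dimensional `te()` term carries 24 coefficients, none of which is interpretable on its own.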

---

## The latent dynamics
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-19-1.svg" style="display: block; margin: auto;" />

---
## Coefficients?
.small[

```r
coef(mod_beta)
```

```
##                            2.5%         50%        97.5% Rhat n.eff
## (Intercept)         -0.27704943 -0.11949450  0.037417205 1.01   366
## te(mintemp,ndvi).1   0.09777064  1.58677000  2.435108500 1.02   123
## te(mintemp,ndvi).2   0.12734882  1.28906000  2.018242500 1.02   142
## te(mintemp,ndvi).3  -0.02043112  0.87295550  1.707162750 1.01   385
## te(mintemp,ndvi).4  -3.80155025 -0.62739600  1.895769000 1.00   822
## te(mintemp,ndvi).5   0.20000635  0.65926500  1.102021500 1.01   351
## te(mintemp,ndvi).6  -1.87098075 -0.78675750  0.713037000 1.03   136
## te(mintemp,ndvi).7  -0.44178235 -0.05658210  0.489970700 1.03   148
## te(mintemp,ndvi).8  -0.72445805 -0.29076650  0.232700575 1.02   188
## te(mintemp,ndvi).9  -2.08312200 -0.68123850  0.510886500 1.00   600
## te(mintemp,ndvi).10  0.04609435  0.50692150  0.930659100 1.00   795
## te(mintemp,ndvi).11 -0.60098247 -0.21375750  0.212092350 1.02   145
## te(mintemp,ndvi).12 -0.21238740 -0.05144905  0.114180750 1.01   273
## te(mintemp,ndvi).13 -1.62903200 -0.96678800  0.015499635 1.02   136
## te(mintemp,ndvi).14 -1.75806450 -0.72884100  0.195621150 1.01   410
## te(mintemp,ndvi).15 -0.96620883 -0.21764150  0.540176950 1.00   299
## te(mintemp,ndvi).16 -1.71074775 -1.01050500 -0.005658116 1.02   123
## te(mintemp,ndvi).17 -0.52369400 -0.26230500 -0.020589035 1.02   180
## te(mintemp,ndvi).18 -2.60926150 -1.66896500 -0.155125775 1.02   126
## te(mintemp,ndvi).19 -1.57559150 -0.80387600  0.070359030 1.01   297
## te(mintemp,ndvi).20 -1.65781575 -0.60831500  0.539254500 1.01   254
## te(mintemp,ndvi).21 -1.41296450 -0.89016800 -0.105312950 1.01   130
## te(mintemp,ndvi).22 -1.07243650 -0.73861450 -0.185083775 1.01   133
## te(mintemp,ndvi).23 -1.83864725 -1.28679000 -0.340361575 1.02   121
## te(mintemp,ndvi).24 -1.71563675 -0.69242550  0.398602750 1.00   553
```
]
---

class: black-inverse
.center[.grey[.big[How do I report this garbage?]]]
<img src="resources/confused.gif" style="position:fixed; right:10%; top:20%; width:960px; height:518px; border:none;"/>

---

## Plot the smooth?
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" />

---

class: black-inverse
<img src="resources/marginaleffects_need.jpg" style="position:fixed; right:40%; top:1%; width:233px; height:658px; border:none;"/>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
.small[[Credit @stephenjwild](https://twitter.com/stephenjwild/status/1687499914794643456?s=20)]
---

## `marginaleffects` for clarity

```r
# plot conditional effect of BOTH covariates on the outcome scale
plot_predictions(mod_beta, condition = c('ndvi', 'mintemp'),
                 points = 0.5, conf_level = 0.8, rug = TRUE) +
  theme_classic()
```


---
## Hindcasts <svg aria-hidden="true" role="img" viewBox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:currentColor;overflow:visible;position:relative;"><path d="M464 256A208 208 0 1 0 48 256a208 208 0 1 0 416 0zM0 256a256 256 0 1 1 512 0A256 256 0 1 1 0 256zm177.6 62.1C192.8 334.5 218.8 352 256 352s63.2-17.5 78.4-33.9c9-9.7 24.2-10.4 33.9-1.4s10.4 24.2 1.4 33.9c-22 23.8-60 49.4-113.6 49.4s-91.7-25.5-113.6-49.4c-9-9.7-8.4-24.9 1.4-33.9s24.9-8.4 33.9 1.4zm40-89.3l0 0 0 0-.2-.2c-.2-.2-.4-.5-.7-.9c-.6-.8-1.6-2-2.8-3.4c-2.5-2.8-6-6.6-10.2-10.3c-8.8-7.8-18.8-14-27.7-14s-18.9 6.2-27.7 14c-4.2 3.7-7.7 7.5-10.2 10.3c-1.2 1.4-2.2 2.6-2.8 3.4c-.3 .4-.6 .7-.7 .9l-.2 .2 0 0 0 0 0 0c-2.1 2.8-5.7 3.9-8.9 2.8s-5.5-4.1-5.5-7.6c0-17.9 6.7-35.6 16.6-48.8c9.8-13 23.9-23.2 39.4-23.2s29.6 10.2 39.4 23.2c9.9 13.2 16.6 30.9 16.6 48.8c0 3.4-2.2 6.5-5.5 7.6s-6.9 0-8.9-2.8l0 0 0 0zm160 0l0 0-.2-.2c-.2-.2-.4-.5-.7-.9c-.6-.8-1.6-2-2.8-3.4c-2.5-2.8-6-6.6-10.2-10.3c-8.8-7.8-18.8-14-27.7-14s-18.9 6.2-27.7 14c-4.2 3.7-7.7 7.5-10.2 10.3c-1.2 1.4-2.2 2.6-2.8 3.4c-.3 .4-.6 .7-.7 .9l-.2 .2 0 0 0 0 0 0c-2.1 2.8-5.7 3.9-8.9 2.8s-5.5-4.1-5.5-7.6c0-17.9 6.7-35.6 16.6-48.8c9.8-13 23.9-23.2 39.4-23.2s29.6 10.2 39.4 23.2c9.9 13.2 16.6 30.9 16.6 48.8c0 3.4-2.2 6.5-5.5 7.6s-6.9 0-8.9-2.8l0 0 0 0 0 0z"/></svg>
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-22-1.svg" style="display: block; margin: auto;" />

---

# Multivariate ecological time series

---

## Hierarchical *nonlinear* effects?
We very often expect to encounter nonlinear effects in ecology

But if we measure multiple species / plots / individuals, etc. through time, we can also encounter hierarchical nonlinear effects
- Same species may respond similarly to environmental change over different sites
- Different species may respond similarly in the same site

Our data may not be rich enough to estimate all effects individually; so what can we do?

---
## Example: simulated data
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-23-1.svg" style="display: block; margin: auto;" />

---
## Similar seasonal patterns
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-24-1.svg" style="display: block; margin: auto;" />

---

class: middle center
### Can we somehow estimate the average population smooth *and* a smooth to determine how each series deviates from the population?
<br>
### Yes! We can use .multicolor[hierarchical GAMs]

---

## Decomposing seasonality
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-25-1.svg" style="display: block; margin: auto;" />

---

## How did we model this?

```r
mod <- mvgam(y ~ 
*              s(season, bs = 'cc', k = 12) +
               s(season, series, k = 4, bs = 'fs'),
             data = data, 
             family = poisson())
```

A .emphasize[*shared*] smooth of seasonality

This is a group-level smooth, similar to what we might expect the average seasonal function to be in this set of series
---

## How did we model this?

```r
mod <- mvgam(y ~ 
               s(season, bs = 'cc', k = 12) + 
*              s(season, series, k = 4, bs = 'fs'),
             data = data, 
             family = poisson())
```

Series-level .emphasize[*deviation*] smooths of seasonality, which all share a common smoothing penalty

These are individual-level smooths that capture how each series' seasonal pattern differs from the shared smooth
- There are a number of ways to do this using splines
- See [Pedersen et al 2019](https://peerj.com/articles/6876/) for useful guidance
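
The same two-smooth structure can be fitted with `mgcv::gam()` to see how the shared and deviation smooths split the signal. This is a sketch on simulated data: the three series, the phase shifts and the seasonal amplitude are invented, and there is no latent trend as there could be in `mvgam`.

```r
library(mgcv)

set.seed(5)
n_series <- 3
dat <- expand.grid(season = 1:12, year = 1:6, series = 1:n_series)
dat$series <- factor(paste0("series_", dat$series))

# Shared seasonal signal plus a small series-specific phase shift
shift <- c(0, 0.5, -0.5)[as.integer(dat$series)]
dat$y <- rpois(nrow(dat),
               exp(1 + 0.8 * sin(2 * pi * (dat$season + shift) / 12)))

# Shared cyclic smooth plus series-level deviation smooths
fit <- gam(y ~ s(season, bs = "cc", k = 12) +
             s(season, series, k = 4, bs = "fs"),
           data = dat, family = poisson(), method = "REML")

# Effective degrees of freedom used by each smooth
print(summary(fit)$s.table[, c("edf", "Ref.df")])
```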

---
## Conditional predictions
<img src="QUT_talk_slidedeck_files/figure-html/unnamed-chunk-28-1.svg" style="display: block; margin: auto;" />

---

## Posterior contrasts

```r
# take draws of average comparison between season = 9 vs season = 3
post_contrasts <- avg_comparisons(model, 
                                  variables = list(season = c(9, 3)),
                                  process_error = FALSE) %>%
  posteriordraws()

# use the resulting posterior draw object to plot a density of the 
# posterior contrasts
library(tidybayes)
post_contrasts %>% ggplot(aes(x = draw)) +
  # use the stat_halfeye function from tidybayes for a nice visual
  stat_halfeye(fill = "#C79999") +
  labs(x = "(season = 9) − (season = 3)", y = "Density", 
       title = "Average posterior contrast") + theme_classic()
```


---

class: middle center
### HGAMs offer a solution to estimate the hierarchical, nonlinear effects that we think are common in ecology
<br>
### This is a huge advantage over traditional time series models
<br>
### But how can we handle multivariate *dynamic components*?

---

background-image: url('./resources/VAR.svg')
background-size: contain
## VARs &#8680; Granger causality

---

## Latent VAR1s in `mvgam` 📦

```r
varmod <- mvgam(y ~ 1,
*               trend_model = 'VAR1',
                data = data_train,
                newdata = data_test,
                family = gaussian())
```

If multiple series are included in the data, we can use a VAR1 to estimate latent dynamics
- `trend_model = 'VAR1'`: a VAR1 with uncorrelated process errors (off-diagonals in `\(\Sigma\)` set to zero)
- `trend_model = 'VAR1cor'`: a VAR1 with possibly correlated process errors
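
The difference between the two options can be seen by simulating a bivariate VAR1 under each error structure. The coefficient matrix `A`, the covariance values and the helper `simulate_var1()` are all assumptions made for this sketch.

```r
set.seed(11)
n_time <- 200

# VAR1 coefficients (arbitrary): series 1 at t-1 influences series 2 at t
A <- matrix(c(0.6, 0.0,
              0.4, 0.5), nrow = 2, byrow = TRUE)

simulate_var1 <- function(A, Sigma, n_time) {
  L <- t(chol(Sigma))   # lower Cholesky factor of the process covariance
  z <- matrix(0, n_time, nrow(A))
  for (t in 2:n_time) {
    z[t, ] <- A %*% z[t - 1, ] + L %*% rnorm(nrow(A))
  }
  z
}

# 'VAR1'-style: uncorrelated process errors (diagonal Sigma)
z_uncor <- simulate_var1(A, Sigma = diag(c(0.25, 0.25)), n_time)

# 'VAR1cor'-style: correlated process errors (non-zero off-diagonals)
Sigma_cor <- matrix(c(0.25, 0.20,
                      0.20, 0.25), nrow = 2)
z_cor <- simulate_var1(A, Sigma = Sigma_cor, n_time)

# Stability check: all eigenvalues of A lie inside the unit circle
print(Mod(eigen(A)$values))
```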

---

background-image: url('./resources/df_with_series.gif')
## Factors &#8680; induced correlations
---

## Dynamic factors in `mvgam` 📦

```r
dfmod <- mvgam(y ~ 1,
*               use_lv = TRUE,
*               n_lv = 2,
                trend_model = 'AR1', 
                data = data_train,
                newdata = data_test,
                family = gaussian())
```

If multiple series are included in the data, we can use a dynamic factor model to estimate latent dynamics
- `n_lv`: the number of latent factors to estimate
- `trend_model`: can be `RW`, `AR1`, `AR2`, `AR3` or `GP`
- factor variances fixed to ensure identifiability
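
How a small number of factors induces correlations among many series can be sketched by simulation. The loadings, the AR1 coefficient of 0.7 and the dimensions are arbitrary choices for illustration.

```r
set.seed(21)
n_time   <- 150
n_series <- 4
n_lv     <- 2

# Two independent AR1 factors with unit innovation variance
factors <- sapply(1:n_lv, function(i) {
  as.numeric(arima.sim(model = list(ar = 0.7), n = n_time))
})

# Loadings (arbitrary): how strongly each series depends on each factor
loadings <- matrix(c( 1.0,  0.0,
                      0.8,  0.2,
                     -0.6,  0.5,
                      0.0,  1.0), nrow = n_series, byrow = TRUE)

# Each series' trend is a weighted combination of the shared factors
trends <- factors %*% t(loadings)   # n_time x n_series

# Correlations among series trends are induced by the loadings alone
print(round(cor(trends), 2))
```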

---

## Other uses of `mvgam` 📦
Multiseries models with shared latent states

Uni- and multivariate proper scoring rules for forecast evaluation

Randomised Quantile Residuals to inspect model misspecification

Gaussian Processes to estimate smooth dynamics and time-varying coefficients

User-specified priors for all key parameters

See more [on the package website](https://nicholasjclark.github.io/mvgam/)
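
As one example, randomised quantile residuals for a Poisson model can be computed by hand. This is a sketch of the general Dunn-Smyth construction on simulated data, not `mvgam`'s own implementation; `rq_resid()` is a hypothetical helper.

```r
set.seed(3)
n  <- 500
mu <- exp(0.5 + 0.6 * rnorm(n))
y  <- rpois(n, mu)

# Draw u uniformly between F(y - 1) and F(y), then map to the Normal scale
rq_resid <- function(y, mu) {
  lower <- ppois(y - 1, mu)
  upper <- ppois(y, mu)
  qnorm(runif(length(y), min = lower, max = upper))
}

# Correct mean: residuals should look approximately standard Normal
res_ok <- rq_resid(y, mu)

# Misspecified mean (ignores the variation in mu): residuals spread out
res_bad <- rq_resid(y, rep(mean(y), n))

print(round(c(sd_ok = sd(res_ok), sd_bad = sd(res_bad)), 2))
```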

---
class: inverse middle center big-subsection
# Thank you

???