Ecological forecasting in R

.title[
# Ecological forecasting in R
]
.subtitle[
## Lecture 2: dynamic GLMs and GAMs
]
.author[
### Nicholas Clark
]
.institute[
### School of Veterinary Science, University of Queensland
]
.date[
### 0900–1200 CET Monday 24th March, 2025
]

---

## Workflow

Press the "o" key on your keyboard to navigate among slides

Access the [tutorial html here](https://nicholasjclark.github.io/physalia-forecasting-course/day1/tutorial_1_physalia)
- Download the data objects and exercise R script from the html file
- Complete exercises and use Slack to ask questions

Relevant open-source materials include:
- [An introduction to Bayesian multilevel modeling with `brms`](https://youtu.be/1qeXD4NQ4To)
- [Introduction to Generalized Additive Models with R and `mgcv`](https://www.youtube.com/watch?v=sgw4cu8hrZM)
- [Time series in R and Stan using `mvgam`](https://www.youtube.com/playlist?list=PLzFHNoUxkCvsFIg6zqogylUfPpaxau_a3)
- [Statistical Rethinking 2023 - 12 - Multilevel Models](https://www.youtube.com/watch?v=iwVqiiXYeC4&list=PLDcUM9US4XdPz-KxHM4XHt7uUVGWWVSus&index=12)

---

## This lecture's topics

Useful probability distributions for ecologists

Generalized Linear and Additive Models

Temporal random effects

Temporal residual correlation structures

---
class: middle center
### When applying statistical modelling to a time series, we aim to estimate parameters for a collection of probability distributions
<br>
### These distributions are indexed by *time* (i.e. the observations are random draws from a set of time-varying distributions)
<br>
### Usually we allow the mean of these distributions to vary over time. But what kinds of distributions are available to us?

---

# Useful probability distributions

---

## Normal (Gaussian)

`$$\boldsymbol{Y_t}\sim \text{Normal}(\mu_t,\sigma)$$`
Properties
- Real-valued continuous observations (including any decimal)
- Unbounded (supports `$-\infty$` to `$\infty$`)
- Symmetric spread, controlled by `$\sigma$`, about the mean `$(\mu_t)$`

Nearly all common time series models assume this data distribution
- RW, AR, and ARIMA
- ETS and TBATS
- Meta's [Prophet 📦](https://facebook.github.io/prophet/)
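
---

## Normal (Gaussian): a simulation sketch

As a minimal sketch of what a set of time-varying Normal distributions looks like, the base R code below draws one observation per time point from a Normal distribution whose mean changes through time. The seasonal form of `mu_t`, the value of `sigma` and the series length are illustrative assumptions, not values taken from the course data.

``` r
# Simulate a Gaussian time series with a time-varying mean
set.seed(2025)
n_time <- 120                                # number of time points (assumed)
time   <- seq_len(n_time)
mu_t   <- 10 + 2 * sin(2 * pi * time / 12)   # assumed seasonal mean
sigma  <- 1.5                                # constant spread about the mean

# One random draw from Normal(mu_t, sigma) at each time point
y <- rnorm(n_time, mean = mu_t, sd = sigma)

# Observations are real-valued and can fall on either side of the mean
summary(y - mu_t)
```

Calling `plot(time, y, type = 'l')` and then `lines(time, mu_t)` shows the draws scattered symmetrically about the moving mean.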
---

## Normal (Gaussian)

`$$\boldsymbol{Y_t}\sim \text{Normal}(0,2)$$`
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" />
---

## Normal (Gaussian)

`$$\boldsymbol{Y_t}\sim \text{Normal}(50,20)$$`
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />
---
## Linear regression

It is common to estimate linear predictors of `$\mu$` with .emphasize[*regression*]
<br/>
<br/>
`$$\boldsymbol{Y_t}\sim \text{Normal}(\alpha + \boldsymbol{X_t}\beta,\sigma)$$`
<br/>
Where:
- `$\boldsymbol{X_t}$` represents a design matrix of covariates that contribute linearly to variation in `$\mu_t$`
- `$\alpha$` is an intercept coefficient
- `$\beta$` is a vector of regression coefficients
- `$\sigma$` controls the spread of the errors about `$\mu_t$`
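
---

## Linear regression: a simulation sketch

The regression above can be illustrated by simulating data with known parameters and checking that `lm()` recovers them. This is a sketch only: the single covariate `x` and the values of `alpha`, `beta` and `sigma_true` are assumptions chosen for illustration.

``` r
# Simulate data from a Gaussian linear regression and recover the parameters
set.seed(1)
n          <- 200
x          <- rnorm(n)       # a single covariate (assumed)
alpha      <- 1              # true intercept
beta       <- 2              # true regression coefficient
sigma_true <- 0.5            # true residual standard deviation
y <- rnorm(n, mean = alpha + beta * x, sd = sigma_true)

# lm() estimates the intercept, the slope and the residual spread
fit <- lm(y ~ x)
coef(fit)                    # estimates of alpha and beta
sigma(fit)                   # estimate of sigma
```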

---
## ETS(A,A,A) *skip*

Exponential smoothing with additive components for trend, seasonality and error assumes a Normal (Gaussian) distribution
<br/>
<br/>
`$$\boldsymbol{Y}_{t}\sim \text{Normal}({l}_{t-1} + {b}_{t-1} + {s}_{t-m},\sigma)$$`
<br/>
Where: 
- `${l}$` gives the value of the level
- `${b}$` gives the value of the trend
- `${s}$` gives the value of the seasonality
- `${m}$` represents the seasonal period
---

## ARMA(*p*, *q*) *skip*

ARMA processes also assume Normality
<br/>
<br/>
`$$\boldsymbol{Y}_{t}\sim \text{Normal}(c + \sum_{k=1}^{p}\phi_{k}(\boldsymbol{Y}_{t-k}-c)+\sum_{i=1}^{q}\theta_{i}\epsilon_{t-i},\sigma)$$`
<br/>
Where: 
- `$c$` is a constant (the long-run mean of the process)
- `${p}$` and `${q}$` give the orders of the AR and MA processes
- `${\phi}$` and `${\theta}$` are AR and MA coefficients
- `$\epsilon$` are historical errors (which are `$\text{Normal}(0,\sigma)$`)
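
---

## AR(1): a simulation sketch

To make the conditional mean above concrete, the sketch below simulates the simplest case, an AR(1) process with no MA terms, and then refits it with `arima()` from base R. The values of `c_const`, `phi` and `sigma` are assumptions chosen for illustration.

``` r
# Simulate an AR(1) process written in the mean-centred form used above
set.seed(42)
n_time  <- 500
c_const <- 5      # the constant c (assumed)
phi     <- 0.7    # AR(1) coefficient (assumed; |phi| < 1 for stationarity)
sigma   <- 1      # standard deviation of the errors (assumed)

y <- numeric(n_time)
y[1] <- c_const
for (i in 2:n_time) {
  # The conditional mean depends on the previous observation
  mu_i <- c_const + phi * (y[i - 1] - c_const)
  y[i] <- rnorm(1, mean = mu_i, sd = sigma)
}

# arima() should return estimates close to phi and c_const
fit <- arima(y, order = c(1, 0, 0))
coef(fit)
```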

---

class: middle center
### But most real-world ecological observations, including time series, *are not Gaussian*

---

<div class="figure" style="text-align: center">
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-3-1.png" alt="Properties of monthly CO2 measurement time series at the South Pole"  />
<p class="caption">Properties of monthly CO2 measurement time series at the South Pole</p>
</div>

---

<div class="figure" style="text-align: center">
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-4-1.png" alt="Properties of lunar monthly Desert Pocket Mouse capture time series from a long-term monitoring study in Portal, Arizona, USA"  />
<p class="caption">Properties of lunar monthly Desert Pocket Mouse capture time series from a long-term monitoring study in Portal, Arizona, USA</p>
</div>

---

<div class="figure" style="text-align: center">
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-5-1.png" alt="Properties of annual American kestrel abundance time series in British Columbia, Canada"  />
<p class="caption">Properties of annual American kestrel abundance time series in British Columbia, Canada</p>
</div>

---

class: middle center
### “If our data contains small counts (0,1,2,...), then we need to use forecasting methods that are more appropriate for a sample space of non-negative integers. 
<br>
### *Such models are beyond the scope of this book*”
  
[Hyndman and Athanasopoulos, Forecasting Principles and Practice](https://otexts.com/fpp3/counts.html)

---
class: black-inverse
.center[.grey[.big[Ok. So now what?]]]
<img src="resources/now_what.gif" style="position:fixed; right:10%; top:20%; width:960px; height:408px; border:none;"/>

---

## Poisson

`$$\boldsymbol{Y_t}\sim \text{Poisson}(\lambda_t)$$`
Properties
- Discrete, integer-valued observations (including `$0$`)
- Lower bound (supports `$0$` to `$\infty$`)
- mean = variance = `$\lambda_t$`

.emphasize[*Virtually no time series models support this distribution*]
- Most analysts use `log` or [`Box-Cox`](https://otexts.com/fpp3/transformations.html) transformation
- But see the [`tscount` 📦](https://cran.r-project.org/web/packages/tscount/vignettes/tsglm.pdf)
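
---

## Poisson: checking the properties

The properties listed above are easy to check by simulation. The sketch below uses base R only; `lambda = 3` and the sample size are arbitrary choices. It also shows why a plain `log` transformation is awkward for counts that include zeros.

``` r
# Draw Poisson counts and check the properties listed above
set.seed(7)
lambda <- 3
y <- rpois(10000, lambda = lambda)

all(y >= 0 & y == round(y))   # non-negative integers only
mean(y)                       # close to lambda
var(y)                        # also close to lambda

# A log transformation breaks down when zeros are present
sum(y == 0)                   # number of zero counts
range(log(y))                 # the lower limit is -Inf
```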

---

## Poisson
`$$\boldsymbol{Y_t}\sim \text{Poisson}(3)$$`
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

## Poisson
`$$\boldsymbol{Y_t}\sim \text{Poisson}(50)$$`
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

### How can we model non-Normal data using regression?

---

# Generalized linear models

Linear regression can't be trusted to give sensible predictions for non-negative count data (or other types of bounded / discrete / non-Normal data)

We can do better by choosing distributions that obey the constraints on our outcome variables

The idea is to .emphasize[*generalize*] the linear regression by replacing parameters from other probability distributions with linear models

This requires a .emphasize[*link function*] that transforms from the unbounded scale of the linear predictor to a scale that is appropriate for the parameters being modeled
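
---

## Link functions: a quick look

A small sketch of what a link function does, using the family objects that ship with base R. The values in `eta` are arbitrary points on the unbounded linear predictor scale, chosen only for illustration.

``` r
# Link functions connect constrained parameters to the unbounded linear predictor
eta <- c(-5, -1, 0, 1, 5)           # values on the linear predictor scale

# log link (Poisson): the inverse link is exp(), giving positive means
pois_fam <- poisson(link = 'log')
pois_fam$linkinv(eta)

# logit link (Bernoulli / Binomial): the inverse link gives values in (0, 1)
binom_fam <- binomial(link = 'logit')
binom_fam$linkinv(eta)

# Applying the link to the inverse link returns the linear predictor
pois_fam$linkfun(pois_fam$linkinv(eta))
```

Whatever value the linear predictor takes, the inverse link returns a parameter that respects the bounds of the chosen distribution.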

---

# Modelling the mean
Most GLMs are used to model the conditional mean `$(\mu_t)$`
<br/>
<br/>
`$$\mathbb{E}(\boldsymbol{Y_t}|\boldsymbol{X_t})=\mu_t=g^{-1}(\alpha+\boldsymbol{X_t}\beta)$$`
<br/>
Where: 
- `$\mathbb{E}(\boldsymbol{Y_t}|\boldsymbol{X_t})$` is the *expected value* of `$\boldsymbol{Y_t}$` conditional on `$\boldsymbol{X_t}$`
- `$g^{-1}$` is the *inverse* of the link function
- `$\alpha$` is an intercept coefficient
- `$\beta$` is a vector of regression coefficients

---

# Poisson GLM
A Poisson GLM models the conditional mean with a `$log$` link
<br/>
<br/>
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Poisson}(\lambda_t) \\
log(\lambda_t) & = \alpha + \boldsymbol{X}_t \beta \\
& = \alpha + \beta_1 \boldsymbol{x}_{1t} + \beta_2 \boldsymbol{x}_{2t} + \cdots + \beta_j \boldsymbol{x}_{jt}
\end{align*}`

Where: 
- `$\boldsymbol{X}_{t}$` is the matrix of predictor values at time `$t$`
- `$\alpha$` is an intercept coefficient
- `$\beta$` is a vector of regression coefficients
- `$\mathbb{E}(\boldsymbol{Y}_{t}|\boldsymbol{X}_{t})=exp(\alpha+\boldsymbol{X}_{t}\beta)$`
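
---

## Poisson GLM: a simulation sketch

A minimal sketch of the Poisson GLM above, fitted with `glm()` from base R to simulated data. The single covariate `x` and the values of `alpha` and `beta` are assumptions chosen for illustration.

``` r
# Simulate counts whose log-mean is linear in x, then fit a Poisson GLM
set.seed(11)
n      <- 300
x      <- runif(n, -1, 1)
alpha  <- 0.5
beta   <- 1.2
lambda <- exp(alpha + beta * x)     # inverse of the log link
y      <- rpois(n, lambda)

fit <- glm(y ~ x, family = poisson(link = 'log'))
coef(fit)                           # estimates on the log (link) scale

# Predictions on the response scale are always positive
newdat <- data.frame(x = c(-1, 0, 1))
pred   <- predict(fit, newdata = newdat, type = 'response')
pred
```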

---

# Poisson GLM
A Poisson GLM models the conditional mean with a `$log$` link
<br/>
<br/>
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Poisson}(\lambda_t) \\
log(\lambda_t) & = \alpha + \boldsymbol{X}_t \beta \\
& = \color{darkred}{\alpha + \beta_1 \boldsymbol{x}_{1t} + \beta_2 \boldsymbol{x}_{2t} + \cdots + \beta_j \boldsymbol{x}_{jt}}
\end{align*}`

The .emphasize[*linear predictor component can be hugely flexible*], as we will see in later slides

---

### What if our data are proportional instead?

---

<div class="figure" style="text-align: center">
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-8-1.png" alt="Properties of Merriam's kangaroo rat relative abundance time series from a long-term monitoring study in Portal, Arizona, USA"  />
<p class="caption">Properties of Merriam's kangaroo rat relative abundance time series from a long-term monitoring study in Portal, Arizona, USA</p>
</div>

---

# Beta GLM
A Beta GLM models the conditional mean with a `$logit$` link
<br/>
<br/>
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Beta}(\mu_t,\phi) \\
logit(\mu_t) & = \alpha + \boldsymbol{X}_t \beta \\
& = \alpha + \beta_1 \boldsymbol{x}_{1t} + \beta_2 \boldsymbol{x}_{2t} + \cdots + \beta_j \boldsymbol{x}_{jt}
\end{align*}`
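
---

## Beta GLM: a simulation sketch

The sketch below simulates proportions from the model above. It assumes that `$\text{Beta}(\mu_t,\phi)$` is the mean and precision parameterisation, so the two shape parameters that `rbeta()` expects are `mu * phi` and `(1 - mu) * phi`. The covariate and all parameter values are illustrative assumptions.

``` r
# Simulate proportions from a Beta distribution parameterised by
# its mean (mu) and precision (phi)
set.seed(3)
n     <- 5000
x     <- rnorm(n)
alpha <- -0.5
beta  <- 0.8
mu    <- plogis(alpha + beta * x)   # inverse logit link keeps mu in (0, 1)
phi   <- 10                         # larger phi = less spread about mu

# Convert (mu, phi) to the shape parameters that rbeta() expects
y <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)

range(y)                            # all values lie between 0 and 1
c(mean_y = mean(y), mean_mu = mean(mu))
```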

---

## Some other relevant distributions

[Many other useful GLM probability distributions exist](https://cran.r-project.org/web/packages/brms/vignettes/brms_families.html). Some of these include:
- .emphasize[*Negative Binomial*] &mdash; overdispersed integers in `$(0,1,2,...)$`
- .emphasize[*Bernoulli*] &mdash; presence-absence data in `$\{0,1\}$`
- .emphasize[*Student's T*] &mdash; heavy-tailed (symmetric) real values in `$(-\infty, \infty)$` 
- .emphasize[*Lognormal*] &mdash; heavy-tailed (right skewed) real values in `$(0, \infty)$` 
- .emphasize[*Gamma*] &mdash; lighter-tailed (less skewed) real values in `$(0, \infty)$` 
- .emphasize[*Multinomial*] &mdash; integers representing `$K$` unordered categories in `$(1,2,..., K)$`
- .emphasize[*Ordinal*] &mdash; integers representing `$K$` ordered categories in `$(1,2,..., K)$`
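
---

## Overdispersion: Poisson vs Negative Binomial

As one example from the list above, this sketch compares Poisson and Negative Binomial draws that share the same mean. It uses the `mu` and `size` parameterisation of `rnbinom()` in base R, where the variance is `mu + mu^2 / size`. The values of `mu` and `size` are arbitrary.

``` r
# Compare counts with the same mean but different variances
set.seed(5)
n  <- 10000
mu <- 5

y_pois   <- rpois(n, lambda = mu)
# size controls overdispersion: variance = mu + mu^2 / size
y_negbin <- rnbinom(n, mu = mu, size = 1)

c(mean = mean(y_pois),   var = var(y_pois))     # variance close to the mean
c(mean = mean(y_negbin), var = var(y_negbin))   # variance far above the mean
```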

---

### GLMs allow us to build models that respect the bounds and distributions of our observed data
<br>
### They traditionally assume the appropriately transformed mean response depends *linearly* on the predictors
<br>
### But there are many other properties we'd like to model

---

## Remember these? 
.grey[Temporal autocorrelation

Lagged effects]

*Measurement error*

*Time-varying effects*

*Nonlinearities*

*Multi-series clustering*

---

## Remember these? 
.grey[Temporal autocorrelation

Lagged effects

Non-Gaussian data and missing observations

Measurement error

Time-varying effects]

---

background-image: url('./lecture_2_slidedeck_files/figure-html/basis-functions-1.svg')
## ... made of basis functions

---

background-image: url('./lecture_2_slidedeck_files/figure-html/basis-functions-weights-1.svg')
## Weighting basis functions ...

---

background-image: url('./resources/basis_weights.gif')
## ... gives a spline `$(f(x))$`
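
---

## Weighted basis functions in code

A rough sketch of the idea on the previous slides: a spline is a weighted sum of basis functions. For convenience this uses cubic B-splines from the `splines` package that ships with R, which is an assumption and may not be the basis shown in the figures. The weights in `w` are arbitrary.

``` r
library(splines)   # ships with R

x <- seq(0, 1, length.out = 200)

# Build a set of 8 cubic B-spline basis functions over x
B <- bs(x, df = 8, intercept = TRUE)
dim(B)             # 200 rows (values of x), 8 columns (basis functions)

# One weight (coefficient) per basis function; chosen arbitrarily here
w <- c(0.5, -1, 2, 1, -0.5, 1.5, 0, -2)

# The spline f(x) is the weighted sum of the basis functions
f_x <- as.vector(B %*% w)
range(f_x)
```

Changing the values in `w` changes the shape of `f_x`, and `matplot(x, B, type = 'l')` draws the individual basis functions.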

---
## `$k$` &#8680; potential complexity

---

background-image: url('./resources/penalty_spline.gif')
background-size: contain
## Penalize `$f''(x)$` to learn weights

---

background-image: url('./resources/smooth_to_data.gif')
## Penalize `$f''(x)$` to learn weights
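
---

## The penalty in code

A sketch of how a penalty on the second derivative controls wiggliness, using `smooth.spline()` from base R. Here the penalty weight `lambda` is fixed by hand at two arbitrary values purely for illustration. In practice the penalty is usually estimated from the data rather than set manually.

``` r
# Fit the same noisy data with a weak and a strong wiggliness penalty
set.seed(9)
n <- 200
x <- seq(0, 1, length.out = n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# lambda scales the penalty on the squared second derivative of f(x)
fit_wiggly <- smooth.spline(x, y, lambda = 1e-8)   # weak penalty
fit_smooth <- smooth.spline(x, y, lambda = 1e-2)   # strong penalty

# Effective degrees of freedom fall as the penalty grows
c(wiggly = fit_wiggly$df, smooth = fit_smooth$df)
```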

---

class: middle center
### GAMs are just fancy GLMs, where some (or all) of the predictor effects are estimated as (possibly nonlinear) smooth functions

---

## GAMs are easy to fit in R
<img align="center" width="306" height="464" src="resources/gam_book.jpg" style="float: right; 
  margin: 10px;">
`$$\mathbb{E}(\boldsymbol{Y_t}|\boldsymbol{X_t})=g^{-1}(\alpha + \sum_{j=1}^{J}f_j(x_{jt}))$$`

<br/>
Where: 
- `$g^{-1}$` is the *inverse* of the link function
- `${\alpha}$` is the intercept
- `$f_j(x_j)$` are potentially nonlinear functions of the `$J$` predictors
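
---

## A GAM in a few lines

A minimal sketch of a Poisson GAM with one smooth term, fitted with `mgcv::gam()` to simulated data. The nonlinear effect used to generate the data, the basis dimension `k = 10` and the use of REML are illustrative assumptions.

``` r
library(mgcv)

# Simulate counts with a nonlinear effect of x on the log scale
set.seed(21)
n      <- 300
x      <- runif(n, 0, 1)
lambda <- exp(1 + sin(2 * pi * x))
dat    <- data.frame(y = rpois(n, lambda), x = x)

# s(x) requests a penalized smooth of x; k sets the basis dimension
fit <- gam(y ~ s(x, bs = 'tp', k = 10),
           family = poisson(), data = dat, method = 'REML')

summary(fit)$edf   # effective degrees of freedom of the smooth
```

`plot(fit)` draws the estimated smooth on the link scale.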

---

class: middle center
### But how can GAMs and GLMs be useful for modelling ecological time series?

---
class: inverse middle center big-subsection

# Temporal random effects

---

## Random effects are *hierarchical*
<br>
<br>
<img align="center" width="1000" height="225" src="resources/partial_pool_diagram.png">

---

class: middle center
### Hierarchical models *learn from all groups at once* to inform group-level estimates
<br>
### They induce *regularization*, where noisy estimates are pulled towards the overall mean
<br>
### The regularization is known as [partial pooling](https://www.jstor.org/stable/25471160)

---
background-image: url('./resources/partial_pool.gif')
## Partial pooling in action
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
.small[[McElreath 2023](https://www.youtube.com/watch?v=SocRgsf202M)]

---

## Noisy estimates *pulled* to the mean
<br>
.center[<img align="center" width="768" height="384" src="resources/partial_pool_estimates.png">]
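
---

## Shrinkage in code

A simplified sketch of partial pooling for Gaussian group means. It assumes the among-group and within-group standard deviations are known, and it uses the grand mean as the population mean. A full hierarchical model would estimate all of these jointly. Group sizes and parameter values are arbitrary.

``` r
# Partial pooling of group means with known variances (a simplification)
set.seed(13)
n_groups    <- 8
n_obs       <- c(2, 2, 3, 5, 10, 20, 50, 100)  # unequal sample sizes (assumed)
sigma_group <- 1                               # SD among group means (assumed known)
sigma_obs   <- 2                               # SD within groups (assumed known)

true_means <- rnorm(n_groups, mean = 0, sd = sigma_group)
y <- lapply(seq_len(n_groups), function(j) {
  rnorm(n_obs[j], mean = true_means[j], sd = sigma_obs)
})

no_pool    <- sapply(y, mean)      # each group estimated on its own
grand_mean <- mean(unlist(y))      # complete pooling

# Weight given to a group's own mean; small groups get less weight
w <- sigma_group^2 / (sigma_group^2 + sigma_obs^2 / n_obs)
partial_pool <- grand_mean + w * (no_pool - grand_mean)

round(data.frame(n_obs, w, no_pool, partial_pool), 2)
```

Groups with few observations have small `w`, so their estimates are pulled furthest towards the grand mean.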

---

## How can they be modelled? 
`\begin{align*}
\boldsymbol{Y}_t & \sim \text{Poisson}(\lambda_t) \\
log(\lambda_t) & = \beta_{year[year_t]} \\
\beta_{year} & \sim \text{Normal}(\color{darkred}{\mu_{year}, \sigma_{year}}) \\
\color{darkred}{\mu_{year}} & \sim \text{Normal}(0, 1) \\
\color{darkred}{\sigma_{year}} & \sim \text{Exponential}(2) 
\end{align*}`

Where we have multiple time points per year, and:
- `${\beta_{year}}$` are yearly intercepts (*one effect per year*)
- `${\color{darkred}{\mu_{year}}}$` estimates *mean effect among all years*
- `${\color{darkred}{\sigma_{year}}}$` estimates *how much effects vary across years*
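
---

## Simulating from this model

One way to understand the hierarchical model above is to simulate from it, reading the equations from the bottom up. This base R sketch takes a single draw from the priors shown on the previous slide. The number of years and the number of observations per year are assumptions.

``` r
# Simulate once from the hierarchical Poisson model
set.seed(99)
n_years      <- 10
obs_per_year <- 12

mu_year    <- rnorm(1, mean = 0, sd = 1)           # mean effect among all years
sigma_year <- rexp(1, rate = 2)                    # variation across years
beta_year  <- rnorm(n_years, mu_year, sigma_year)  # one intercept per year

# Index the yearly intercepts by the year of each observation
year   <- rep(seq_len(n_years), each = obs_per_year)
lambda <- exp(beta_year[year])                     # inverse of the log link
y      <- rpois(length(year), lambda)

tapply(y, year, mean)                              # observed mean count per year
```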

---

# Live code example

---

## Modelling with the [`mvgam` 📦](https://nicholasjclark.github.io/mvgam/)

Bayesian framework to fit Dynamic GLMs and Dynamic GAMs
- Hierarchical intercepts, slopes *and smooths*
- Latent dynamic processes
- State Space models with measurement error

Built off the [`mgcv` 📦](https://cran.r-project.org/web/packages/mgcv/index.html) to construct penalized smoothing splines

Convenient and familiar R formula interface

Uni- or multivariate series from a range of response distributions

Uses [Stan](https://mc-stan.org/) for efficient Hamiltonian Monte Carlo sampling

---

## Example of the interface

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') +
    x1 +
    s(x2, bs = 'tp', k = 5) +
    te(x3, x4, bs = c('cr', 'tp')),
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

Where `y` = response, `x`'s = covariates, and `series` = a grouping term
---
## Typical formula syntax

``` r
model <- mvgam(
* formula = y ~
*   s(series, bs = 're') +
*   s(x0, series, bs = 're') +
*   x1 +
*   s(x2, bs = 'tp', k = 5) +
*   te(x3, x4, bs = c('cr', 'tp')),
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## A random intercept effect

``` r
model <- mvgam(
  formula = y ~ 
*   s(series, bs = 're') +
    s(x0, series, bs = 're') +
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## A random slope effect

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
*   s(x0, series, bs = 're') +
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---
## A linear parametric effect

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
*   x1 +
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## A one-dimensional smooth

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
*   s(x2, bs = 'tp', k = 5) +
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## A two-dimensional smooth

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
*   te(x3, x4, bs = c('cr', 'tp')),
  data = data,
  family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## Data and response distribution

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
* data = data,
* family = poisson(),
  trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## Latent dynamics

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(), 
* trend_model = AR(p = 1),
  burnin = 500,
  samples = 500,
  chains = 4)
```

---

## Sampler parameters

``` r
model <- mvgam(
  formula = y ~ 
    s(series, bs = 're') + 
    s(x0, series, bs = 're') + 
    x1 + 
    s(x2, bs = 'tp', k = 5) + 
    te(x3, x4, bs = c('cr', 'tp')), 
  data = data,
  family = poisson(), 
  trend_model = AR(p = 1),
* burnin = 500,
* samples = 500,
  chains = 4)
```

---

## Example data (`tidy` format)

---

## Response (`NA`s allowed)

---

## Series indicator (as `factor`)

---

## Time indicator

---

## Any other predictors

<table class=" lightable-minimal" style='color: black; font-family: "Trebuchet MS", verdana, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
 <thead>
  <tr>
   <th style="text-align:right;"> y </th>
   <th style="text-align:left;"> series </th>
   <th style="text-align:right;"> time </th>
   <th style="text-align:right;"> x0 </th>
   <th style="text-align:left;"> x1 </th>
   <th style="text-align:right;"> x2 </th>
   <th style="text-align:right;"> x3 </th>
   <th style="text-align:right;"> x4 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> species_1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.97 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> A </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 2.61 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.47 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.36 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:left;"> species_2 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.88 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> A </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -1.63 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.87 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.84 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> species_3 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.25 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> B </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -1.61 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.89 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.68 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> species_4 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -1.52 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> B </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.34 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.13 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.39 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:left;"> species_1 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.02 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> A </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 2.73 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 1.10 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.40 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:left;"> species_2 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.03 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> A </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.33 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.52 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -1.11 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> species_3 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.16 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> B </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -1.26 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.40 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.38 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> species_4 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.41 </td>
   <td style="text-align:left;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> B </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> -0.43 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 1.07 </td>
   <td style="text-align:right;font-weight: bold;background-color: rgba(81, 36, 122, 32) !important;"> 0.29 </td>
  </tr>
</tbody>
</table>
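
---

## Building data in this format

A sketch of how a data frame in the long format shown above might be assembled in base R. The values are simulated and do not reproduce the table. The column names follow the table, and the series names, the number of time points and the positions of the `NA`s are assumptions.

``` r
# One row per series per time point
set.seed(4)
dat <- expand.grid(
  series = paste0('species_', 1:4),   # series indicator, stored as a factor
  time = 1:2,                         # time indicator
  stringsAsFactors = TRUE
)
n <- nrow(dat)

dat$y <- rpois(n, lambda = 1)         # response
dat$y[c(5, 6)] <- NA                  # missing observations stay in as NA rows
dat$x0 <- round(rnorm(n), 2)          # a numeric predictor
dat$x1 <- factor(ifelse(dat$series %in% c('species_1', 'species_2'), 'A', 'B'))
dat$x2 <- round(rnorm(n), 2)          # another numeric predictor

# Order the columns as in the table
dat <- dat[, c('y', 'series', 'time', 'x0', 'x1', 'x2')]
str(dat)
```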

---
class: inverse middle center big-subsection

# Examples

---

background-image: url('./resources/pp_image.jpg')
background-size: cover
background-color: #77654E

---

## The data structure
<br>

``` r
dplyr::glimpse(model_data)
```

```
## Rows: 80
## Columns: 7
## $ series      <fct> PP, PP, PP, PP, PP, PP, PP, PP, PP, PP, PP, PP, PP, PP, PP…
## $ year        <fct> 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004…
## $ time        <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
## $ count       <int> 0, NA, 0, 1, 7, 7, 8, 8, 4, NA, 0, 0, 0, 0, 0, 0, NA, 2, 4…
## $ mintemp     <dbl> -0.79633807, -1.33471597, -1.24166462, -1.08048145, -0.424…
## $ ndvi_ma12   <dbl> -0.172144125, -0.237363477, -0.212120638, -0.160438125, -0…
## $ year_factor <fct> 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004, 2004…
```

---
## The observations
.panelset[
.panel[.panel-name[Code]

``` r
# use mvgam's plot utility to view properties of the observations
plot_mvgam_series(data = model_data, y = 'count')
```
]
]
---

## Yearly heterogeneity
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />

---
## Yearly random intercepts

``` r
year_random <- mvgam(
  count ~ 
*   s(year, bs = 're') - 1,
  family = poisson(),
  data = model_data,
  trend_model = 'None'
)
```

Random effect basis in `mgcv` language

Global intercept suppressed

---

## Estimated yearly intercepts
.panelset[
.panel[.panel-name[Code]

``` r
# plot the random effect posterior estimates
plot_predictions(year_random, 
                 condition = 'year',
                 type = 'link')
```
]
]

---
## Population parameters
.panelset[
.panel[.panel-name[Code]

``` r
# use bayesplot utilities to plot parameter estimates
mcmc_plot(year_random, 
          type = 'areas',
          variable = c('mean(year)', 'sd(year)'))
```

]

.panel[.panel-name[Plot]
.center[![](lecture_2_slidedeck_files/figure-html/year_random_mcmcplot-1.svg)]

]
]

---

## Hindcast predictions
.panelset[
.panel[.panel-name[Code]

``` r
# use mvgam's plot to view hindcast predictions
plot(hindcast(year_random))
```
]
]
---

## `mvgam` with yearly smooth

``` r
# convert year from a factor to a numeric variable
model_data <- dplyr::mutate(
  model_data, 
  year = as.numeric(as.character(year)))

year_smooth <- mvgam(
  count ~ 
*   s(year, bs = 'tp', k = 7),
  family = poisson(),
  data = model_data,
  trend_model = 'None'
)
```

A thin plate regression spline of the numeric `year` variable

Retain intercept because smooths are zero-centered
---

## Coefficients uninterpretable
<br>

``` r
rownames(coef(year_smooth))
```

```
## [1] "(Intercept)" "s(year).1"   "s(year).2"   "s(year).3"   "s(year).4"  
## [6] "s(year).5"   "s(year).6"
```

We .emphasize[*must*] use predictions and plots to understand the model

---

## Estimated yearly smooth
.panelset[
.panel[.panel-name[Code]

``` r
# plot the smooth effect posterior estimates
plot_predictions(year_smooth,
                 condition = 'year',
                 type = 'link')
```
]
]

---
## Conditional predictions
.panelset[
.panel[.panel-name[Code]

``` r
# use marginaleffects utilities to plot conditional predictions
plot_predictions(year_smooth, 
                 condition = 'year',
                 points = 0.5)
```

]
]
---

## Hindcast predictions
.panelset[
.panel[.panel-name[Code]

``` r
# use mvgam's plot to view hindcast predictions
plot(hindcast(year_smooth))
```
]
]
---
class: middle center
### Forecasts will differ. Why?
<br>
### We will explore this further in the tutorial and in the next lecture
<br>
### But how do model diagnostics look?

---

## Random year diagnostics
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-38-1.png" style="display: block; margin: auto;" />

---

## Smooth year diagnostics
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-39-1.png" style="display: block; margin: auto;" />

---

class: middle center
### Randomized quantile residuals show evidence of unmodelled autocorrelation and seasonality
<br>
### How can we deal with the seasonality?
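
---

## Randomized quantile residuals in code

A sketch of how randomized quantile residuals can be computed by hand for a Poisson model, using base R and a simulated GLM. This illustrates the general technique only and is not a reproduction of the diagnostic code used for the plots in this lecture. The data and parameter values are assumptions.

``` r
# Randomized quantile residuals for a Poisson GLM
set.seed(17)
n   <- 500
x   <- rnorm(n)
y   <- rpois(n, exp(0.5 + 0.6 * x))
fit <- glm(y ~ x, family = poisson())
mu  <- fitted(fit)

# For discrete data, draw a uniform value between the CDF just below
# each observation and the CDF at the observation ...
lower <- ppois(y - 1, mu)
upper <- ppois(y, mu)
u     <- runif(n, min = lower, max = upper)

# ... then map it to the standard Normal scale
rq_resid <- qnorm(u)

# For an adequate model these resemble Normal(0, 1) noise
c(mean = mean(rq_resid), sd = sd(rq_resid))
```

For a time series, `acf(rq_resid)` is a quick way to look for leftover autocorrelation in residuals like these.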

---

## Adding a smooth of mintemp

``` r
year_temp_smooth <- mvgam(
  count ~ 
    s(year, bs = 'tp', k = 7) +
*   s(mintemp, bs = 'tp', k = 7),
  family = poisson(),
  data = model_data,
  trend_model = 'None'
)
```

A thin plate regression spline of `mintemp`
---

## Estimated smooths
.panelset[
.panel[.panel-name[Code]

``` r
# use gratia's draw() to view both smooth functions
library(gratia)
draw(year_temp_smooth)
```
]

.panel[.panel-name[Plot]
.center[![](lecture_2_slidedeck_files/figure-html/year_temp_smooth_est-1.svg)]

]
]
---

## Diagnostics
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-43-1.png" style="display: block; margin: auto;" />

---

class: middle center
### Randomized quantile residuals still show evidence of unmodelled autocorrelation
<br>
### How can we deal with this?

---
## A smooth of time

``` r
temp_time_smooth <- mvgam(
  count ~ 
    s(mintemp, bs = 'tp', k = 7) +
*   s(time, bs = 'tp', k = 50),
  family = poisson(),
  data = model_data
)
```

Replace the spline of `year` with a complex spline of `time` to capture autocorrelation
---

## Updated smooths
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-46-1.png" style="display: block; margin: auto;" />

---

## Diagnostics
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-47-1.png" style="display: block; margin: auto;" />

---

## Hindcast predictions
<img src="lecture_2_slidedeck_files/figure-html/unnamed-chunk-48-1.png" style="display: block; margin: auto;" />

---

class: middle center
### Using an *additive* combination of smooth functions, we have captured a lot of the variation in the observed data
<br>
### But we are dealing with a time series, so we'd like our model to generate sensible forecast predictions
<br>
### As we'll see in the next lecture, this one has some problems
---

## In the next lecture, we will cover

Extrapolating splines

Latent autoregressive processes

Latent Gaussian Processes

Dynamic coefficient models