The data and modelling objects created in this notebook can be downloaded directly to save computational time.
Users who wish to complete the exercises can download a small
template R
script. Assuming you have already downloaded the
data objects above, this script will load all data objects so that the
steps used to create them are not necessary to tackle the exercises.
This tutorial relates to content covered in Lecture 4, and relies on the following packages for manipulating data, shaping time series, fitting dynamic regression models and plotting:
library(dplyr)
library(mvgam)
library(marginaleffects)
library(tidybayes)
library(ggplot2); theme_set(theme_classic())
The data we will use for this tutorial is spawner-recruit data
for sockeye salmon (Oncorhynchus nerka) from the Bristol Bay
region in Southwest Alaska. Commercial fishermen in Alaska target this
important species using seines and gillnets for fresh or frozen fillet
sales and canning. Records for the rivers in Bristol Bay span the years
1952-2005, and are maintained within the RAM Legacy Stock
Assessment Database. Spawner estimates originated from Bristol Bay
Area Annual Management Reports from the Alaska Department of Fish and
Game, and are based on in-river spawner counts (including both tower
counts and aerial surveys). Age-specific catch rates were calculated
from records of catch and escapement according to age class. These data
were used to determine the number of recruits in each year.
Below we process the data, keeping observations for a single river
and taking \(log(spawners_{t})\) to use
as the offset in our models. We also retain two possibly important
covariates, pdo_winter_t2
and pdo_summer_t2
.
These variables, representing the Pacific Decadal Oscillation (PDO)
during the salmon’s first year in the ocean (i.e. 2 years prior to
sampling), are widely believed to reflect “bottlenecks” to salmon
survival in their first year.
# access the data from the ATSA Github site
load(url('https://github.com/atsa-es/atsalibrary/raw/master/data/sockeye.RData'))
sockeye %>%
# Remove any previous grouping structures
dplyr::ungroup() %>%
# select the target river
dplyr::filter(region == "KVICHAK") %>%
# filter the data to only include years when number of spawners
# was estimated, as this will be used as the offset
dplyr::filter(!is.na(spawners)) %>%
# log the spawners variable to use as an appropriate offset
dplyr::mutate(log_spawners = log(spawners)) %>%
# add a 'time' variable
dplyr::mutate(time = dplyr::row_number()) %>%
# add a series variable (as a factor)
dplyr::mutate(series = as.factor('sockeye')) %>%
# keep variables of interest
dplyr::select(time, series, log_spawners,
recruits, pdo_summer_t2, pdo_winter_t2) -> model_data
The data now contain six variables:
time, the indicator of which time step each observation belongs to
series, a factor indexing which time series each observation belongs to
log_spawners, the logged number of recorded sockeye spawners (in thousands), measured as the number of returning spawners minus catch at each timepoint, which will be used as an offset in models
recruits, the response variable representing the number of estimated sockeye recruits at each timepoint (in thousands)
pdo_summer_t2 and pdo_winter_t2, the PDO covariates described above (measured two years prior to sampling)
Check the data structure
head(model_data)
## # A tibble: 6 × 6
## time series log_spawners recruits pdo_summer_t2 pdo_winter_t2
## <int> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 1 sockeye 9.15 39000 -0.5 -0.31
## 2 2 sockeye 7.95 4090 -2.36 -1.78
## 3 3 sockeye 6.28 289 -1.48 -2.14
## 4 4 sockeye 6.52 547 0.58 0
## 5 5 sockeye 9.59 55300 1.05 0.68
## 6 6 sockeye 8.22 3500 -0.22 0.77
dplyr::glimpse(model_data)
## Rows: 50
## Columns: 6
## $ time <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
## $ series <fct> sockeye, sockeye, sockeye, sockeye, sockeye, sockeye, so…
## $ log_spawners <dbl> 9.152711, 7.951559, 6.282267, 6.522093, 9.588777, 8.2187…
## $ recruits <dbl> 39000, 4090, 289, 547, 55300, 3500, 5360, 1120, 5750, 45…
## $ pdo_summer_t2 <dbl> -0.50, -2.36, -1.48, 0.58, 1.05, -0.22, 0.01, -0.73, -1.…
## $ pdo_winter_t2 <dbl> -0.31, -1.78, -2.14, 0.00, 0.68, 0.77, 0.07, -0.42, -0.8…
Visualize the data as a heatmap to get a sense of where any
NA
s are distributed (NA
s are shown as red bars
in the below plot)
image(
is.na(t(model_data %>%
dplyr::arrange(dplyr::desc(time)))), axes = F,
col = c('grey80', 'darkred')
)
axis(3,
at = seq(0,1, len = NCOL(model_data)),
labels = colnames(model_data))
Plot properties of the time series, which indicate some
overdispersion in the recruit count values that we’ll have to address
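One way to do this is with mvgam's plot_mvgam_series() utility, shown here as a quick sketch (series = 1 is the default and refers to the only series in these data):
# plot some standard properties of the outcome series
plot_mvgam_series(
  data = model_data,
  y = 'recruits',
  series = 1
)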
We will now fit a number of models to these data with mvgam to showcase the various methods that are available to us for evaluating and comparing dynamic GLMs / GAMs. We begin with a model that does not include a dynamic component, as this is always a useful starting point. It also allows us to highlight another useful feature of GLMs / GAMs when modelling count data: we can include an offset variable that allows us to model rates of interest.
Sometimes it is more relevant to model the rate at which an event occurs per
some underlying unit. For example, in epidemiological modelling the
interest is often in understanding the number of disease cases per
population. But we never get to observe this rate directly. An
offset allows us to model the underlying rate by algebraically accounting for variation in the number of
possible counts in each observation unit. This works naturally in
models that use a log link function, such as the Poisson and Negative
Binomial models. Including an offset is usually straightforward in most
R
modelling software, and indeed this is the case in
mvgam
as well. Here we fit a model with no trend and
Poisson observations, including an offset of the
log_spawners
variable that we calculated above so we can
estimate the rate of recruitment per spawner. Our first model
includes smooth functions of pdo_winter_t2
and
pdo_summer_t2
(including an interaction between the two),
along with an offset of log_spawners
in the linear
predictor. As in previous tutorials, only 250 samples per chain are used
to keep the size of the resulting object small (though we’d always want
to use more samples in a real analysis). Note that this is not a classic
density-dependent stock-recruitment model such as the Ricker or Beverton-Holt models. Rather, this model
very simplistically assumes recruits can increase without bound as a
function of the number of spawners.
model1 <- mvgam(
recruits ~ offset(log_spawners) +
s(pdo_winter_t2,
k = 3,
bs = 'cr') +
s(pdo_summer_t2,
k = 3,
bs = 'cr') +
ti(pdo_winter_t2,
pdo_summer_t2,
k = 3,
bs = 'cr'),
family = poisson(),
trend_model = 'None',
data = model_data,
samples = 250
)
The model summary is shown below, which indicates there are no major sampling issues to worry about. It also suggests that the interaction between the two covariates is highly wiggly and important in the model (as are the main effects).
summary(model1)
## GAM formula:
## recruits ~ offset(log_spawners) + s(pdo_winter_t2, k = 3, bs = "cr") +
## s(pdo_summer_t2, k = 3, bs = "cr") + ti(pdo_winter_t2, pdo_summer_t2,
## k = 3, bs = "cr")
##
## Family:
## poisson
##
## Link function:
## log
##
## Trend model:
## None
##
## N series:
## 1
##
## N timepoints:
## 50
##
## Status:
## Fitted using Stan
## 4 chains, each with iter = 1000; warmup = 750; thin = 1
## Total post-warmup draws = 1000
##
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n_eff
## (Intercept) -1.6 -1.5 -1.5 1 298
## s(pdo_winter_t2).1 6.1 6.2 6.3 1 374
## s(pdo_winter_t2).2 -1.6 -1.6 -1.5 1 549
## s(pdo_summer_t2).1 5.0 5.1 5.2 1 345
## s(pdo_summer_t2).2 -5.8 -5.7 -5.7 1 397
## ti(pdo_winter_t2,pdo_summer_t2).1 5.4 5.5 5.6 1 428
## ti(pdo_winter_t2,pdo_summer_t2).2 27.0 27.0 28.0 1 380
## ti(pdo_winter_t2,pdo_summer_t2).3 12.0 12.0 13.0 1 474
## ti(pdo_winter_t2,pdo_summer_t2).4 16.0 16.0 16.0 1 308
##
## Approximate significance of GAM smooths:
## edf Ref.df Chi.sq p-value
## s(pdo_winter_t2) 1.686 2 18.3 <2e-16 ***
## s(pdo_summer_t2) 0.984 2 26.6 <2e-16 ***
## ti(pdo_winter_t2,pdo_summer_t2) 3.516 4 28.0 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Stan MCMC diagnostics:
## ✔ No issues with effective samples per iteration
## ✔ Rhat looks good for all parameters
## ✔ No issues with divergences
## ✔ No issues with maximum tree depth
##
## Samples were drawn using sampling(hmc). For each parameter, n_eff is a
## crude measure of effective sample size, and Rhat is the potential scale
## reduction factor on split MCMC chains (at convergence, Rhat = 1)
##
## Use how_to_cite() to get started describing this model
The inclusion of the offset is important if we wish to make
predictions on the scale of the observations. This is because there will
be no regression coefficient assigned to this variable (its coefficient
is fixed at 1). If you wish to manually make predictions, you’ll need to
take care to include the offset. This is returned as an attribute of the
linear predictor matrix when we call predict.gam
on the
underlying mgcv
model:
lpmatrix <- predict(model1$mgcv_model, type = 'lpmatrix')
attr(lpmatrix, 'model.offset')
## [1] 9.152711 7.951559 6.282267 6.522093 9.588777 8.218787 7.855545
## [8] 5.826000 6.863803 10.098232 8.237479 8.077137 7.847763 9.034796
## [15] 9.539644 7.779049 6.917706 5.424950 8.396155 9.480368 7.585789
## [22] 7.200425 8.330864 9.323669 9.769956 7.467371 7.038784 8.180321
## [29] 9.259131 8.883224 7.073270 8.711114 8.311398 9.026418 8.849371
## [36] 8.347590 8.461680 8.301522 9.028818 9.210340 7.279319 7.313220
## [43] 7.740664 8.732305 7.512071 7.003065 6.556778 7.432484 8.612503
## [50] 7.749322
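For instance, here is a minimal sketch of making manual predictions that include the offset (assuming that as.data.frame() with variable = 'betas' returns the observation-model coefficients in the same column order as the lpmatrix; the column names are worth checking):
# extract posterior draws of the observation-model coefficients
betas <- as.matrix(as.data.frame(model1, variable = 'betas'))

# the offset that must be added back onto the linear predictor
offsets <- attr(lpmatrix, 'model.offset')

# reconstruct the linear predictor (log scale) for every posterior draw,
# adding the offset to each timepoint
manual_linpred <- betas %*% t(lpmatrix) +
  matrix(offsets,
         nrow = NROW(betas),
         ncol = length(offsets),
         byrow = TRUE)

# expected recruits per timepoint for each draw
manual_mu <- exp(manual_linpred)
dim(manual_mu)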
mvgam
makes this process straightforward via its predict.mvgam() method, which is then used by many other functions
in the package including plot_predictions()
,
draw()
, hindcast()
and
forecast()
. It is also used by
conditional_effects()
, which is one of the first functions
you will likely use after fitting a model in mvgam
. This
function acts as a wrapper to the more flexible
plot_predictions()
and can give a quick overview of the
“main” effects in a model’s regression formulae. It displays all
combinations up to two-way interactions. See
?conditional_effects.mvgam
for guidance.
conditional_effects(
model1,
type = 'link'
)
I usually recommend you use type = 'link'
or
type = 'expected'
when plotting these, as the default
type = 'response'
can sometimes make it hard to see
patterns for certain response families. But because we used a Poisson
family, the patterns won’t change too much between the default
type = 'response'
and type = 'expected'
(apart
from wider uncertainty bands):
conditional_effects(
model1,
type = 'expected'
)
conditional_effects(
model1,
type = 'response'
)
Although conditional_effects()
is great for getting a
quick overview, you will often need to move to
plot_predictions()
, plot_slopes()
, and various
comparisons()
functions so that you can get the key
combinations of predictor values that you are after. For example, we can
plot the conditional effects of the two covariates using
plot_predictions()
. This function will internally call the
datagrid()
function from marginaleffects
to
generate a data grid of user-specified values to be supplied as
newdata
to mvgam
’s predict()
function. By default, any other numeric covariates will be set to
0
, and any other factor variables will be set to their
first levels. To get a sense of what this looks like, let’s generate a
few grids. First, we will make a grid using the minimum and maximum
values of pdo_winter_t2
. Notice how all other values in the
grid are identical across the rows of the grid:
datagrid(
pdo_winter_t2 = range,
model = model1
)
## log_spawners pdo_summer_t2 time series pdo_winter_t2 rowid
## 1 8.067752 -0.3032 26 sockeye -2.14 1
## 2 8.067752 -0.3032 26 sockeye 1.43 2
We can make this grid as simple or as fancy as we want. Here we will
provide a set sequence of values for pdo_winter_t2
:
datagrid(
pdo_winter_t2 = c(-2, -1, 0, 1, 2),
model = model1
)
## log_spawners pdo_summer_t2 time series pdo_winter_t2 rowid
## 1 8.067752 -0.3032 26 sockeye -2 1
## 2 8.067752 -0.3032 26 sockeye -1 2
## 3 8.067752 -0.3032 26 sockeye 0 3
## 4 8.067752 -0.3032 26 sockeye 1 4
## 5 8.067752 -0.3032 26 sockeye 2 5
This same logic can easily be extended to multiple predictors:
datagrid(
pdo_winter_t2 = c(-2, -1, 0, 1, 2),
pdo_summer_t2 = c(-3, 3),
model = model1
)
## log_spawners time series pdo_winter_t2 pdo_summer_t2 rowid
## 1 8.067752 26 sockeye -2 -3 1
## 2 8.067752 26 sockeye -2 3 2
## 3 8.067752 26 sockeye -1 -3 3
## 4 8.067752 26 sockeye -1 3 4
## 5 8.067752 26 sockeye 0 -3 5
## 6 8.067752 26 sockeye 0 3 6
## 7 8.067752 26 sockeye 1 -3 7
## 8 8.067752 26 sockeye 1 3 8
## 9 8.067752 26 sockeye 2 -3 9
## 10 8.067752 26 sockeye 2 3 10
In the above, you can see how we get a grid with every possible
combination of predictors. See
?marginaleffects::datagrid
for details of some of the other
things you can do when making these fake data grids. When we call
plot_predictions()
and pass variables to the
condition
or by
arguments, a
datagrid
is being created internally to automatically allow
us to inspect the kinds of effects we’d like to look at. For example,
supplying one of the two covariates to the condition
argument will set up a grid using a fine sequence of fake values for the
covariate of interest, while setting the other covariate to zero. This
allows us to produce a smooth plot of predictions across the range of
observed values for the specified covariate:
plot_predictions(
model1,
condition = 'pdo_winter_t2',
type = 'expected'
)
plot_predictions(
model1,
condition = 'pdo_summer_t2',
type = 'expected'
)
Supplying more than one covariate to the condition
argument will set up more in-depth visualisations:
plot_predictions(
model1,
condition = c('pdo_summer_t2',
'pdo_winter_t2'),
type = 'expected'
)
Alternatively, we can supply covariates to the by
argument. This will use only those values that were actually observed
in the original data, rather than making a sequence of fake values. This
can be very handy for looking at hindcast distributions from some of our
models by passing time
to the by
argument.
Adding a non-zero value for the points
argument will show
the observed data points
plot_predictions(
model1,
by = 'time',
points = 0.5
)
The plot_slopes()
function is also highly useful, in
this case for understanding how a nonlinear effect changes over the
space of a covariate (i.e. what is the first derivative of the function
over a grid of covariate values).
plot_slopes(
model1,
variables = 'pdo_summer_t2',
condition = 'pdo_summer_t2',
type = 'link'
) +
geom_hline(yintercept = 0,
linetype = 'dashed')
There are many more extended uses of predict.mvgam()
.
For example, we can compute how many more recruits this model
would expect to see if there were 5,000 spawners recorded vs if there
were only 3,000 spawners recorded. We will take full advantage of the
avg_comparisons()
function from
marginaleffects
to do this. As with
plot_predictions()
, this function will internally set up a
datagrid
that will be passed to
predict.mvgam()
to capture exactly what we are after.
Specifically, this datagrid
will fix the covariates to zero
so that the two scenarios (3,000 spawners vs 5,000 spawners) are
directly comparable. It will then make a comparison between
these predictions, and it will do this many times using draws from the
model’s Bayesian posterior distribution. The default comparison is to
take the difference between the two scenarios. See
?marginaleffects::avg_comparisons
for details.
# take draws from the model that compute the average comparison between
# spawners recorded at 5,000 vs 3,000 (be sure to log first!)
post_contrasts <- avg_comparisons(
model1,
variables = list(log_spawners = log(c(3000, 5000))),
type = 'expected'
) %>%
posterior_draws()
head(post_contrasts)
## drawid draw term contrast estimate
## 1 1 4438.660 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## 2 2 4433.804 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## 3 3 4435.292 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## 4 4 4444.133 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## 5 5 4427.345 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## 6 6 4431.674 log_spawners 8.51719319141624 - 8.00636756765025 4430.429
## conf.low conf.high predicted_lo predicted_hi predicted tmp_idx
## 1 4415.562 4445.988 7505.689 12509.48 23617.9 1
## 2 4415.562 4445.988 7505.689 12509.48 23617.9 1
## 3 4415.562 4445.988 7505.689 12509.48 23617.9 1
## 4 4415.562 4445.988 7505.689 12509.48 23617.9 1
## 5 4415.562 4445.988 7505.689 12509.48 23617.9 1
## 6 4415.562 4445.988 7505.689 12509.48 23617.9 1
Each row of the returned object gives us one posterior draw of this
comparison. We can plot the distribution of these comparisons using the
tidybayes
package:
# use the resulting posterior draw object to plot a density of the
# posterior contrasts
post_contrasts %>%
ggplot(aes(x = draw)) +
# use the stat_halfeye function from tidybayes for a nice visual
stat_halfeye(fill = "#C79999") +
labs(
x = "(spawners = 5,000) − (spawners = 3,000)",
y = "Density",
title = "Average posterior contrast"
) +
theme_classic()
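If a numerical summary is preferred over a density plot, the same posterior draws can be summarised directly (the draw column holds one posterior draw of the contrast per row):
# summarise the posterior contrast numerically
post_contrasts %>%
  dplyr::summarise(
    median = median(draw),
    lower_95 = quantile(draw, 0.025),
    upper_95 = quantile(draw, 0.975)
  )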
Posterior predictive checks are useful for understanding how well
your model can capture particular features of your observed data. In
mvgam
, the pp_check()
function is particularly
helpful as it relies on the very broad range of checking functions that
are available in bayesplot
. For example, we can plot a
density kernel of our observed counts and overlay simulated kernels from
our model. These simulated kernels are generated using
predict.mvgam()
, so they are informative for all
mvgam
models (regardless of whether or not the model
included dynamic components).
pp_check(
model1,
type = 'dens_overlay',
ndraws = 100
)
This model is not really capturing the shape of the observed data. We can confirm this behaviour by comparing observed vs simulated CDFs. What else might we need to do to improve this model?
pp_check(
model1,
type = 'ecdf_overlay',
ndraws = 100
)
There are many other types of posterior predictive checks that can be
implemented, and it is highly recommended that you design some to suit
your own specific needs (i.e., what aspect(s) of the data are you most
concerned with your model being able to adequately capture?). See
?mvgam::pp_check.mvgam
for details of other kinds of plots you
can make.
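As a sketch of what a more targeted check might look like, we can compute a custom summary statistic over posterior simulations and compare it to the same statistic in the observed data using bayesplot directly (the 20,000-recruit threshold below is purely illustrative, and the observed vector is subset because bayesplot does not allow NAs in y):
library(bayesplot)

# posterior simulations of the response; rows are draws, columns are timepoints
yrep <- predict(model1, type = 'response', summary = FALSE)

# keep only timepoints with non-missing recruit counts
obs_idx <- which(!is.na(model_data$recruits))

# a custom statistic: the proportion of very large counts
prop_big <- function(x) mean(x > 20000)

# compare the observed statistic to its posterior predictive distribution
ppc_stat(
  y = model_data$recruits[obs_idx],
  yrep = yrep[, obs_idx],
  stat = prop_big
)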
Of course we should look more closely at the residuals for any unmodelled temporal autocorrelation
plot(
model1,
type = 'residuals'
)
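A complementary sketch is to inspect the autocorrelation function of the summarised residuals directly (this assumes the first column of the residuals() output holds the posterior expectation of each Dunn-Smyth residual):
# extract summarised Dunn-Smyth residuals and check their autocorrelation
resid_summary <- residuals(model1)
acf(resid_summary[, 1],
    na.action = na.pass,
    main = 'ACF of Dunn-Smyth residuals')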
Back to the modelling strategy. There are still some patterns in the residuals, including substantial autocorrelation and an apparent mis-specification of the dispersion in the data. We need a dynamic model to capture this autocorrelation, so we will add a latent AR(1) process. But we also need to think about the dispersion in the data: a Negative Binomial observation model may be better here, as it can accommodate counts that are more dispersed than a Poisson allows.
model2 <- mvgam(
recruits ~
offset(log_spawners) +
s(pdo_winter_t2,
k = 3,
bs = 'cr') +
s(pdo_summer_t2,
k = 3,
bs = 'cr') +
ti(pdo_winter_t2,
pdo_summer_t2,
k = 3,
bs = 'cr'),
family = nb(),
trend_model = AR(),
noncentred = TRUE,
data = model_data
)
Because this model has a dynamic trend component, it is worth quickly
revisiting the differences between the predict.mvgam()
function and the hindcast.mvgam()
/
forecast.mvgam()
functions. First, we investigate how
ignoring the uncertainty in the trend component changes the predictions.
In this step, we calculate predictions for the observed time period in
two ways: first we average over the uncertainty from the latent trend
(using process_error = TRUE
). This will essentially give us
a prediction where we include the random effect of the latent process
model, but assume that we have a different possible sample from that
random effect distribution at each time point. Second, we use
process_error = FALSE
to give predictions that essentially
ignore the random effects from the latent process. Plotting both of
these together shows how our prediction uncertainty changes when we
include the uncertainty in the dynamic trend component. A crucial detail
to understand here is that the predict.mvgam() function only uses
the covariates from the model formulae, so in this case it does not
account for the actual ‘times’ of the data. Rather, it assumes the
dynamic process has reached stationarity and simply samples from the
stationary distribution for the trend: \(z_t \sim Normal(0, \sigma_{error})\)
# calculate predictions for the training data by including
# the dynamic process uncertainty without estimating the actual
# latent states
ypreds <- predict(
model2,
type = 'response',
process_error = TRUE,
summary = FALSE
)
# calculate 95% empirical quantiles of predictions
cred <- sapply(
1:NCOL(ypreds),
function(n)
quantile(ypreds[,n],
probs = c(0.025, 0.975),
na.rm = T)
)
pred_vals <- 1:NCOL(ypreds)
# graphical plotting parameters
layout(matrix(1:2, ncol = 2))
par(mar = c(2.5, 2.3, 2, 2),
oma = c(1, 1, 0, 0),
mgp = c(1.5, 0.5, 0))
# plot credible intervals
plot(
1,
type = "n",
bty = 'L',
xlab = 'Time',
ylab = 'predict() predictions',
xlim = c(min(pred_vals),
max(pred_vals)),
ylim = c(min(cred, na.rm = T),
max(cred, na.rm = T))
)
polygon(
c(pred_vals, rev(pred_vals)),
c(cred[1,], rev(cred[2,])),
col = 'darkred',
border = NA
)
# now repeat, but this time ignore uncertainty in the dynamic component
ypreds2 <- predict(
model2,
type = 'response',
process_error = FALSE,
summary = FALSE
)
cred2 <- sapply(
1:NCOL(ypreds),
function(n)
quantile(ypreds2[,n],
probs = c(0.025, 0.975),
na.rm = T)
)
# overlay these predictions as a lighter colour
polygon(
c(pred_vals, rev(pred_vals)),
c(cred2[1,], rev(cred2[2,])),
col = "#DCBCBC",
border = NA
)
# overlay the actual observations as well
points(
model_data$recruits,
pch = 16,
cex = 0.8,
col = 'white'
)
points(
model_data$recruits,
pch = 16, cex = 0.65
)
box(bty = 'l', lwd = 2)
# now plot the actual hindcasts, which did not assume the process was
# stationary but instead modelled the latent states over the training time.
# use the same y-axis limits to see how these plots differ
plot_mvgam_fc(
model2,
ylim = c(min(cred, na.rm = T),
max(cred, na.rm = T)),
ylab = 'hindcast() predictions'
)
layout(1)
As you can see, the predictions differ. The
plot_mvgam_fc()
function, which uses the same prediction
steps as hindcast()
and forecast()
functions,
uses the actual estimates of the latent temporal states. These
predictions will nearly always “fit the data better”, but they might not
be as useful for scenario-planning. Sometimes we’d rather get a sense of
how predictions might change on average if some external
conditions changed, for example if temperature were to increase by x
degrees or if we were to sample from a new group for a random effect.
For exploring these types of scenarios, the prediction-based functions
(i.e. predict()
, plot_predictions()
,
conditional_effects()
, pp_check()
etc…) are
more useful because they do not use the actual estimates of the
latent states.
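As a small illustrative sketch of such a scenario question (the PDO values 0 and 1 are chosen purely for illustration), we could ask how average expected recruitment differs between two hypothetical summer PDO states using the prediction machinery:
# compare average expected recruits under two hypothetical values of
# pdo_summer_t2, averaging over the posterior as before
avg_comparisons(
  model2,
  variables = list(pdo_summer_t2 = c(0, 1)),
  type = 'expected'
)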
Exercises
1. Plot a histogram of model1's posterior intercept coefficients \(\alpha\). Use ?mvgam::as.data.frame.mvgam for guidance
2. Plot recruits vs log_spawners using the supplied model_data. Take a few notes on whether you think our primary modelling assumption, that log(recruits) scales linearly with log_spawners, is justified.
# Replace the ? with the correct value(s)
# check names of variables that can be extracted
vars_available <- variables(model1)
# we are interested in the observation model coefficients
vars_available$observation_betas
# extract the intercepts using the as.data.frame.mvgam function
# you will have two options for extracting this variable. first, you
# can use the alias
alphas <- as.data.frame(model1, variable = ?)
# alternatively, use the original name
alphas <- as.data.frame(model1, variable = ?)
# plot a histogram
hist(alphas[,1], xlab = expression(alpha),
main = '', col = 'darkred', border = 'white')
Now on to the topic of model comparisons. Plot Dunn-Smyth residuals for this dynamic model to see that there is no more autocorrelation
plot(
model2,
type = 'residuals'
)
However, we must be cautious when interpreting these residuals because the AR1 trend model is extremely flexible. The in-sample fits can sometimes be nearly perfect, meaning the residuals will very often look excellent! We can confirm this behaviour by inspecting hindcasts
hcs <- hindcast(model2)
plot(hcs)
## No non-missing values in test_observations; cannot calculate forecast score
But PPCs can still give valid insights into model inadequacy:
pp_check(
model2,
type = 'dens_overlay',
ndraws = 100
)
pp_check(
model2,
type = 'ecdf_overlay',
ndraws = 100
)
These plots show that the model clearly gives a better fit to the data compared to the Poisson model above. But there are some more pressing matters we need to address with this model. Inspect the summary to see what I mean:
summary(model2)
## GAM formula:
## recruits ~ offset(log_spawners) + s(pdo_winter_t2, k = 3, bs = "cr") +
## s(pdo_summer_t2, k = 3, bs = "cr") + ti(pdo_winter_t2, pdo_summer_t2,
## k = 3, bs = "cr")
##
## Family:
## negative binomial
##
## Link function:
## log
##
## Trend model:
## AR()
##
##
## N series:
## 1
##
## N timepoints:
## 50
##
## Status:
## Fitted using Stan
## 4 chains, each with iter = 1000; warmup = 500; thin = 1
## Total post-warmup draws = 2000
##
##
## Observation dispersion parameter estimates:
## 2.5% 50% 97.5% Rhat n_eff
## phi[1] 3.9 12 47 1.03 137
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n_eff
## (Intercept) 0.21 0.560 0.89 1.02 391
## s(pdo_winter_t2).1 -0.51 -0.016 0.48 1.00 1162
## s(pdo_winter_t2).2 -0.44 0.027 0.48 1.00 866
## s(pdo_summer_t2).1 -0.64 -0.089 0.38 1.01 547
## s(pdo_summer_t2).2 -0.45 0.017 0.47 1.00 1011
## ti(pdo_winter_t2,pdo_summer_t2).1 -0.66 -0.038 0.49 1.00 507
## ti(pdo_winter_t2,pdo_summer_t2).2 -0.66 -0.015 0.52 1.00 561
## ti(pdo_winter_t2,pdo_summer_t2).3 -0.49 0.030 0.66 1.00 323
## ti(pdo_winter_t2,pdo_summer_t2).4 -0.41 0.093 0.87 1.01 417
##
## Approximate significance of GAM smooths:
## edf Ref.df Chi.sq p-value
## s(pdo_winter_t2) 1.02e-04 2 0.02 0.99
## s(pdo_summer_t2) 2.40e-05 2 0.30 0.93
## ti(pdo_winter_t2,pdo_summer_t2) 1.24e-05 4 0.20 1.00
##
## Latent trend parameter AR estimates:
## 2.5% 50% 97.5% Rhat n_eff
## ar1[1] 0.25 0.63 0.97 1.01 450
## sigma[1] 0.47 0.68 0.94 1.03 153
##
## Stan MCMC diagnostics:
## ✔ No issues with effective samples per iteration
## ✖ Rhats above 1.05 found for some parameters
## Use pairs() and mcmc_plot() to investigate
## ✔ No issues with divergences
## ✔ No issues with maximum tree depth
##
## Samples were drawn using sampling(hmc). For each parameter, n_eff is a
## crude measure of effective sample size, and Rhat is the potential scale
## reduction factor on split MCMC chains (at convergence, Rhat = 1)
##
## Use how_to_cite() to get started describing this model
The HMC sampler is telling us there are some issues when trying to
implement both a flexible latent dynamic process AND an overdispersion
parameter. Both of these components are competing to capture the extra
dispersion in the observations, leading to low effective sample sizes
and poor convergence for the overdispersion parameter \(\phi\). As the summary()
suggests, it is useful to look at estimates for these parameters using
the diagnostic functions pairs()
and
mcmc_plot()
:
pairs(
model2,
variable = c('phi',
'sigma'),
regex = TRUE
)
The “funnel” shape in the top-right shows that \(\sigma\) can only be estimated freely when \(\phi\) gets a bit larger. As \(\phi\) becomes small, the Negative Binomial distribution has more flexibility to capture dispersion and so there is less need for a flexible latent process model. But when \(\phi\) becomes large, the Negative Binomial starts to behave like a Poisson and there is more need for a flexible latent process, hence the need for a larger \(\sigma\). This makes it challenging to estimate both simultaneously:
mcmc_plot(
model2,
variable = c('phi',
'sigma'),
regex = TRUE,
type = 'combo'
)
We now have several options:
Use containment priors to restrict one or both of the processes (i.e. pull the overdispersion parameter towards very large values to make the observations behave like a Poisson a priori; a brief sketch of this appears after this list)
Keep the Negative Binomial observation model and use a smooth trend, such as a Gaussian Process or a Piecewise Logistic
Drop the Negative Binomial observation model (using a Poisson instead) and use a flexible Autoregressive trend
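As a brief sketch of how you might start on the first option, get_mvgam_priors() can be used to see which priors are available to modify for this model specification; the returned object can then be edited (tightening the prior on the relevant overdispersion parameter, whose exact name should be taken from this output rather than assumed) and passed to the priors argument of mvgam():
# inspect which priors can be modified for this model specification
nb_ar_priors <- get_mvgam_priors(
  recruits ~ offset(log_spawners) +
    s(pdo_winter_t2, k = 3, bs = 'cr') +
    s(pdo_summer_t2, k = 3, bs = 'cr') +
    ti(pdo_winter_t2, pdo_summer_t2, k = 3, bs = 'cr'),
  family = nb(),
  trend_model = AR(),
  data = model_data
)

# view the default priors; edit the 'prior' column of the relevant row and
# supply the modified data.frame to mvgam(..., priors = ...)
nb_ar_priors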
Here we will illustrate the latter two options. First, we try the
Negative Binomial observation model with a smooth Piecewise Logistic
function for the trend. This trend model can be useful for population
dynamics as it follows one of ecology’s classic population growth
models. However, we must supply a value for the carrying capacity of the
trend process. Here we will assume the maximum value the logistic growth
trend could take is the same as the maximum value that the latent AR1
process from model2
took. Note that we also must suppress
the intercept in the formula because piecewise trends have their own
intercept parameters.
model_data$cap <- exp(
max(hindcast(model2,
type = 'trend')$hindcasts$sockeye)
)
model3 <- mvgam(
recruits ~
offset(log_spawners) +
s(pdo_winter_t2,
k = 3,
bs = 'cr') +
s(pdo_summer_t2,
k = 3,
bs = 'cr') +
ti(pdo_winter_t2,
pdo_summer_t2,
k = 3,
bs = 'cr') - 1,
family = nb(),
trend_model = PW(growth = 'logistic'),
data = model_data
)
Next we use a Poisson observation model with an AR1 trend
model4 <- mvgam(
recruits ~
offset(log_spawners) +
s(pdo_winter_t2,
k = 3,
bs = 'cr') +
s(pdo_summer_t2,
k = 3,
bs = 'cr') +
ti(pdo_winter_t2,
pdo_summer_t2,
k = 3,
bs = 'cr'),
family = poisson(),
trend_model = AR(),
noncentred = TRUE,
data = model_data
)
Neither of these models has any major issues with sampling, which is a good sign!
summary(model3)
## GAM formula:
## recruits ~ s(pdo_winter_t2, k = 3, bs = "cr") + s(pdo_summer_t2,
## k = 3, bs = "cr") + ti(pdo_winter_t2, pdo_summer_t2, k = 3,
## bs = "cr") + offset(log_spawners) - 1
##
## Family:
## negative binomial
##
## Link function:
## log
##
## Trend model:
## PW(growth = "logistic")
##
##
## N series:
## 1
##
## N timepoints:
## 50
##
## Status:
## Fitted using Stan
## 4 chains, each with iter = 1000; warmup = 500; thin = 1
## Total post-warmup draws = 2000
##
##
## Observation dispersion parameter estimates:
## 2.5% 50% 97.5% Rhat n_eff
## phi[1] 1.2 1.8 2.7 1 1237
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n_eff
## s(pdo_winter_t2).1 -0.67 -0.0770 0.40 1.00 1220
## s(pdo_winter_t2).2 -0.37 0.0790 0.62 1.00 769
## s(pdo_summer_t2).1 -0.74 -0.0820 0.44 1.00 744
## s(pdo_summer_t2).2 -0.50 -0.0072 0.49 1.00 815
## ti(pdo_winter_t2,pdo_summer_t2).1 -1.50 -0.0830 0.45 1.03 103
## ti(pdo_winter_t2,pdo_summer_t2).2 -2.00 -0.0330 0.58 1.04 92
## ti(pdo_winter_t2,pdo_summer_t2).3 -0.50 0.0700 1.60 1.04 105
## ti(pdo_winter_t2,pdo_summer_t2).4 -0.44 0.1000 1.80 1.03 89
##
## Approximate significance of GAM smooths:
## edf Ref.df Chi.sq p-value
## s(pdo_winter_t2) 1.30e-06 2 0.61 0.85
## s(pdo_summer_t2) 2.47e-07 2 0.28 0.95
## ti(pdo_winter_t2,pdo_summer_t2) 3.98e-06 4 0.13 1.00
##
## Latent trend growth rate estimates:
## 2.5% 50% 97.5% Rhat n_eff
## k_trend[1] -0.24 -0.12 -0.027 1 899
##
## Latent trend offset estimates:
## 2.5% 50% 97.5% Rhat n_eff
## m_trend[1] -2.2 -0.3 1.6 1 2250
##
## Stan MCMC diagnostics:
## ✔ No issues with effective samples per iteration
## ✔ Rhat looks good for all parameters
## ✔ No issues with divergences
## ✔ No issues with maximum tree depth
##
## Samples were drawn using sampling(hmc). For each parameter, n_eff is a
## crude measure of effective sample size, and Rhat is the potential scale
## reduction factor on split MCMC chains (at convergence, Rhat = 1)
##
## Use how_to_cite() to get started describing this model
summary(model4)
## GAM formula:
## recruits ~ offset(log_spawners) + s(pdo_winter_t2, k = 3, bs = "cr") +
## s(pdo_summer_t2, k = 3, bs = "cr") + ti(pdo_winter_t2, pdo_summer_t2,
## k = 3, bs = "cr")
##
## Family:
## poisson
##
## Link function:
## log
##
## Trend model:
## AR()
##
##
## N series:
## 1
##
## N timepoints:
## 50
##
## Status:
## Fitted using Stan
## 4 chains, each with iter = 1000; warmup = 500; thin = 1
## Total post-warmup draws = 2000
##
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n_eff
## (Intercept) 0.18 0.4900 0.82 1.01 301
## s(pdo_winter_t2).1 -0.52 -0.0100 0.47 1.00 1041
## s(pdo_winter_t2).2 -0.47 0.0300 0.47 1.00 1069
## s(pdo_summer_t2).1 -0.70 -0.1100 0.31 1.01 811
## s(pdo_summer_t2).2 -0.49 0.0043 0.48 1.00 645
## ti(pdo_winter_t2,pdo_summer_t2).1 -0.73 -0.0340 0.45 1.02 302
## ti(pdo_winter_t2,pdo_summer_t2).2 -0.77 -0.0073 0.56 1.02 219
## ti(pdo_winter_t2,pdo_summer_t2).3 -0.42 0.0490 0.77 1.01 260
## ti(pdo_winter_t2,pdo_summer_t2).4 -0.40 0.0870 0.93 1.02 241
##
## Approximate significance of GAM smooths:
## edf Ref.df Chi.sq p-value
## s(pdo_winter_t2) 0.003532 2 0.03 0.98
## s(pdo_summer_t2) 0.000275 2 0.42 0.90
## ti(pdo_winter_t2,pdo_summer_t2) 0.000134 4 0.72 0.99
##
## Latent trend parameter AR estimates:
## 2.5% 50% 97.5% Rhat n_eff
## ar1[1] 0.22 0.48 0.67 1.01 380
## sigma[1] 0.64 0.78 0.99 1.02 409
##
## Stan MCMC diagnostics:
## ✔ No issues with effective samples per iteration
## ✔ Rhat looks good for all parameters
## ✔ No issues with divergences
## ✖ 764 of 2000 iterations saturated the maximum tree depth of 10 (38.2%)
## Try a larger max_treedepth to avoid saturation
##
## Samples were drawn using sampling(hmc). For each parameter, n_eff is a
## crude measure of effective sample size, and Rhat is the potential scale
## reduction factor on split MCMC chains (at convergence, Rhat = 1)
##
## Use how_to_cite() to get started describing this model
But hindcasts and trend estimates are very different for the two models:
plot(
model3,
type = 'trend'
)
plot(
model4,
type = 'trend'
)
plot(
hindcast(model3)
)
## No non-missing values in test_observations; cannot calculate forecast score
plot(
hindcast(model4)
)
## No non-missing values in test_observations; cannot calculate forecast score
Here the extreme flexibility of the AR1 trend is apparent in the
hindcasts for model4
. Hopefully this gives you more of an
idea about why we can’t put too much faith in residual or fitted value
diagnostics for these flexible dynamic models. Ok so we cannot use
in-sample fits or residual diagnostics to guide reasonable comparisons
of these two models. It would be more appropriate to use 1-step ahead
residuals for interrogation. But computing these is extremely
computationally demanding. How then can we compare models? Using
loo_compare()
, we can ask which model would likely give
better predictions for new data within the training set. This
method of approximating leave-one-out prediction error (using a method
called Pareto Smoothed Importance Sampling (PSIS)) is widely used for
comparing Bayesian models, and is generally interpreted in the same way
that AIC comparisons can be interpreted:
loo_compare(
model3,
model4
)
## Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
## Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
## elpd_diff se_diff
## model3 0.0 0.0
## model4 -443317.4 68161.1
The Piecewise / Negative Binomial model is strongly favoured here,
yielding much higher values of the Expected Log Predictive Density (ELPD; also known as the
log score). This is one useful metric for choosing among models, but
it has a few downfalls. First, it is challenging to use PSIS-LOO for
time series models, particularly those that have flexible autoregressive
latent processes (like model4
has above). These
autoregressive processes can fit the training data very well, but when
we ask how they would generalize to new data within the training period
they can give poor predictions because they don’t have any past state
values to learn from. This is one reason why loo()
very
often gives warnings for these models. My recommendation would
be to only use loo()
and loo_compare()
when
comparing models that have different types of covariate effects, but
that do not have different types of dynamic trends. Once you’ve
selected your covariate model, then you can add an appropriate trend if
needed. Alternatively, you could instead compare them solely on their
abilities to predict future out-of-sample data. This is where
leave-future-out cross-validation may be handy.
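Before moving on, it can also be worth looking at the PSIS diagnostics that triggered the warnings above; calling loo() on each model returns the full object, including the distribution of Pareto k values:
# inspect the PSIS-LOO objects, including Pareto k diagnostics
loo_m3 <- loo(model3)
loo_m4 <- loo(model4)
loo_m3
loo_m4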
mvgam
includes functions that facilitate
leave-future-out comparisons. Time series models are often evaluated
using an expanding window training technique, where the model is
initially trained on some subset of data from \(t = 1\) to \(t =
n_{train}\), and then is used to produce forecasts for the next
\(h\) time steps \((t = n_{train} + 1, \ldots, n_{train} + h)\). In the next
iteration, the size of training data is expanded by a single time point
and the process repeated. This is obviously computationally challenging
for Bayesian time series models, as the number of refits can be very
large. mvgam
uses an approximation based on importance
sampling. Briefly, we refit the model using the first \(min_t\) observations to perform a single
exact \(h\)-step-ahead forecast. This
forecast is evaluated against the next \(h\) out-of-sample observations (up to time \(min_t + h\)) using the ELPD, as with the LOO
comparisons above. Next, we approximate each successive round of
expanding-window forecasts by moving forward one step at a time (for \(i\) in \(1:N_{evaluations}\)) and
re-weighting draws from the model’s posterior predictive distribution
using PSIS. In each iteration \(i\),
PSIS weights are obtained for all observations that would have been
included in the model if we had re-fit. If these importance ratios are
stable, we consider the approximation adequate and use the re-weighted
posterior’s forecast for evaluating the next holdout set of testing
observations \(((min_t + i + 1):(min_t + i +
h))\). This is similar to the process of particle filtering to
update forecasts in light of new data by re-weighting the posterior
draws using importance weights. But at some point the importance ratio
variability will become too large and importance sampling will be
unreliable. This is indicated by the estimated shape parameter k of the
generalized Pareto distribution crossing a certain threshold
pareto_k_threshold
. Only then do we refit the model using
all of the observations up to the time of the failure. We then restart
the process and iterate forward until the next refit is triggered. The
process, which is implemented using the lfo_cv()
function,
is computationally much more efficient, as only a fraction of the
evaluations typically requires refits (the algorithm is described in
detail by Bürkner et al. 2020).
Run the approximator for the Piecewise / Negative Binomial model (model3), setting
min_t = 20
and using a forecast horizon of
fc_horizon = 2
lfo_pw <- lfo_cv(
model3,
min_t = 20,
fc_horizon = 2,
n_cores = 4
)
The S3
plotting function for these lfo_cv
objects will show the Pareto-k values and ELPD values over the
evaluation time points. For the Pareto k plot, a dashed red line
indicates the specified threshold chosen for triggering model refits.
For the ELPD plot, a dashed red line indicates the bottom 10% quantile
of ELPD values. Points below this threshold may represent outliers that
were more difficult to forecast
plot(lfo_pw)
The top plot (the Pareto k plot) indicates that we only needed one additional refit to give good approximate predictions for the Piecewise / Negative Binomial model
lfo_ar <- lfo_cv(
model4,
min_t = 20,
fc_horizon = 2,
n_cores = 4
)
plot(lfo_ar)
The Poisson / AR model, in contrast, needed to be refit many times to ensure the stability of importance-weighted predictions. We would generally expect more refits for this model as the underlying trend has less “memory” than the piecewise trend, so its forecasts could become irrelevant more quickly as a result.
Compare ELPDs for the two models using all non-missing evaluation timepoints by plotting the differences in their scores. This plot will give us an indication of whether one model consistently gives better h-step-ahead forecasts when trained on successively more data
# determine which timepoints were not missing in the evaluation set
nonmiss_evalpoints <- which(
!is.na(model_data$recruits[22:NROW(model_data)])
)
# plot the PW ELPD values minus the AR ELPD values
# values will be > 0 if the PW model gave better predictions
# (ELPD is better when larger)
pw_minus_ar <- lfo_pw$elpds[nonmiss_evalpoints] -
lfo_ar$elpds[nonmiss_evalpoints]
plot(
pw_minus_ar,
ylab = 'PW - AR ELPD',
xlab = 'Time point',
pch = 16,
bty = 'l',
cex = 1,
col = 'white',
xaxt = 'n'
)
axis(
side = 1,
at = nonmiss_evalpoints,
labels = lfo_pw$eval_timepoints[nonmiss_evalpoints]
)
abline(
h = 0,
lty = 'dashed',
lwd = 2
)
# show all ELPD comparisons in black
points(
pw_minus_ar,
pch = 16,
cex = 0.95
)
# highlight the comparisons for when the PW predictions were better
points(
ifelse(pw_minus_ar > 0, pw_minus_ar, NA),
pch = 16,
cex = 0.95,
col = 'darkred'
)
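A simple numerical companion to this plot is the proportion of evaluation timepoints at which the Piecewise model gave the better (higher) ELPD:
# proportion of non-missing evaluation timepoints favouring the PW model
mean(pw_minus_ar > 0, na.rm = TRUE)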
The Piecewise / Negative Binomial model tends to give better
probabilistic predictions throughout the leave-future-out exercise. This
result agrees with results from the in-sample comparison using
loo()
above. But overall, I would tend to favour the
out-of-sample metrics more when performing model selection, especially
given that we are working with a time series here.
If we have enough computational power and our model is simple / small
enough, it can often be worthwhile to compute exact leave-future-out
forecasts. The update.mvgam()
function makes this
relatively simple to do in mvgam()
, as all we need to do is
supply the new testing and training datasets and the model will be
updated. Below I show how to do this for our chosen model
(model3
) by splitting on timepoint 36 to create training
and testing data.
data_train = model_data %>%
dplyr::filter(time <= 36)
data_test = model_data %>%
dplyr::filter(time > 36)
Now refit the model using the update.mvgam()
function.
Note that you can supply lfo = TRUE if
you wish to skip the computation of residuals and lighten the final
model. This can be helpful if all you plan to do is compare forecasts,
as it will make repeated calls to the update()
function
faster. See ?mvgam::update.mvgam
for more details
model3_exact <- update(
model3,
data = data_train,
newdata = data_test,
lfo = TRUE
)
Once this model is finished, we are ready to compute exact forecast
evaluations. The first step in this process is to create an object of
class mvgam_forecast
using the
forecast.mvgam()
function:
fc <- forecast(model3_exact)
We’ve seen this class before when calling the
hindcast.mvgam()
or forecast.mvgam()
functions. But we haven’t looked in-depth at the other ways we can use
objects of this class. A primary purpose is to readily allow forecast
evaluations for each series in the data, using a variety of possible
scoring functions. See ?mvgam::score.mvgam_forecast
to view
the types of scores that are available. For these data, which have very
large counts, a useful scoring metric is the Continuous Ranked Probability Score (CRPS). A CRPS
value is similar to what we might get if we calculated a weighted
absolute error using the full forecast distribution.
scores <- score(
fc,
score = 'crps'
)
scores$sockeye
## score in_interval interval_width eval_horizon score_type
## 1 6022.7278 0 0.9 1 crps
## 2 3560.9737 1 0.9 2 crps
## 3 6604.2062 1 0.9 3 crps
## 4 6794.8664 1 0.9 4 crps
## 5 881.5366 1 0.9 5 crps
## 6 1757.0302 1 0.9 6 crps
## 7 2622.2656 1 0.9 7 crps
## 8 NA NA 0.9 8 crps
## 9 NA NA 0.9 9 crps
## 10 NA NA 0.9 10 crps
## 11 NA NA 0.9 11 crps
## 12 NA NA 0.9 12 crps
## 13 NA NA 0.9 13 crps
## 14 NA NA 0.9 14 crps
The returned data.frame
now shows the CRPS score for
each evaluation in the testing data, along with some other useful
information about the fit of the forecast distribution. In particular,
we are given a logical value (1s and 0s) telling us whether the true
value was within a pre-specified credible interval (i.e. the coverage of
the forecast distribution). The default interval width is 0.9, so we
would hope that the values in the in_interval
column take a
1 approximately 90% of the time. This value can be changed if you wish
to compute different coverages, say using a 60% interval:
scores <- score(
fc,
score = 'crps',
interval_width = 0.6
)
scores$sockeye
## score in_interval interval_width eval_horizon score_type
## 1 6022.7278 0 0.6 1 crps
## 2 3560.9737 0 0.6 2 crps
## 3 6604.2062 0 0.6 3 crps
## 4 6794.8664 1 0.6 4 crps
## 5 881.5366 1 0.6 5 crps
## 6 1757.0302 0 0.6 6 crps
## 7 2622.2656 0 0.6 7 crps
## 8 NA NA 0.6 8 crps
## 9 NA NA 0.6 9 crps
## 10 NA NA 0.6 10 crps
## 11 NA NA 0.6 11 crps
## 12 NA NA 0.6 12 crps
## 13 NA NA 0.6 13 crps
## 14 NA NA 0.6 14 crps
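A quick numerical check of empirical coverage can be computed from the non-missing rows of the score table; for a well-calibrated 60% interval we would hope this proportion lands somewhere near 0.6 (though with only seven evaluation points this is a very rough check):
# proportion of non-missing test observations inside the forecast interval
mean(scores$sockeye$in_interval, na.rm = TRUE)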
Although we would prefer to use distributional forecast evaluations,
point evaluations are also possible using outputs from the
forecast.mvgam()
function. For example, we can calculate
the Mean Absolute Percentage Error (MAPE), which is a commonly used
metric to evaluate point forecasts when the scales of the time series
vary quite a lot:
# extract the true values from the test set
truth <- fc$test_observations$sockeye
# calculate mean prediction values for each column in the forecast matrix
mean_fc <- apply(
fc$forecasts$sockeye,
2,
mean
)
# calculate the absolute percentage errors, which here are scaled by the
# mean forecast values. Note that we only use the first 7 observations
# in the test set because the rest are NA
p <- abs(
100 * (
(truth[1:7] - mean_fc[1:7]) /
mean_fc[1:7]
)
)
# take the mean of the absolute percentage errors
mean(p)
## [1] 73.3368
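Note that the errors above were scaled by the mean forecasts; the more conventional MAPE scales by the observed values instead, which is easy to compute as a cross-check:
# conventional MAPE, scaling the absolute errors by the observed values
mean(abs(100 * (truth[1:7] - mean_fc[1:7]) / truth[1:7]))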
As I always recommend, it is hugely helpful to the developers who
dedicate much of their time to bringing us this software if we
acknowledge them through appropriate citations. The piecewise models in
particular deserve to be properly cited because they build on
contributions from a number of helpful sources. Use
how_to_cite()
to make this task easier:
how_to_cite(model3)
## Methods text skeleton
## We used the R package mvgam (version 1.1.51; Clark & Wells, 2023) to
## construct, fit and interrogate the model. mvgam fits Bayesian
## State-Space models that can include flexible predictor effects in both
## the process and observation components by incorporating functionalities
## from the brms (Burkner 2017), mgcv (Wood 2017) and splines2 (Wang & Yan,
## 2023) packages. Piecewise dynamic trends were parameterized and
## estimated following methods described by Taylor and Letham (2018). The
## mvgam-constructed model and observed data were passed to the
## probabilistic programming environment Stan (version 2.36.0; Carpenter et
## al. 2017, Stan Development Team 2025), specifically through the cmdstanr
## interface (Gabry & Cesnovar, 2021). We ran 4 Hamiltonian Monte Carlo
## chains for 500 warmup iterations and 500 sampling iterations for joint
## posterior estimation. Rank normalized split Rhat (Vehtari et al. 2021)
## and effective sample sizes were used to monitor convergence.
##
## Primary references
## Clark, NJ and Wells K (2023). Dynamic Generalized Additive Models
## (DGAMs) for forecasting discrete ecological time series. Methods in
## Ecology and Evolution, 14, 771-784. doi.org/10.1111/2041-210X.13974
## Burkner, PC (2017). brms: An R Package for Bayesian Multilevel Models
## Using Stan. Journal of Statistical Software, 80(1), 1-28.
## doi:10.18637/jss.v080.i01
## Wood, SN (2017). Generalized Additive Models: An Introduction with R
## (2nd edition). Chapman and Hall/CRC.
## Wang W and Yan J (2021). Shape-Restricted Regression Splines with R
## Package splines2. Journal of Data Science, 19(3), 498-517.
## doi:10.6339/21-JDS1020 https://doi.org/10.6339/21-JDS1020.
## Taylor S and Letham B (2018). Forecasting at scale. The American
## Statistician 72(1) 37-45. https://doi.org/10.1080/00031305.2017.1380080
## Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M,
## Brubaker M, Guo J, Li P and Riddell A (2017). Stan: A probabilistic
## programming language. Journal of Statistical Software 76.
## Gabry J, Cesnovar R, Johnson A, and Bronder S (2025). cmdstanr: R
## Interface to 'CmdStan'. https://mc-stan.org/cmdstanr/,
## https://discourse.mc-stan.org.
## Vehtari A, Gelman A, Simpson D, Carpenter B, and Burkner P (2021).
## Rank-normalization, folding, and localization: An improved Rhat for
## assessing convergence of MCMC (with discussion). Bayesian Analysis 16(2)
## 667-718. https://doi.org/10.1214/20-BA1221.
##
## Other useful references
## Arel-Bundock V, Greifer N, and Heiss A (2024). How to interpret
## statistical models using marginaleffects for R and Python. Journal of
## Statistical Software, 111(9), 1-32.
## https://doi.org/10.18637/jss.v111.i09
## Gabry J, Simpson D, Vehtari A, Betancourt M, and Gelman A (2019).
## Visualization in Bayesian workflow. Journal of the Royal Statatistical
## Society A, 182, 389-402. doi:10.1111/rssa.12378.
## Vehtari A, Gelman A, and Gabry J (2017). Practical Bayesian model
## evaluation using leave-one-out cross-validation and WAIC. Statistics and
## Computing, 27, 1413-1432. doi:10.1007/s11222-016-9696-4.
## Burkner PC, Gabry J, and Vehtari A. (2020). Approximate leave-future-out
## cross-validation for Bayesian time series models. Journal of Statistical
## Computation and Simulation, 90(14), 2499-2523.
## https://doi.org/10.1080/00949655.2020.1783262
Exercises
1. Calculate an additional forecast evaluation score for the fc object using the formula provided by Hyndman and Athanasopoulos
2. Refit model3 using a slightly larger training set (i.e. split the data on timepoint 37 rather than 36). Compare CRPS values from this model to the previous refit to see whether the scores for timepoints 38-43 have changed when conditioning on the extra information in timepoint 37
sessionInfo()
## R version 4.4.2 (2024-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8
## [3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C
## [5] LC_TIME=English_Australia.utf8
##
## time zone: Australia/Brisbane
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_3.5.1 tidybayes_3.0.7 marginaleffects_0.25.0
## [4] rstan_2.32.6 StanHeaders_2.32.10 mvgam_1.1.51
## [7] dplyr_1.1.4 downloadthis_0.4.1
##
## loaded via a namespace (and not attached):
## [1] svUnit_1.0.6 tidyselect_1.2.1 farver_2.1.2
## [4] loo_2.8.0.9000 fastmap_1.2.0 tensorA_0.36.2.1
## [7] digest_0.6.37 timechange_0.3.0 mime_0.12
## [10] lifecycle_1.0.4 processx_3.8.6 magrittr_2.0.3
## [13] posterior_1.6.0.9000 compiler_4.4.2 rlang_1.1.5
## [16] sass_0.4.9 tools_4.4.2 utf8_1.2.4
## [19] yaml_2.3.10 collapse_2.0.19 data.table_1.17.0
## [22] knitr_1.49 labeling_0.4.3 bridgesampling_1.1-2
## [25] pkgbuild_1.4.6 curl_6.2.1 cmdstanr_0.8.0
## [28] plyr_1.8.9 abind_1.4-8 klippy_0.0.0.9500
## [31] withr_3.0.2 purrr_1.0.4 grid_4.4.2
## [34] stats4_4.4.2 colorspace_2.1-1 inline_0.3.21
## [37] scales_1.3.0 insight_1.0.2 cli_3.6.4
## [40] mvtnorm_1.3-3 rmarkdown_2.29 generics_0.1.3
## [43] RcppParallel_5.1.10 rstudioapi_0.17.1 reshape2_1.4.4
## [46] cachem_1.1.0 stringr_1.5.1 splines_4.4.2
## [49] bayesplot_1.11.1.9000 assertthat_0.2.1 parallel_4.4.2
## [52] matrixStats_1.5.0 brms_2.22.9 vctrs_0.6.5
## [55] V8_6.0.1 Matrix_1.7-1 jsonlite_1.9.0
## [58] gratia_0.10.0 arrayhelpers_1.1-0 patchwork_1.3.0
## [61] ggdist_3.3.2 b64_0.1.3 jquerylib_0.1.4
## [64] tidyr_1.3.1 glue_1.8.0 ps_1.9.0
## [67] ggokabeito_0.1.0 codetools_0.2-20 mvnfast_0.2.8
## [70] distributional_0.5.0 lubridate_1.9.4 stringi_1.8.4
## [73] gtable_0.3.6 QuickJSR_1.5.2 munsell_0.5.1
## [76] tibble_3.2.1 pillar_1.10.1 htmltools_0.5.8.1
## [79] Brobdingnag_1.2-9 R6_2.6.1 evaluate_1.0.3
## [82] lattice_0.22-6 extraDistr_1.10.0 backports_1.5.0
## [85] bslib_0.9.0 rstantools_2.4.0.9000 bsplus_0.1.4
## [88] Rcpp_1.0.14 zip_2.3.2 coda_0.19-4.1
## [91] gridExtra_2.3 nlme_3.1-166 checkmate_2.3.2
## [94] mgcv_1.9-1 xfun_0.51 fs_1.6.5
## [97] pkgconfig_2.0.3