Skip to contents

Compute probabilistic forecast scores for mvgam objects

Usage

# S3 method for mvgam_forecast
score(
  object,
  score = "crps",
  log = FALSE,
  weights,
  interval_width = 0.9,
  n_cores = 1,
  ...
)

score(object, ...)

Arguments

object

mvgam_forecast object. See forecast.mvgam().

score

character specifying the type of proper scoring rule to use for evaluation. Options are: sis (i.e. the Scaled Interval Score), energy, variogram, elpd (i.e. the Expected log pointwise Predictive Density), drps (i.e. the Discrete Rank Probability Score), crps (the Continuous Rank Probability Score) or brier (the latter of which is only applicable for bernoulli models. Note that when choosing elpd, the supplied object must have forecasts on the link scale so that expectations can be calculated prior to scoring. If choosing brier, the object must have forecasts on the expected scale (i.e. probability predictions). For all other scores, forecasts should be supplied on the response scale (i.e. posterior predictions)

log

logical. Should the forecasts and truths be logged prior to scoring? This is often appropriate for comparing performance of models when series vary in their observation ranges. Ignored if score = 'brier'

weights

optional vector of weights (where length(weights) == n_series) for weighting pairwise correlations when evaluating the variogram score for multivariate forecasts. Useful for down-weighting series that have larger magnitude observations or that are of less interest when forecasting. Ignored if score != 'variogram'

interval_width

proportional value on [0.05,0.95] defining the forecast interval for calculating coverage and, if score = 'sis', for calculating the interval score. Ignored if score = 'brier'

n_cores

integer specifying number of cores for calculating scores in parallel

...

Ignored

Value

a list containing scores and interval coverages per forecast horizon. If score %in% c('drps', 'crps', 'elpd', 'brier'), the list will also contain return the sum of all series-level scores per horizon. If score %in% c('energy','variogram'), no series-level scores are computed and the only score returned will be for all series. For all scores apart from elpd and brier, the in_interval column in each series-level slot is a binary indicator of whether or not the true value was within the forecast's corresponding posterior empirical quantiles. Intervals are not calculated when using elpd because forecasts will only contain the linear predictors

Examples

# \donttest{
# Simulate observations for three count-valued time series
data <- sim_mvgam()
# Fit a dynamic model using 'newdata' to automatically produce forecasts
mod <- mvgam(y ~ 1,
            trend_model = RW(),
            data = data$data_train,
            newdata = data$data_test,
            chains = 2)
#> Your model may benefit from using "noncentred = TRUE"
#> Compiling Stan program using cmdstanr
#> 
#> In file included from stan/lib/stan_math/stan/math/prim/prob/von_mises_lccdf.hpp:5,
#>                  from stan/lib/stan_math/stan/math/prim/prob/von_mises_ccdf_log.hpp:4,
#>                  from stan/lib/stan_math/stan/math/prim/prob.hpp:359,
#>                  from stan/lib/stan_math/stan/math/prim.hpp:16,
#>                  from stan/lib/stan_math/stan/math/rev.hpp:16,
#>                  from stan/lib/stan_math/stan/math.hpp:19,
#>                  from stan/src/stan/model/model_header.hpp:4,
#>                  from C:/Users/uqnclar2/AppData/Local/Temp/RtmpsvBanv/model-1dc433485786.hpp:2:
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp: In function 'stan::return_type_t<T_x, T_sigma, T_l> stan::math::von_mises_cdf(const T_x&, const T_mu&, const T_k&)':
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
#>   194 |       if (cdf_n < 0.0)
#>       | 
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
#> Start sampling
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 1 Iteration:   1 / 1000 [  0%]  (Warmup) 
#> Chain 2 Iteration:   1 / 1000 [  0%]  (Warmup) 
#> Chain 1 Iteration: 100 / 1000 [ 10%]  (Warmup) 
#> Chain 2 Iteration: 100 / 1000 [ 10%]  (Warmup) 
#> Chain 1 Iteration: 200 / 1000 [ 20%]  (Warmup) 
#> Chain 2 Iteration: 200 / 1000 [ 20%]  (Warmup) 
#> Chain 1 Iteration: 300 / 1000 [ 30%]  (Warmup) 
#> Chain 1 Iteration: 400 / 1000 [ 40%]  (Warmup) 
#> Chain 2 Iteration: 300 / 1000 [ 30%]  (Warmup) 
#> Chain 1 Iteration: 500 / 1000 [ 50%]  (Warmup) 
#> Chain 1 Iteration: 501 / 1000 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 400 / 1000 [ 40%]  (Warmup) 
#> Chain 2 Iteration: 500 / 1000 [ 50%]  (Warmup) 
#> Chain 2 Iteration: 501 / 1000 [ 50%]  (Sampling) 
#> Chain 1 Iteration: 600 / 1000 [ 60%]  (Sampling) 
#> Chain 2 Iteration: 600 / 1000 [ 60%]  (Sampling) 
#> Chain 1 Iteration: 700 / 1000 [ 70%]  (Sampling) 
#> Chain 2 Iteration: 700 / 1000 [ 70%]  (Sampling) 
#> Chain 1 Iteration: 800 / 1000 [ 80%]  (Sampling) 
#> Chain 2 Iteration: 800 / 1000 [ 80%]  (Sampling) 
#> Chain 1 Iteration: 900 / 1000 [ 90%]  (Sampling) 
#> Chain 2 Iteration: 900 / 1000 [ 90%]  (Sampling) 
#> Chain 1 Iteration: 1000 / 1000 [100%]  (Sampling) 
#> Chain 2 Iteration: 1000 / 1000 [100%]  (Sampling) 
#> Chain 1 finished in 1.4 seconds.
#> Chain 2 finished in 1.4 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 1.4 seconds.
#> Total execution time: 1.6 seconds.
#> 

# Extract forecasts into a 'mvgam_forecast' object
fc <- forecast(mod)
plot(fc)
#> Out of sample CRPS:
#> 131.606402


# Compute Discrete Rank Probability Scores and 0.90 interval coverages
fc_scores <- score(fc, score = 'drps')
str(fc_scores)
#> List of 4
#>  $ series_1  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 2.44 1.21 1.04 1.07 1.47 ...
#>   ..$ in_interval   : num [1:25] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "drps" "drps" "drps" "drps" ...
#>  $ series_2  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 1.651 0.863 0.948 1.307 2.038 ...
#>   ..$ in_interval   : num [1:25] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "drps" "drps" "drps" "drps" ...
#>  $ series_3  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 2.223 0.621 1.102 1.27 1.201 ...
#>   ..$ in_interval   : num [1:25] 1 1 1 1 1 1 1 1 1 1 ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "drps" "drps" "drps" "drps" ...
#>  $ all_series:'data.frame':	25 obs. of  3 variables:
#>   ..$ score       : num [1:25] 6.31 2.69 3.09 3.65 4.71 ...
#>   ..$ eval_horizon: int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type  : chr [1:25] "sum_drps" "sum_drps" "sum_drps" "sum_drps" ...

# An example using binary data
data <- sim_mvgam(family = bernoulli())
mod <- mvgam(y ~ s(season, bs = 'cc', k = 6),
            trend_model = AR(),
            data = data$data_train,
            newdata = data$data_test,
            family = bernoulli(),
            chains = 2)
#> Your model may benefit from using "noncentred = TRUE"
#> Compiling Stan program using cmdstanr
#> 
#> In file included from stan/lib/stan_math/stan/math/prim/prob/von_mises_lccdf.hpp:5,
#>                  from stan/lib/stan_math/stan/math/prim/prob/von_mises_ccdf_log.hpp:4,
#>                  from stan/lib/stan_math/stan/math/prim/prob.hpp:359,
#>                  from stan/lib/stan_math/stan/math/prim.hpp:16,
#>                  from stan/lib/stan_math/stan/math/rev.hpp:16,
#>                  from stan/lib/stan_math/stan/math.hpp:19,
#>                  from stan/src/stan/model/model_header.hpp:4,
#>                  from C:/Users/uqnclar2/AppData/Local/Temp/RtmpsvBanv/model-1dc437b27310.hpp:2:
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp: In function 'stan::return_type_t<T_x, T_sigma, T_l> stan::math::von_mises_cdf(const T_x&, const T_mu&, const T_k&)':
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
#>   194 |       if (cdf_n < 0.0)
#>       | 
#> stan/lib/stan_math/stan/math/prim/prob/von_mises_cdf.hpp:194: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory
#> Start sampling
#> Running MCMC with 2 parallel chains...
#> 
#> Chain 1 Iteration:   1 / 1000 [  0%]  (Warmup) 
#> Chain 2 Iteration:   1 / 1000 [  0%]  (Warmup) 
#> Chain 1 Iteration: 100 / 1000 [ 10%]  (Warmup) 
#> Chain 1 Iteration: 200 / 1000 [ 20%]  (Warmup) 
#> Chain 2 Iteration: 100 / 1000 [ 10%]  (Warmup) 
#> Chain 1 Iteration: 300 / 1000 [ 30%]  (Warmup) 
#> Chain 2 Iteration: 200 / 1000 [ 20%]  (Warmup) 
#> Chain 2 Iteration: 300 / 1000 [ 30%]  (Warmup) 
#> Chain 1 Iteration: 400 / 1000 [ 40%]  (Warmup) 
#> Chain 2 Iteration: 400 / 1000 [ 40%]  (Warmup) 
#> Chain 1 Iteration: 500 / 1000 [ 50%]  (Warmup) 
#> Chain 1 Iteration: 501 / 1000 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 500 / 1000 [ 50%]  (Warmup) 
#> Chain 2 Iteration: 501 / 1000 [ 50%]  (Sampling) 
#> Chain 2 Iteration: 600 / 1000 [ 60%]  (Sampling) 
#> Chain 1 Iteration: 600 / 1000 [ 60%]  (Sampling) 
#> Chain 1 Iteration: 700 / 1000 [ 70%]  (Sampling) 
#> Chain 2 Iteration: 700 / 1000 [ 70%]  (Sampling) 
#> Chain 1 Iteration: 800 / 1000 [ 80%]  (Sampling) 
#> Chain 2 Iteration: 800 / 1000 [ 80%]  (Sampling) 
#> Chain 2 Iteration: 900 / 1000 [ 90%]  (Sampling) 
#> Chain 1 Iteration: 900 / 1000 [ 90%]  (Sampling) 
#> Chain 2 Iteration: 1000 / 1000 [100%]  (Sampling) 
#> Chain 2 finished in 1.4 seconds.
#> Chain 1 Iteration: 1000 / 1000 [100%]  (Sampling) 
#> Chain 1 finished in 1.6 seconds.
#> 
#> Both chains finished successfully.
#> Mean chain execution time: 1.5 seconds.
#> Total execution time: 1.7 seconds.
#> 

# Extract forecasts on the expectation (probability) scale
fc <- forecast(mod, type = 'expected')
plot(fc)
#> Out of sample Brier:
#> 7.56125891868666


# Compute Brier scores
fc_scores <- score(fc, score = 'brier')
str(fc_scores)
#> List of 4
#>  $ series_1  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 0.323 0.232 0.224 0.215 0.443 ...
#>   ..$ in_interval   : num [1:25] NA NA NA NA NA NA NA NA NA NA ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "brier" "brier" "brier" "brier" ...
#>  $ series_2  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 0.239 0.193 0.182 0.189 0.404 ...
#>   ..$ in_interval   : num [1:25] NA NA NA NA NA NA NA NA NA NA ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "brier" "brier" "brier" "brier" ...
#>  $ series_3  :'data.frame':	25 obs. of  5 variables:
#>   ..$ score         : num [1:25] 0.19 0.175 0.157 0.442 0.193 ...
#>   ..$ in_interval   : num [1:25] NA NA NA NA NA NA NA NA NA NA ...
#>   ..$ interval_width: num [1:25] 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 ...
#>   ..$ eval_horizon  : int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type    : chr [1:25] "brier" "brier" "brier" "brier" ...
#>  $ all_series:'data.frame':	25 obs. of  3 variables:
#>   ..$ score       : num [1:25] 0.752 0.6 0.563 0.845 1.04 ...
#>   ..$ eval_horizon: int [1:25] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ score_type  : chr [1:25] "sum_brier" "sum_brier" "sum_brier" "sum_brier" ...
# }