Skip to contents

Set up latent correlated multivariate Gaussian residual processes in mvgam. This function does not evaluate it's arguments – it exists purely to help set up a model with particular error processes.

Usage

ZMVN(unit = time, gr = NA, subgr = series)

Arguments

unit

The unquoted name of the variable that represents the unit of analysis in data over which latent residuals should be correlated. This variable should be either a numeric or integer variable in the supplied data. Defaults to time to be consistent with other functionalities in mvgam, though note that the data need not be time series in this case. See examples below for further details and explanations

gr

An optional grouping variable, which must be a factor in the supplied data, for setting up hierarchical residual correlation structures. If specified, this will automatically set up a model where the residual correlations for a specific level of gr are modelled hierarchically: \(\Omega_{group} = p\Omega_{global} + (1 - p)\Omega_{group, local}\), where \(\Omega_{global}\) is a global correlation matrix, \(\Omega_{group, local}\) is a local deviation correlation matrix and \(p\) is a weighting parameter controlling how strongly the local correlation matrix \(\Omega_{group}\) is shrunk towards the global correlation matrix \(\Omega_{global}\). If gr is supplied then subgr must also be supplied

subgr

A subgrouping factor variable specifying which element in data represents the different observational units. Defaults to series to be consistent with other functionalities in mvgam, though note that the data need not be time series in this case. But note that models that use the hierarchical correlations (by supplying a value for gr) should not include a series element in data. Rather, this element will be created internally based on the supplied variables for gr and subgr. For example, if you are modelling counts for a group of species (labelled as species in the data) across sampling sites (labelled as site in the data) in three different geographical regions (labelled as region), and you would like the residuals to be correlated within regions, then you should specify unit = site, gr = region, and subgr = species. Internally, mvgam() will appropriately order the data by unit (in this case, by site) and create the series element for the data using something like: series = as.factor(paste0(group, '_', subgroup))

Value

An object of class mvgam_trend, which contains a list of arguments to be interpreted by the parsing functions in mvgam

Examples

if (FALSE) {
# Simulate counts of four species over ten sampling locations
site_dat <- data.frame(site = rep(1:10, 4),
                      species = as.factor(sort(rep(letters[1:4], 10))),
                      y = c(NA, rpois(39, 3)))
head(site_dat)

# Set up a correlated residual (i.e. Joint Species Distribution) model,
# where 'site' represents the unit of analysis
trend_model <- ZMVN(unit = site, subgr = species)
mod <- mvgam(y ~ species,
            trend_model = ZMVN(unit = site,
                               subgr = species),
            data = site_dat,
            chains = 2,
            silent = 2)

# Inspect the estimated species-species residual covariances
mcmc_plot(mod, variable = 'Sigma', regex = TRUE, type = 'hist')

# A hierarchical correlation example; set up correlated counts
# for three species across two sampling locations
Sigma <- matrix(c(1, -0.4, 0.5,
                 -0.4, 1, 0.3,
                 0.5, 0.3, 1),
               byrow = TRUE,
               nrow = 3)
               Sigma

make_site_dat = function(...){
 errors <- mgcv::rmvn(n = 30,
                      mu = c(0.6, 0.8, 1.8),
                      V = Sigma)
 site_dat <- do.call(rbind, lapply(1:3, function(spec){
   data.frame(y = rpois(30,
                        lambda = exp(errors[, spec])),
              species = paste0('species',
                               spec),
              site = 1:30)
}))
site_dat
}

site_dat <- rbind(make_site_dat() %>%
                   dplyr::mutate(group = 'group1'),
                 make_site_dat() %>%
                   dplyr::mutate(group = 'group2')) %>%
   dplyr::mutate(species = as.factor(species),
                 group = as.factor(group))

# Fit the hierarchical correlated residual model
mod <- mvgam(y ~ species,
            trend_model = ZMVN(unit = site,
                               gr = group,
                               subgr = species),
            data = site_dat)

# Inspect the estimated species-species residual covariances
mcmc_plot(mod, variable = 'Sigma', regex = TRUE, type = 'hist')
}