Set up latent correlated multivariate Gaussian residual processes in mvgam. This function does not evaluate it's arguments – it exists purely to help set up a model with particular error processes.
Arguments
- unit
The unquoted name of the variable that represents the unit of analysis in
data
over which latent residuals should be correlated. This variable should be either anumeric
orinteger
variable in the supplieddata
. Defaults totime
to be consistent with other functionalities in mvgam, though note that the data need not be time series in this case. See examples below for further details and explanations- gr
An optional grouping variable, which must be a
factor
in the supplieddata
, for setting up hierarchical residual correlation structures. If specified, this will automatically set up a model where the residual correlations for a specific level ofgr
are modelled hierarchically: \(\Omega_{group} = p\Omega_{global} + (1 - p)\Omega_{group, local}\), where \(\Omega_{global}\) is a global correlation matrix, \(\Omega_{group, local}\) is a local deviation correlation matrix and \(p\) is a weighting parameter controlling how strongly the local correlation matrix \(\Omega_{group}\) is shrunk towards the global correlation matrix \(\Omega_{global}\). Ifgr
is supplied thensubgr
must also be supplied- subgr
A subgrouping
factor
variable specifying which element indata
represents the different observational units. Defaults toseries
to be consistent with other functionalities in mvgam, though note that the data need not be time series in this case. But note that models that use the hierarchical correlations (by supplying a value forgr
) should not include aseries
element indata
. Rather, this element will be created internally based on the supplied variables forgr
andsubgr
. For example, if you are modelling counts for a group of species (labelled asspecies
in the data) across sampling sites (labelled assite
in the data) in three different geographical regions (labelled asregion
), and you would like the residuals to be correlated within regions, then you should specifyunit = site
,gr = region
, andsubgr = species
. Internally,mvgam()
will appropriately order the data byunit
(in this case, bysite
) and create theseries
element for the data using something like:series = as.factor(paste0(group, '_', subgroup))
Value
An object of class mvgam_trend
, which contains a list of
arguments to be interpreted by the parsing functions in mvgam
Examples
if (FALSE) {
# Simulate counts of four species over ten sampling locations
site_dat <- data.frame(site = rep(1:10, 4),
species = as.factor(sort(rep(letters[1:4], 10))),
y = c(NA, rpois(39, 3)))
head(site_dat)
# Set up a correlated residual (i.e. Joint Species Distribution) model,
# where 'site' represents the unit of analysis
trend_model <- ZMVN(unit = site, subgr = species)
mod <- mvgam(y ~ species,
trend_model = ZMVN(unit = site,
subgr = species),
data = site_dat,
chains = 2,
silent = 2)
# Inspect the estimated species-species residual covariances
mcmc_plot(mod, variable = 'Sigma', regex = TRUE, type = 'hist')
# A hierarchical correlation example; set up correlated counts
# for three species across two sampling locations
Sigma <- matrix(c(1, -0.4, 0.5,
-0.4, 1, 0.3,
0.5, 0.3, 1),
byrow = TRUE,
nrow = 3)
Sigma
make_site_dat = function(...){
errors <- mgcv::rmvn(n = 30,
mu = c(0.6, 0.8, 1.8),
V = Sigma)
site_dat <- do.call(rbind, lapply(1:3, function(spec){
data.frame(y = rpois(30,
lambda = exp(errors[, spec])),
species = paste0('species',
spec),
site = 1:30)
}))
site_dat
}
site_dat <- rbind(make_site_dat() %>%
dplyr::mutate(group = 'group1'),
make_site_dat() %>%
dplyr::mutate(group = 'group2')) %>%
dplyr::mutate(species = as.factor(species),
group = as.factor(group))
# Fit the hierarchical correlated residual model
mod <- mvgam(y ~ species,
trend_model = ZMVN(unit = site,
gr = group,
subgr = species),
data = site_dat)
# Inspect the estimated species-species residual covariances
mcmc_plot(mod, variable = 'Sigma', regex = TRUE, type = 'hist')
}