Exercises and associated data

The data and modelling objects created in this notebook can be downloaded directly to save computational time.


Users who wish to complete the exercises can download a small template R script. Assuming you have already downloaded the data objects above, this script will load them so that you can tackle the exercises without repeating the steps used to create them.
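If you take this route, loading the prepared objects is straightforward. Below is a minimal sketch, assuming the objects were downloaded as a single .rda file into your working directory (the file name used here is hypothetical; substitute whatever name the download link provides):

# Load the pre-prepared data and modelling objects
# (file name is a placeholder for illustration only)
load('tutorial_1_objects.rda')

# Check which objects are now available in the workspace
ls()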

Load libraries and time series data

This tutorial relates to content covered in Lecture 1 and Lecture 2, and relies on the following packages for manipulating data, shaping time series, fitting dynamic regression models and plotting:

library(dplyr)
library(mvgam) 
library(gratia)
library(ggplot2); theme_set(theme_classic())
library(marginaleffects)
library(viridis)

We will work with time series of rodent captures from the Portal Project, a long-term monitoring study based near the town of Portal, Arizona. Researchers have been operating a standardized set of baited traps within 24 experimental plots at this site since the 1970s. Sampling follows the lunar monthly cycle, with observations occurring on average about 28 days apart. However, missing observations do occur due to difficulties accessing the site (weather events, COVID disruptions etc.). You can read about the sampling protocol in this preprint by Ernest et al. on bioRxiv.

Portal Project sampling scheme in the desert near Portal, Arizona, USA; photo by SKM Ernest


All data from the Portal Project are made openly available in near real-time so that they can provide maximum benefit to scientific research and outreach (a set of open-source software tools makes the data readily accessible). These data are extremely rich, containing monthly counts of rodent captures for >20 species. But rather than accessing the raw data, we will use some data that I have already processed and put into a simple, usable form.

data("portal_data")

As the data come pre-loaded with mvgam, you can read a little about them in the help page using ?portal_data. Before working with data, it is important to inspect how the data are structured. There are various ways to inspect data in R; I typically find the glimpse() function in dplyr useful for understanding how variables are structured

dplyr::glimpse(portal_data)
## Rows: 320
## Columns: 5
## $ time      <int> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, …
## $ series    <fct> DM, DO, PB, PP, DM, DO, PB, PP, DM, DO, PB, PP, DM, DO, PB, …
## $ captures  <int> 20, 2, 0, 0, NA, NA, NA, NA, 36, 5, 0, 0, 40, 3, 0, 1, 29, 3…
## $ ndvi_ma12 <dbl> -0.172144125, -0.172144125, -0.172144125, -0.172144125, -0.2…
## $ mintemp   <dbl> -0.79633807, -0.79633807, -0.79633807, -0.79633807, -1.33471…
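Because sampling was occasionally missed, it is worth checking how many observations are missing for each series before modelling. A quick sketch using dplyr, based on the column names shown in the glimpse() output above:

portal_data %>%
  dplyr::group_by(series) %>%
  dplyr::summarise(n_obs = dplyr::n(),
                   n_missing = sum(is.na(captures)))

Counts of NA values per series give a sense of how much of the record will need to be handled as missing data in any models we fit.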

We will analyse time series of captures for one specific rodent species, the Desert Pocket Mouse Chaetodipus penicillatus. This species is interesting in that it goes into a kind of “hibernation” during the colder months, leading to very low captures during the winter period.
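To see this seasonal pattern, we can filter to this series and plot captures over time with ggplot2. This is a sketch, assuming 'PP' is the series code for C. penicillatus (following the standard Portal species codes); missing observations will show up as gaps in the line:

portal_data %>%
  # Keep only the Desert Pocket Mouse series
  dplyr::filter(series == 'PP') %>%
  ggplot(aes(x = time, y = captures)) +
  geom_point() +
  geom_line() +
  labs(x = 'Time (lunar monthly intervals)',
       y = 'Captures of C. penicillatus')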