Licensed under the Creative Commons attribution-noncommercial license. Please share and remix noncommercially, mentioning its origin.
CC-BY_NC

This document was produced using pomp version 1.11.3.1 and R version 3.3.3.

Introduction

This tutorial aims to help you get started using pomp as a suite of tools for analysis of time series data based on dynamical systems models. First, we give some conceptual background regarding the class of models—partially observed Markov processes—that pomp handles. We then discuss some preliminaries: installing the package and so on. Next, using a basic question about ecological population regulation as an example, we load some data and implement some models as R objects of class pomp. Finally, we illustrate some of the package’s capabilities by using its algorithms to fit and compare the models using various inference methods.

Partially observed Markov process (POMP) models

As its name implies pomp is focused on partially observed Markov process models. These are also known as state-space model, hidden Markov models, and stochastic dynamical systems models. Such models consist of an unobserved stochastic state process, connected to the data via an explicit model of the observation process. We refer to the former as the process model and the latter as the measurement model. Each model is, in fact, a probability distribution. Suppose we have a sequence of measurements, \(y^*_{n}\), made at times \(t_n\), \(n=1,\dots,N\). Let \(Y_n\) denote the measurement process and \(X_n\) the state process, then by definition, the process model is determined by the density \(f_{X_n|X_{n-1}}\) and the initial density \(f_{X_0}\). The measurement process is determined by the density \(f_{Y_n|X_n}\). We think of the data, \(y^*_n\), as being a single realization of the \(Y\) process. To implement a POMP model in pomp, we have to specify the measurement and process distributions.

Note however, that, for each of the process and the measurement model there are two distinct operations that we might wish to perform:

  1. we might wish to simulate, i.e., to draw a (pseudo)random sample from the distribution, or
  2. we might wish to evaluate the density itself at given values of \(X_n\) and/or \(Y_n\).

Following the R convention, we refer to the simulation of \(f_{X_n|X_{n-1}}\) as the rprocess component of the POMP model and the evaluation of \(f_{X_n|X_{n-1}}\) as the dprocess component. Similarly, the simulation of \(f_{Y_n|X_n}\) is the rmeasure component while the evaluation of \(f_{Y_n|X_n}\) is the dmeasure component. Methods that make no use of the dprocess component are called “plug-and-play” methods. At present, pomp is focused on such methods, so there is no reason to discuss the dprocess component further. In the following, we will illustrate and explain how one specifies the rprocess, rmeasure, and dmeasure components of a model in pomp. We will illustrate this using some simple examples.


The following is a schematic of the structure of a POMP showing causal relations between the process model, the measurement model, and the data. The key perspective to keep in mind is that the model is to be viewed as the process that generated the data.

schematic1

schematic1


Here is another POMP model schematic, showing dependence among model variables.

The state process, \(X_n\), is Markovian, i.e., \[\mathrm{Prob}[X_n|X_0,\dots,X_{n-1},Y_1,\dots,Y_{n-1}]=\mathrm{Prob}[X_n|X_{n-1}].\] Moreover, the measurements, \(Y_n\), depend only on the state at that time: \[\mathrm{Prob}[Y_n|X_0,\dots,X_{n},Y_1,\dots,Y_{n-1}]=\mathrm{Prob}[Y_n|X_{n}],\] for all \(n=1,\dots,N\).


Preliminaries

Installing the package

To get started, we must install pomp, if it is not already installed. The package can be downloaded from CRAN, but the latest version is always available at the package homepage on Github. See the package website for installation instructions.

Important information for Windows and Mac users.

In this document, we will implement pomp models using the package’s Csnippet facility. This allows the user to write model components using snippets of C code, which is then compiled and linked into a running R session. This typically affords a manyfold increase in computation time. It is possible to avoid Csnippets entirely by writing model components as R functions, but the resulting implementations are typically too slow to be of practical use. To use Csnippets, you must be able to compile C codes. Compilers are not by default installed on Windows or Mac systems, so users of such systems must do a bit more work to make use of pomp’s facilities. The installation instructions on the package website give details.

Loading data

We will illustrate pomp by performing a limited data analysis on a set of bird abundance data. The data are from annual censuses of a population of Parus major (Great Tit) in Wytham Wood, Oxfordshire. They were retrieved as dataset #10163 from the Global Population Dynamics Database version 2 (NERC Centre for Population Biology, Imperial College, 2010). The original source is McCleery and Perrins (1991).

We load the data by doing

parus.dat <- read.csv(text="
                      year,P
                      1960,148
                      1961,258
                      1962,185
                      1963,170
                      1964,267
                      1965,239
                      1966,196
                      1967,132
                      1968,167
                      1969,186
                      1970,128
                      1971,227
                      1972,174
                      1973,177
                      1974,137
                      1975,172
                      1976,119
                      1977,226
                      1978,166
                      1979,161
                      1980,199
                      1981,306
                      1982,206
                      1983,350
                      1984,214
                      1985,175
                      1986,211"
                      )

Here is a plot of these data:

ggplot(data=parus.dat,mapping=aes(x=year,y=P))+
  geom_line()+geom_point()+
  expand_limits(y=0)+
  theme_bw()