Processing math: 100%
Coding POMP models: R vs C snippets

Coding POMP models: R vs C snippets

The R codes in this script are available for download. This work is licensed under the Creative Commons attribution-noncommercial license. Please share and remix noncommercially, mentioning its origin.
CC-BY_NC


To implement models in pomp, we have various options. While it is very easy to code basic model components using R functions, it can be much faster to do it using C snippets. Here, we’ll demonstrate both approaches.

First, we’ll load some packages.

library(pomp)
library(ggplot2)
library(tidyr)

The Gompertz model is a very simple partially observed Markov process (POMP) model that we’ll use to demonstrate.

The latent state variable, X represents the density of a population of biological organisms. The state equation is Xt+δ=K1SXStϵt, where S=erδ is a parameter and the ϵt are i.i.d. lognormal random deviates with variance σ2: ϵtLognormal(0,σ). The observed variables Yt are lognormally distributed: YtLognormal(logXt,τ).

Implementation using R functions

To begin with, we’ll code the latent state process model and both components of the measurement model using R functions.

simulate(times=1:100,t0=0,
  params=c(K=1,r=0.1,sigma=0.1,tau=0.1,X.0=1),
  rprocess=discrete_time(
    step.fun=function (X,r,K,sigma,...,delta.t) {
      eps <- exp(rnorm(n=1,mean=0,sd=sigma))
      S <- exp(-r*delta.t)
      c(X=K^(1-S)*X^S*eps)
    },
    delta.t=1 
  ),
  rmeasure=function (X, tau, ...) {
    c(Y=rlnorm(n=1,meanlog=log(X),sdlog=tau))
  },
  dmeasure=function (tau, X, Y, ..., log) {
    dlnorm(x=Y,meanlog=log(X),sdlog=tau,log=log)
  }
) -> gompertz

We plot the results.

gompertz |>
  as.data.frame() |>
  pivot_longer(c(X,Y)) |>
  ggplot(aes(x=time,y=value,color=name))+
  geom_line()+
  labs(y="X, Y")+
  theme_bw()

Implementation using C snippets

Now we’ll code up the same example using C snippets.

simulate(
  times=0:100,t0=0,
  params=c(K=1,r=0.1,sigma=0.1,tau=0.1,X.0=1),
  dmeasure=Csnippet("
    lik = dlnorm(Y,log(X),tau,give_log);"
  ),
  rmeasure=Csnippet("
    Y = rlnorm(log(X),tau);"
  ),
  rprocess=discrete_time(
    step.fun=Csnippet("
    double S = exp(-r*dt);
    double logeps = (sigma > 0.0) ? rnorm(0,sigma) : 0.0;
    X = pow(K,(1-S))*pow(X,S)*exp(logeps);"
    ),
    delta.t=1
  ),
  paramnames=c("r","K","sigma","tau"),
  obsnames="Y",
  statenames="X"
) -> Gompertz

Testing the implementations

Having coded the model, it’s a good idea to run some simple tests to check the implementation. We can run simulations and a particle filter operation to check that the rprocess, rmeasure, and dmeasure components function without error.

First, some simulations, from several initial conditions:

Gompertz |>
  coef() |>
  parmat(4) -> p
p["X.0",] <- c(0.5,0.9,1.1,1.5)
Gompertz |>
  simulate(params=p,format="data.frame") |>
  ggplot(aes(x=time,y=X,group=.id,color=.id))+
  geom_line()+
  guides(color="none")+
  theme_bw()+
  labs(title="Gompertz model",subtitle="stochastic simulations")

We run 10 replicate particle filter operations at the true parameter values as follows. This allows us to assess the Monte Carlo error in the likelihood estimate.

Gompertz |>
  pfilter(Np=500) |>
  replicate(n=10) -> pf

logmeanexp(sapply(pf,logLik),se=TRUE,ess=TRUE)
##        est         se        ess 
## 62.8137837  0.1679958  8.0546373

Comparing the implementations

Using each implementation, we’ll now run a number of simulations and compare the amount of time required.

system.time(simulate(gompertz,nsim=10000,format="arrays"))
##    user  system elapsed 
##   4.669   0.004   4.674
system.time(simulate(Gompertz,nsim=10000,format="arrays"))
##    user  system elapsed 
##   0.138   0.001   0.139
system.time(pfilter(gompertz,Np=10000))
##    user  system elapsed 
##   5.151   0.001   5.155
system.time(pfilter(Gompertz,Np=10000))
##    user  system elapsed 
##   0.176   0.002   0.178

This document was produced using pomp version 5.10 and R version 4.4.1.


NSF
NCEAS
NIH

This software has been made possible by support from the U.S. National Science Foundation (Grants #EF-0545276, #EF-0430120), by the “Inference for Mechanistic Models” Working Group supported by the National Center for Ecological Analysis and Synthesis (a Center funded by N.S.F. (Grant #DEB-0553768), the University of California, Santa Barbara, and the State of California), and by the RAPIDD program of the Science & Technology Directorate, Department of Homeland Security and the Fogarty International Center, U.S. National Institutes of Health.