Case study: Measles in large and small towns

Important Note: These materials have been superseded by materials provided as part of the as part of the SISMID short course on Simulation-based Inference.

Licensed under the Creative Commons Attribution-NonCommercial license. Please share and remix noncommercially, mentioning its origin.
CC-BY_NC This document has its origins in the SISMID short course on Simulation-based Inference given by Aaron King and Edward Ionides.

Produced in R version 3.3.1 using pomp version 1.7.5.1.

Objectives

To display a published case study using plug-and-play methods with non-trivial model complexities
To show how extra-demographic stochasticity can be modeled
To demonstrate the use of covariates in pomp
To demonstrate the use of profile likelihood in scientific inference
To discuss the interpretation of parameter estimates
To emphasize the potential need for extra sources of stochasticity in modeling

Measles revisited

Motivation: challenges in inference from disease dynamics

Understanding, forecasting, managing epidemiological systems increasingly depends on models.
Dynamic models can be used to test causal hypotheses.
Real epidemiological systems:
- are nonlinear
- are stochastic
- are nonstationary
- evolve in continuous time
- have hidden variables
- can be measured only with (large) error
Dynamics of infectious disease outbreaks illustrate this well.
Measles is the paradigm for a nonlinear ecological system that can be well described by low-dimensional nonlinear dynamics.
A tradition of careful modeling studies have proposed and found evidence for a number of specific mechanisms, including
- a high value of $R_0$ (c. 15–20)
- seasonality in transmission rates associated with school terms
- a birth cohort effect
- response to changing birth rates
- under-reporting
- fadeouts and reintroductions that scale with city size
- spatial traveling waves
Much of this evidence has been amassed from fitting models to data, using a variety of methods.
See Rohani and King (2010) for a review of some of the high points.

Outline

We revisit a classic measles data set, weekly case reports in 954 urban centers in England and Wales during the pre-vaccine era (1950–1963).
We examine questions regarding:
- measles extinction and recolonization
- transmission rates
- seasonality
- resupply of susceptibles
We use a model that
1. expresses our current understanding of measles dynamics
2. includes a long list of mechanisms that have been proposed and demonstrated in the literature
3. cannot be fit by existing likelihood-based methods
We examine data from large and small towns using the same model, something no existing methods have been able to do.
We ask: does our perspective on this disease change when we expect the models to explain the data in detail?
What bigger lessons can we learn regarding inference for dynamical systems?

He, Ionides, & King, J. R. Soc. Interface (2010)

Data sets

Twenty towns, including
- 10 largest
- 10 smaller, chosen at random
Population sizes: 2k–3.4M
Weekly case reports, 1950–1963
Annual birth records and population sizes, 1944–1963

Continuous-time Markov process model

Diagram of the model:

$B(t) = \text{birth rate, from data}$
$N(t) = \text{population size, from data}$
Overdispersed binomial measurement model: $\mathrm{cases}_t\,\vert\,{{\Delta}{N}}_{IR}=z_t \sim {\mathrm{Normal}\left(\rho\,z_t,\rho\,(1-\rho)\,z_t+(\psi\,\rho\,z_t)^2\right)}$
Entry into susceptible class: $\mu_{BS}(t) = (1-c)\,B(t-\tau)+c\,\delta(t-t_0)\,\int_{t-1}^{t}\,B(t-\tau-s)\,ds$
Force of infection $\mu_{SE}(t) = \tfrac{\beta(t)}{N(t)}\,(I+\iota)\,\zeta(t)$
$c = \text{cohort effect}$
$\tau = \text{school-entry delay}$
$\beta(t) = \text{school-term transmission} = \begin{cases}\beta_0\,(1+a) &\text{during term}\\\beta_0\,(1-a) &\text{during vacation}\end{cases}$
$\iota = \text{imported infections}$
$\zeta(t) = \text{Gamma white noise with intensity}\,\sigma_{SE}$ (He et al. 2010,Bhadra et al. (2011))

Model implementation

We’ll load the packages we’ll need, and set the random seed, to allow reproducibility. Note that we’ll be making heavy use of the data-munging methods in packages plyr and reshape2, a tutorial introduction to which is provided here. Also, we’ll be using ggplot2 for plotting: see this brief tutorial. Finally, we’ll use the convenient magrittr syntax, which is explained here.

library(pomp)
library(plyr)
library(reshape2)
library(magrittr)
library(ggplot2)
theme_set(theme_bw())
stopifnot(packageVersion("pomp")>="1.4.8")
set.seed(594709947L)

Data and covariates

Now we’ll load the data and covariates. The data are measles reports from 20 cities in England and Wales. We also have information on the population sizes and birth-rates in these cities; we’ll treat these variables as covariates.

daturl <- "http://kingaa.github.io/pomp/vignettes/twentycities.rda"
datfile <- file.path(tempdir(),"twentycities.rda")
download.file(daturl,destfile=datfile,mode="wb")
load(datfile)

In the following, we’ll be using the plyr, reshape2, and magrittr packages. These provide a number of tools for data munging that are extremely flexible and useful. A brief tutorial on these tools is provided here.

We select the data for London and pre-process the measles and demography data.

measles %>% 
  mutate(year=as.integer(format(date,"%Y"))) %>%
  subset(town=="London" & year>=1950 & year<1964) %>%
  mutate(time=(julian(date,origin=as.Date("1950-01-01")))/365.25+1950) %>%
  subset(time>1950 & time<1964, select=c(time,cases)) -> dat
demog %>% subset(town=="London",select=-town) -> demogLondon

We plot the data and covariates.

dat %>% ggplot(aes(x=time,y=cases))+geom_line()

demogLondon %>% melt(id="year") %>%
  ggplot(aes(x=year,y=value))+geom_point()+
  facet_wrap(~variable,ncol=1,scales="free_y")

Now, we smooth the covariates. Note that we delay the entry of newborns into the susceptible pool.

demogLondon %>% 
  summarize(
    time=seq(from=min(year),to=max(year),by=1/12),
    pop=predict(smooth.spline(x=year,y=pop),x=time)$y,
    birthrate=predict(smooth.spline(x=year+0.5,y=births),x=time-4)$y
  ) -> covar

plot(pop~time,data=covar,type='l')
points(pop~year,data=demogLondon)

plot(birthrate~time,data=covar,type='l')
points(births~year,data=demogLondon)

plot(birthrate~I(time-4),data=covar,type='l')
points(births~I(year+0.5),data=demogLondon)

The partially observed Markov process model

The (unobserved) process model

We propose a variant of the SEIR model as an explanation for these data. This is a compartmental model that, diagrammatically, looks as follows.

model diagram

$B = \text{births}$
$S = \text{susceptibles}$
$E = \text{exposed, incubating}$
$I = \text{infectious}$
$R = \text{recovered}$

We require a simulator for this model. The following code implements a simulator.

Notable complexities include:

Incorporation of the known birthrate.
The birth-cohort effect: a specified fraction (cohort) of the cohort enter the susceptible pool aall at once.
Seasonality in the transmission rate: during school terms, the transmission rate is higher than it is during holidays.
Extra-demographic stochasticity in the form of a Gamma white-noise term acting multiplicatively on the force of infection.
Demographic stochasticity implmented using Euler-multinomial distributions.

rproc <- Csnippet("
  double beta, br, seas, foi, dw, births;
  double rate[6], trans[6];
  
  // cohort effect
  if (fabs(t-floor(t)-251.0/365.0) < 0.5*dt) 
    br = cohort*birthrate/dt + (1-cohort)*birthrate;
  else 
    br = (1.0-cohort)*birthrate;

  // term-time seasonality
  t = (t-floor(t))*365.25;
  if ((t>=7&&t<=100) || (t>=115&&t<=199) || (t>=252&&t<=300) || (t>=308&&t<=356))
      seas = 1.0+amplitude*0.2411/0.7589;
    else
      seas = 1.0-amplitude;

  // transmission rate
  beta = R0*(gamma+mu)*seas;
  // expected force of infection
  foi = beta*pow(I+iota,alpha)/pop;
  // white noise (extrademographic stochasticity)
  dw = rgammawn(sigmaSE,dt);

  rate[0] = foi*dw/dt;  // stochastic force of infection
  rate[1] = mu;             // natural S death
  rate[2] = sigma;        // rate of ending of latent stage
  rate[3] = mu;             // natural E death
  rate[4] = gamma;        // recovery
  rate[5] = mu;             // natural I death

  // Poisson births
  births = rpois(br*dt);
  
  // transitions between classes
  reulermultinom(2,S,&rate[0],dt,&trans[0]);
  reulermultinom(2,E,&rate[2],dt,&trans[2]);
  reulermultinom(2,I,&rate[4],dt,&trans[4]);

  S += births   - trans[0] - trans[1];
  E += trans[0] - trans[2] - trans[3];
  I += trans[2] - trans[4] - trans[5];
  R = pop - S - E - I;
  W += (dw - dt)/sigmaSE;  // standardized i.i.d. white noise
  C += trans[4];           // true incidence
")

In the above, $C$ represents the true incidence, i.e., the number of new infections occurring over an interval. Since recognized measles infections are quarantined, we argue that most infection occurs before case recognition so that true incidence is a measure of the number of individuals progressing from the I to the R compartment in a given interval.

We complete the process model definition by specifying the distribution of initial unobserved states. The following codes assume that the fraction of the population in each of the four compartments is known.

initlz <- Csnippet("
  double m = pop/(S_0+E_0+I_0+R_0);
  S = nearbyint(m*S_0);
  E = nearbyint(m*E_0);
  I = nearbyint(m*I_0);
  R = nearbyint(m*R_0);
  W = 0;
  C = 0;
")

The measurement model

We’ll model both under-reporting and measurement error. We want $\mathbb{E}[\text{cases}|C] = \rho\,C$ , where $C$ is the true incidence and $0<\rho<1$ is the reporting efficiency. We’ll also assume that $\mathrm{Var}[\text{cases}|C] = \rho\,(1-\rho)\,C + (\psi\,\rho\,C)^2$ , where $\psi$ quantifies overdispersion. Note that when $\psi=0$ , the variance-mean relation is that of the binomial distribution. To be specific, we’ll choose $\text{cases|C} \sim f(\cdot|\rho,\psi,C)$ , where $f(c|\rho,\psi,C) = \Phi(c+\tfrac{1}{2},\rho\,C,\rho\,(1-\rho)\,C+(\psi\,\rho\,C)^2)-\Phi(c-\tfrac{1}{2},\rho\,C,\rho\,(1-\rho)\,C+(\psi\,\rho\,C)^2),$ where $\Phi(x,\mu,\sigma^2)$ is the c.d.f. of the normal distribution with mean $\mu$ and variance $\sigma^2$ .

The following computes $\mathbb{P}[\text{cases}|C]$ .

dmeas <- Csnippet("
  double m = rho*C;
  double v = m*(1.0-rho+psi*psi*m);
  double tol = 1.0e-18;
  if (cases > 0.0) {
    lik = pnorm(cases+0.5,m,sqrt(v)+tol,1,0)-pnorm(cases-0.5,m,sqrt(v)+tol,1,0)+tol;
  } else {
    lik = pnorm(cases+0.5,m,sqrt(v)+tol,1,0)+tol;
  }
")

The following codes simulate $\text{cases} | C$ .

rmeas <- Csnippet("
  double m = rho*C;
  double v = m*(1.0-rho+psi*psi*m);
  double tol = 1.0e-18;
  cases = rnorm(m,sqrt(v)+tol);
  if (cases > 0.0) {
    cases = nearbyint(cases);
  } else {
    cases = 0.0;
  }
")

Constructing the `pomp` object

We put all the model components together with the data in a call to pomp:

dat %>% 
  pomp(t0=with(dat,2*time[1]-time[2]),
       time="time",
       rprocess=euler.sim(rproc,delta.t=1/365.25),
       initializer=initlz,
       dmeasure=dmeas,
       rmeasure=rmeas,
       covar=covar,
       tcovar="time",
       zeronames=c("C","W"),
       statenames=c("S","E","I","R","C","W"),
       paramnames=c("R0","mu","sigma","gamma","alpha","iota",
                    "rho","sigmaSE","psi","cohort","amplitude",
                    "S_0","E_0","I_0","R_0")
  ) -> m1

The following codes plot the data and covariates together.

m1 %>% as.data.frame() %>% 
  melt(id="time") %>%
  ggplot(aes(x=time,y=value))+
  geom_line()+
  facet_grid(variable~.,scales="free_y")

He et al. (2010) estimated the parameters of this model. The full set is included in the R code accompanying this document, where they are read into a data frame called mles.

mles %>% subset(town=="London") -> mle
paramnames <- c("R0","mu","sigma","gamma","alpha","iota",
                "rho","sigmaSE","psi","cohort","amplitude",
                "S_0","E_0","I_0","R_0")
mle %>% extract(paramnames) %>% unlist() -> theta
mle %>% subset(select=-c(S_0,E_0,I_0,R_0)) %>%
  knitr::kable(row.names=FALSE)

town	loglik	loglik.sd	mu	delay	sigma	gamma	rho	R0	amplitude	alpha	iota	cohort	psi	sigmaSE
London	-3804.9	0.16	0.02	4	28.9	30.4	0.488	56.8	0.554	0.976	2.9	0.557	0.116	0.0878

Verify that we get the same likelihood as He et al. (2010).

library(foreach)
library(doParallel)

registerDoParallel()

set.seed(998468235L,kind="L'Ecuyer")

foreach(i=1:4,
        .packages="pomp",
        .options.multicore=list(set.seed=TRUE)
) %dopar% {
  pfilter(m1,Np=10000,params=theta)
} -> pfs
logmeanexp(sapply(pfs,logLik),se=TRUE)

##                          se 
## -3800.9542945     0.3404236

Simulations at the MLE:

m1 %>% 
  simulate(params=theta,nsim=9,as.data.frame=TRUE,include.data=TRUE) %>%
  ggplot(aes(x=time,y=cases,group=sim,color=(sim=="data")))+
  guides(color=FALSE)+
  geom_line()+facet_wrap(~sim,ncol=2)

Parameter transformations

The parameters are constrained to be positive, and some of them are constrained to lie between $0$ and $1$ . We can turn the likelihood maximization problem into an unconstrained maximization problem by transforming the parameters. The following Csnippets implement such a transformation and its inverse.

toEst <- Csnippet("
  Tmu = log(mu);
  Tsigma = log(sigma);
  Tgamma = log(gamma);
  Talpha = log(alpha);
  Tiota = log(iota);
  Trho = logit(rho);
  Tcohort = logit(cohort);
  Tamplitude = logit(amplitude);
  TsigmaSE = log(sigmaSE);
  Tpsi = log(psi);
  TR0 = log(R0);
  to_log_barycentric (&TS_0, &S_0, 4);
")

fromEst <- Csnippet("
  Tmu = exp(mu);
  Tsigma = exp(sigma);
  Tgamma = exp(gamma);
  Talpha = exp(alpha);
  Tiota = exp(iota);
  Trho = expit(rho);
  Tcohort = expit(cohort);
  Tamplitude = expit(amplitude);
  TsigmaSE = exp(sigmaSE);
  Tpsi = exp(psi);
  TR0 = exp(R0);
  from_log_barycentric (&TS_0, &S_0, 4);
")


pomp(m1,toEstimationScale=toEst,
     fromEstimationScale=fromEst,
     statenames=c("S","E","I","R","C","W"),
     paramnames=c("R0","mu","sigma","gamma","alpha","iota",
                  "rho","sigmaSE","psi","cohort","amplitude",
                  "S_0","E_0","I_0","R_0")) -> m1

Construction of a likelihood profile

The linked document shows how a likelihood profile can be constructed using IF2.

Results from He et al. (2010)

Fitting procedures

A large Latin-hypercube design was used to initiate searches.
Iterated filtering was used to maximize the likelihood.
We obtained point estimates of all parameters for 20 cities.
We constructed profile likelihoods to quantify uncertainty in London and Hastings.

Imported infections

$\text{force of infection} = \mu_{SE}=\frac{\beta(t)}{N(t)}\,(I+\iota)\,\zeta(t)$

Seasonality

Notable findings

Cohort effect

Birth delay

Profile likelihood for birth-cohort delay, showing 95% and 99% critical values of the log likelihood.

Reporting rate

Predicted vs observed critical community size

Problematic results

$R_0$

Recall that $R_0$ is the basic reproduction number: a measure of how communicable an infection is.
Existing estimates of $R_0$ (c. 15–20) come from two sources:
serology surveys
models fit to data using feature-based methods

Parameter estimates

	pop	R0	amplitude	LP	IP	alpha	iota	rho	psi	sigmaSE
Halesworth	2170	33.100	0.381	7.870	2.290	0.948	0.00912	0.754	0.641	0.0748
Lees	4250	29.700	0.153	8.510	2.050	0.968	0.03110	0.612	0.681	0.0802
Mold	6410	21.400	0.271	5.930	1.780	1.040	0.01450	0.131	2.870	0.0544
Dalton.in.Furness	10600	28.300	0.203	5.480	1.980	0.989	0.03860	0.455	0.818	0.0779
Oswestry	11000	52.900	0.339	10.300	2.710	1.040	0.02980	0.631	0.476	0.0699
Northwich	18300	30.100	0.423	8.510	3.020	0.948	0.06020	0.795	0.402	0.0857
Bedwellty	28900	24.700	0.160	6.820	3.030	0.937	0.03960	0.311	0.951	0.0611
Consett	39100	35.900	0.200	9.080	2.660	1.010	0.07310	0.650	0.406	0.0712
Hastings	65700	34.200	0.299	7.000	5.440	1.000	0.18600	0.695	0.396	0.0955
Cardiff	245000	34.400	0.223	9.870	3.090	0.996	0.14100	0.602	0.270	0.0539
Bradford	294000	32.100	0.236	8.510	3.360	0.991	0.24400	0.599	0.190	0.0451
Hull	302000	38.900	0.221	9.180	5.460	0.968	0.14200	0.582	0.256	0.0636
Nottingham	307000	22.600	0.157	5.720	3.700	0.982	0.17000	0.609	0.258	0.0380
Bristol	443000	26.800	0.203	6.190	4.940	1.010	0.44100	0.626	0.201	0.0392
Leeds	510000	47.800	0.267	9.480	10.900	1.000	1.25000	0.666	0.167	0.0778
Sheffield	515000	33.100	0.313	7.230	6.380	1.020	0.85300	0.649	0.175	0.0428
Manchester	704000	32.900	0.290	11.100	6.940	0.965	0.59000	0.550	0.161	0.0551
Liverpool	802000	48.100	0.305	7.900	9.800	0.978	0.26300	0.494	0.136	0.0533
Birmingham	1120000	43.400	0.428	8.510	11.600	1.010	0.34300	0.544	0.178	0.0611
London	3390000	56.800	0.554	13.100	12.500	0.976	2.90000	0.488	0.116	0.0878
r	NA	0.455	0.301	0.322	0.946	0.106	0.93200	-0.203	-0.931	-0.3260

$r = \mathrm{cor}(\log{\hat\theta},\log{N_{1950}})$

Extrademographic stochasticity

$\mu_{SE}=\frac{\beta(t)}{N(t)}\,(I+\iota)\,\zeta(t)$

Questions

What does it mean that parameter estimates from the fitting disagree with estimates from other data?
How can one interpret the correlation between infectious period and city size in the parameter estimates?
How do we interpret the need for extrademographic stochasticity in this model?

Simulations at the MLE:

m1 %>% 
  simulate(params=theta,nsim=100,as.data.frame=TRUE,include.data=TRUE) %>%
  subset(select=c(time,sim,cases)) %>%
  mutate(data=sim=="data") %>%
  ddply(~time+data,summarize,
        p=c(0.05,0.5,0.95),q=quantile(cases,prob=p,names=FALSE)) %>%
  mutate(p=mapvalues(p,from=c(0.05,0.5,0.95),to=c("lo","med","hi")),
         data=mapvalues(data,from=c(TRUE,FALSE),to=c("data","simulation"))) %>%
  dcast(time+data~p,value.var='q') %>%
  ggplot(aes(x=time,y=med,color=data,fill=data,ymin=lo,ymax=hi))+
  geom_ribbon(alpha=0.2)

Exercises

Exercise: Reformulate the model

Modify the He et al. (2010) model to remove the cohort effect. Run simulations and compute likelihoods to convince yourself that the resulting codes agree with the original ones for cohort = 0.

Now modify the transmission seasonality to use a sinusoidal form. How many parameters must you use? Fixing the other parameters at their MLE values, compute and visualize a profile likelihood over these parameters.

Exercise: Extrademographic stochasticity

Set the extrademographic stochasticity parameter $\sigma_{SE}=0$ , set $\alpha=1$ , and fix $\rho$ and $\iota$ at their MLE values, then maximize the likelihood over the remaining parameters. How do your results compare with those at the MLE? Compare likelihoods but also use simulations to diagnose differences between the models.

Back to course homepage

R codes for this document

Profile likelihood computation for this example

References

Bhadra, A., E. L. Ionides, K. Laneri, M. Pascual, M. Bouma, and R. Dhiman. 2011. Malaria in Northwest India: Data analysis via partially observed stochastic differential equation models driven by Lévy noise. Journal of the American Statistical Association 106:440–451.

He, D., E. L. Ionides, and A. A. King. 2010. Plug-and-play inference for disease dynamics: Measles in large and small populations as a case study. Journal of the Royal Society, Interface 7:271–283.

Rohani, P., and A. A. King. 2010. Never mind the length, feel the quality: The impact of long-term epidemiological data sets on theory, application and policy. Trends in Ecology & Evolution 25:611–618.

Case study: Measles in large and small towns

Aaron A. King

Objectives

Measles revisited

Motivation: challenges in inference from disease dynamics

Outline

He, Ionides, & King, J. R. Soc. Interface (2010)

Data sets

Continuous-time Markov process model

Model implementation

Data and covariates

The partially observed Markov process model

The (unobserved) process model

The measurement model

Constructing the pomp object

Parameter transformations

Construction of a likelihood profile

Results from He et al. (2010)

Fitting procedures

Imported infections

Seasonality

Notable findings

Cohort effect

Birth delay

Reporting rate

Predicted vs observed critical community size

Problematic results

R0R_0

Parameter estimates

Extrademographic stochasticity

Questions

Exercises

Exercise: Reformulate the model

Exercise: Extrademographic stochasticity

Back to course homepage

R codes for this document

Profile likelihood computation for this example

References

Constructing the `pomp` object

$R_0$