Stochastic model of the spread of Ebola

A few weeks ago I heard Oskar Hallatschek from the physics department give a talk about the spatial dynamics of evolutionary spread. He was modeling evolutionary drift, but the same stochastic models can apply to epidemics too: the analogy is between mutations popping up in nearby populations and disease outbreaks. The talk was very timely considering the Ebola crisis. I keep picturing the haunting series of animated simulations of spread events that he showed (you can see a static picture of the end result here). I’m curious how the dynamics of Ebola relate to these stochastic models. It’s intuitively obvious that the way an outbreak spreads depends on how many people are around, how contagious the disease is, etc., but it’s not so clear how to model and estimate these parameters.

The best study of the dynamics of this Ebola outbreak that I’ve seen so far came out of PLOS. People can speculate all day about what will happen, but I always appreciate it when there’s some data backing up the speculation. Plus, their data visualizations were nice enough to be shown as evidence in Congress the other day. Using standard stochastic models for epidemics, the authors estimated the basic reproductive number in West Africa with data from the beginning of July. They found that the reproductive number is around R0 = 1.8, which means that, on average, one person with Ebola will generate 1.8 more Ebola cases. In this model, if R0 > 1, the disease will tend to spread and become an epidemic.
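To get a feel for what R0 = 1.8 means in a stochastic setting, here is a minimal sketch of a branching process. It is not the model from the PLOS study: I am assuming each case independently infects a Poisson-distributed number of new people with mean R0, and the function names, the cap of 200 cases, and the run counts are all made up for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def outbreak_dies_out(r0, rng, cap=200, max_generations=200):
    """Follow one outbreak started by a single case.

    Returns True if the chain of infections goes extinct, and False if it
    reaches `cap` simultaneous cases (treated here as "took off").
    Assumption: each case infects Poisson(r0) others, independently.
    """
    cases = 1
    for _ in range(max_generations):
        if cases == 0:
            return True
        if cases >= cap:
            return False
        # Each current case generates its own random number of new cases.
        cases = sum(poisson(r0, rng) for _ in range(cases))
    return cases == 0


def extinction_fraction(r0, n_runs=1000, seed=0):
    """Fraction of simulated outbreaks that fizzle out on their own."""
    rng = random.Random(seed)
    died = sum(outbreak_dies_out(r0, rng) for _ in range(n_runs))
    return died / n_runs


if __name__ == "__main__":
    for r0 in (0.8, 1.8):
        print(f"R0 = {r0}: fraction extinct = {extinction_fraction(r0):.2f}")
```

Under these assumptions, an outbreak with R0 below 1 essentially always fizzles out, while with R0 = 1.8 a sizable minority of introductions still die out by chance and the rest grow until they hit the cap. That chance element is exactly what a deterministic model would miss.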

Then, after selecting models for the local outbreaks in West Africa, the authors combined them with airline data to estimate the international spread of Ebola via simulations. Models like Hallatschek’s evolutionary spread dynamics and typical epidemic models aren’t able to capture this, because it involves “importation events”: big jumps across distances that wouldn’t be possible but for modern transportation. The authors considered two cases: one in which the Nigeria outbreak is contained during the timeline of their simulations, and one in which it is not (fortunately, it has since been contained). They estimated both the probability of importation events (see the plots below) and the number of cases generated by an importation event in each country. Judging by their results, it seems quite likely that Ebola will spread across the globe, but vigilant public health efforts will be able to prevent a large number of cases.

This study came out more than 6 weeks ago, so I’d be interested to see updated projections given the new data since then.

Food network flows in the U.S.

This network visualization caught my eye on Twitter, so I just had to investigate. It displays the volume of food imports and exports between states in 2007. The authors do several nice things in this plot: they use both color and position on the circle to indicate total volume for a state, rather than just naively plotting in alphabetical order (the Austria First! principle); they use line width to indicate the volume of trade between states; and they use line color to indicate whether the trade is incoming or outgoing.

The paper that this plot comes from is a straightforward mathematical analysis of the food commodity trade in the U.S.   The paper is sterile in the best way possible: they don’t make any claims about what the properties of the network mean beyond the mathematical estimates.  There are no modeling assumptions or p-values, just observations about the way food moves throughout the nation.  They don’t overstate the importance of their findings.  I like it.

They bring up two interesting points about the nodes (trade regions), one about their number of trade neighbors and one about their strength (volume of trade in and out):

  • Distribution of node neighbors: The distribution of global trade for all commodities is scale-free, meaning that the proportion of trade centers with more than k trade neighbors decays approximately like (1/k)^2. This implies that there are many nodes with few neighbors and a few nodes with many neighbors. In contrast, the distribution of food trade in the U.S. is approximately normal, meaning that there are many nodes with a moderate number of trade neighbors and few nodes with very many or very few neighbors. This distribution is more characteristic of a social network.
  • Node strength rankings: The places with the greatest inward flow strength include New Orleans-Metairie-Bogalusa, Texas, Los Angeles-Long Beach-Riverside, and Chicago-Naperville-Michigan City. That’s presumably because they’re hubs for railroads and ports, so lots of things get sent there and then leave the country rather than go to other states. Conversely, the places with the greatest outward flow strength include Iowa, Illinois, Missouri, and Nebraska. They probably send out enormous volumes of corn and soy, America’s favorite cash crops.
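For anyone who wants to compute the two quantities in the bullets above, here is a rough sketch of how they could be pulled out of a list of directed, weighted flows. The function name and the toy flows are hypothetical and not taken from the paper; I am simply treating "number of neighbors" as the count of distinct trade partners and "strength" as total volume in or out.

```python
from collections import defaultdict


def node_stats(flows):
    """Summarize a directed, weighted trade network.

    `flows` is an iterable of (origin, destination, volume) tuples.
    Returns {node: {"degree": ..., "in": ..., "out": ...}} where degree is
    the number of distinct trade partners and in/out are summed volumes.
    """
    in_strength = defaultdict(float)
    out_strength = defaultdict(float)
    neighbors = defaultdict(set)

    for origin, dest, volume in flows:
        out_strength[origin] += volume
        in_strength[dest] += volume
        # A partner counts as a neighbor regardless of flow direction.
        neighbors[origin].add(dest)
        neighbors[dest].add(origin)

    return {
        node: {
            "degree": len(partners),
            "in": in_strength.get(node, 0.0),
            "out": out_strength.get(node, 0.0),
        }
        for node, partners in neighbors.items()
    }


if __name__ == "__main__":
    # Made-up volumes, purely for illustration.
    toy_flows = [
        ("Iowa", "Chicago", 5.0),
        ("Iowa", "New Orleans", 3.0),
        ("Chicago", "New Orleans", 2.0),
    ]
    stats = node_stats(toy_flows)
    # Rank nodes by inward strength, largest first.
    for node in sorted(stats, key=lambda n: stats[n]["in"], reverse=True):
        print(node, stats[node])
```

A histogram of the "degree" values is what you would compare against a power law or a normal curve, and sorting by "in" or "out" gives rankings like the ones in the second bullet.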

There’s nothing particularly surprising about these results, but it’s nice to see them presented in a clear way supported by the numbers.