Andrew Warren and Justine Kunz
May 4, 2020

Comparing the COVID-19 Incarceration Model to Real-World Outbreaks

A month ago, we released a model for Covid-19 outbreaks in prisons and jails. It re-implemented a common model for disease spread¹ but with Covid-19-specific assumptions, and variables that reflect the confined environments of prisons and jails.

We had to make a lot of assumptions: research on the spread of droplet-based disease in prisons and jails is extremely limited, and a lot was (and is) still unknown about Covid-19. But nothing else was available and criminal justice leaders had critical, urgent decisions to make², so we roped in some top minds from the criminal justice and epidemiology communities and released the first version of the model 72 hours later. We published it in Excel so that anyone could trace exactly what our assumptions were and how calculations were made.

Since then, a lot has changed. For one, we moved the model to the web, both to handle more complex math and to make it easier to model multiple facilities side-by-side. (Like the earlier model and everything else we do, the web model is open source and available publicly for anyone to read up on and check assumptions.) Also, we’ve continued to update the model in response to feature requests from the criminal justice community and to new developments in our understanding of Covid-19.

Testing the model against real-world data

Now, three weeks later, the situations that we built the model to project are no longer hypothetical — the top 10 coronavirus clusters in the US are prisons, meat packing plants, and a Navy ship. Staff and residents of prisons and jails across the country have now tested positive for Covid-19; many facilities have seen significant outbreaks and some have flattened the curve, as you’ll see in the data below.

As outbreaks occur, we’ve been checking the model against real-world numbers from across the country to help us continue to calibrate the model and to make it more useful now and during future waves of Covid-19 and other droplet-borne diseases.

The short version is that, as scary as the model’s numbers seem, they’ve proven conservative against the real-world outbreaks we’ve compared them to. But the story is more nuanced than that; read below for details.

Getting to ground truth

To compare the model to real-world spread in facilities, we first need data on the spread of Covid-19 in prisons and jails throughout the outbreak so far — the ‘ground truth’ that we’ll compare model output to.

We’re working closely with several jurisdictions on Covid-19, but we’ve used publicly-available data for this analysis, so that we can share these results more broadly³.

Evaluation setup

With publicly-available ground truth data, we’re going to run the model in two different scenarios for each of ten facilities:

  1. An optimistic scenario, which assumes the facility is at 90% of capacity and has 0% of its population in bunks (in the model, these parameters correspond to an R0 of 1.6 for staff and the population in cells, and 2.3 for those in bunks)
  2. A pessimistic scenario, which instead assumes 100% of capacity with 50% of the population in bunks (these parameters correspond to a higher rate of spread: an R0 of 2.2 for staff and the population in cells, and 4.3 for those in bunks)

(The model makes it easy for government staff to update these variables for their own systems, and can support both higher and lower rates of spread — but we still want to stick to publicly-available data here, so we’re going to use the same moderately ‘optimistic’ and ‘pessimistic’ parameters for each test⁴.)
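To make the two scenarios concrete, here is a minimal sketch of how they could feed a textbook SEIR simulation. It is not the published model: the population-weighted blend of the two R0 values, the 5-day incubation and 7-day infectious periods, the single seed case, and the 1,000-person capacity are all assumptions chosen for illustration. Only the capacity percentages, bunk percentages, and R0 values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    capacity_fraction: float  # share of facility capacity assumed occupied
    bunk_fraction: float      # share of the population housed in bunks
    r0_cells: float           # R0 for staff and people in cells (from the post)
    r0_bunks: float           # R0 for people in bunks (from the post)

    def blended_r0(self) -> float:
        # Assumption: a simple population-weighted average of the two R0 values.
        return ((1.0 - self.bunk_fraction) * self.r0_cells
                + self.bunk_fraction * self.r0_bunks)


OPTIMISTIC = Scenario("optimistic", 0.90, 0.0, 1.6, 2.3)
PESSIMISTIC = Scenario("pessimistic", 1.00, 0.5, 2.2, 4.3)


def simulate_seir(population, r0, days, incubation_days=5.0, infectious_days=7.0):
    """Discrete one-day-step SEIR; returns cumulative infections for each day.

    The incubation and infectious periods are illustrative assumptions.
    """
    sigma = 1.0 / incubation_days   # rate of moving from exposed to infectious
    gamma = 1.0 / infectious_days   # rate of moving from infectious to removed
    beta = r0 * gamma               # transmission rate implied by R0
    s, e, i = population - 1.0, 0.0, 1.0  # seed with one infectious person
    cumulative = []
    for _ in range(days):
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        cumulative.append(population - s)  # everyone who has ever been infected
    return cumulative


def project(scenario, capacity, days):
    """Run the sketch for one scenario at a hypothetical facility capacity."""
    population = scenario.capacity_fraction * capacity
    return simulate_seir(population, scenario.blended_r0(), days)


if __name__ == "__main__":
    for scenario in (OPTIMISTIC, PESSIMISTIC):
        curve = project(scenario, capacity=1000, days=60)
        print(f"{scenario.name}: blended R0 = {scenario.blended_r0():.2f}, "
              f"cumulative infections after 60 days = {curve[-1]:.0f}")
```

Under the blending assumption, the optimistic scenario runs at an effective R0 of 1.6 (nobody is in bunks) and the pessimistic scenario at 3.25, which is why the two curves diverge so quickly.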

Now that we have ground truth data and a set of common parameters for the optimistic and pessimistic scenarios, we’re going to take the output of the model for each scenario and compare it to the cases and deaths observed in each of ten facilities.
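The comparison itself amounts to checking, day by day, where the observed count sits relative to the two projections. The sketch below shows one way that check could look; the function name and all of the numbers are hypothetical and are not taken from any facility.

```python
def classify_against_band(observed, optimistic, pessimistic):
    """Label each day's observed count relative to the two model scenarios."""
    labels = []
    for obs, low, high in zip(observed, optimistic, pessimistic):
        if obs < low:
            labels.append("below optimistic")
        elif obs > high:
            labels.append("above pessimistic")
        else:
            labels.append("within band")
    return labels


if __name__ == "__main__":
    # Hypothetical cumulative case counts at four checkpoints.
    observed = [5, 40, 90, 110]
    optimistic = [2, 10, 40, 100]
    pessimistic = [4, 30, 120, 300]
    for day, label in enumerate(classify_against_band(observed, optimistic, pessimistic)):
        print(day, label)
```

The same check applies to deaths by passing cumulative death counts instead of case counts.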

Validation results

When we initially released the model, we received feedback from some agencies in the criminal justice space that the model seemed too aggressive. The model predicted infection rates and deaths that seemed unlikely in secure, access-restricted environments.

As we began testing the model against real-world numbers, we found two distinct patterns, each occurring in different facilities. We’ve picked some examples of each below.

Scenario 1: Consistent with model

In many facilities, the actual number of cases exceeded our pessimistic scenario early on, but then transitioned to sit in between the optimistic and pessimistic scenarios.

Cases in Michigan's Parnall Correctional Facility, April 2020

For example, in Michigan’s Parnall Correctional Facility cases took off quickly, before tapering down to a more modest, linear increase.

Another example of this occurs in Illinois’ Stateville Correctional Center:

Cases in Illinois’ Stateville Correctional Center, April 2020

As you can see here, the model is performing reasonably well.

It’s optimistic early on because it starts when the first case is reported, not when it occurs. Officials don’t always detect the first case right away (it may be asymptomatic, or symptom onset may be delayed), and that delay means the model is also late in projecting the increase in cases.

More positively, cases leveling off over the longer term might suggest that the facility has managed to reduce the rate of spread over time, potentially to zero. This occurs in both of the facilities above, but isn’t always the case, as Michigan’s Lakeland facility shows.

Cases in Michigan’s Lakeland Correctional Facility, April 2020

Scenario 2: More aggressive than model

As we reviewed more facilities, several more troubling examples emerged. For instance, in Ohio’s Marion Correctional Institution, cases outpaced the model early and quickly reached two-thirds of the facility population.

Cases in Ohio’s Marion Correctional Institution, April 2020

Similar rapid growth occurred in another Ohio facility, the Pickaway Correctional Institution.

Cases in Ohio’s Pickaway Correctional Institution, April 2020

Both are facilities where Ohio instituted full-facility testing on April 17th. At the time, both had confirmed case numbers only in the low hundreds. When the tests came back, 60–66% of each facility’s population had tested positive, pushing case numbers into the thousands.

Mass testing at other facilities has also yielded significant jumps in case counts. In Vermont, the entire population of the Northwest State Correctional Facility was tested on April 8, after the first resident was found Covid-19 positive. Symptoms-based testing had found one case; mass testing found 31 more. In Arkansas, the first case was identified in the Cummins Unit on April 11th, and by April 13th mass testing had found 43 more cases in the same barracks. A trend was emerging — when mass testing was applied, numbers increased much more quickly than we’d expected.

Following fatality rates

One of the challenges of modeling Covid-19 is the delayed onset of symptoms. In some individuals, symptoms are very mild; in others, they can take several days to weeks to develop; and in nearly 20% of cases symptoms may never develop at all.

Similarly, there are differences in testing routines. Some facilities detect cases early on, while others may have missed several initial cases and only become aware of the outbreak after the disease has spread. Some organizations test the entire facility, while others test only individuals exhibiting symptoms.

One metric that can’t mislead, however, is the final outcome. In a subset of cases, Covid-19 is fatal. Our model uses established fatality rates by age bracket, and in the validation exercises above we applied the age distribution of the federal prison system’s incarcerated population to the population of each facility.
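As a rough illustration of how age-bracketed fatality rates turn an infection count into expected deaths, here is a small sketch. The age brackets, population shares, and fatality rates below are made-up placeholders chosen only to make the arithmetic visible; they are not the values used in the model or the federal prison system's actual age distribution.

```python
def expected_deaths(infections, age_shares, fatality_rates):
    """Weight each age bracket's fatality rate by its share of the population."""
    if abs(sum(age_shares.values()) - 1.0) > 1e-6:
        raise ValueError("age shares must sum to 1")
    return sum(infections * share * fatality_rates[bracket]
               for bracket, share in age_shares.items())


# Placeholder values for illustration only (NOT the model's inputs).
AGE_SHARES = {"under_40": 0.6, "40_to_59": 0.3, "60_plus": 0.1}
FATALITY_RATES = {"under_40": 0.001, "40_to_59": 0.01, "60_plus": 0.05}

if __name__ == "__main__":
    deaths = expected_deaths(1000, AGE_SHARES, FATALITY_RATES)
    print(f"expected deaths among 1,000 infections: {deaths:.1f}")
```

Because the oldest bracket dominates the total even when it is a small share of the population, the assumed age distribution matters a great deal for the projected death count.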

Here is the number of deaths in each facility, compared to the model scenarios:

Deaths in Ohio’s Marion Correctional Institution, April 2020
Deaths in Ohio’s Pickaway Correctional Institution, April 2020
Deaths in Michigan’s Lakeland Correctional Facility, April 2020
Deaths in Michigan’s Parnall Correctional Facility, April 2020
Deaths in Illinois’ Stateville Correctional Center, April 2020

In most of the facilities with outbreaks, deaths have occurred sooner than the model expected given the reported first case, and have followed a path close to our pessimistic scenario.


It’s difficult to know the true state of Covid-19 in prisons and jails today. Based on what we’ve found, it’s likely that cases are being undercounted in facilities that aren’t performing mass testing.

In spite of early concerns that the model may be too pessimistic, our findings so far indicate that the reverse may be true — when actual case numbers are found with complete testing, more members of the population are Covid-19 positive and likely spreading than we or the states had anticipated. And although symptoms-based testing may mask the true extent of an outbreak, the deaths that result will still occur.

There are examples of facilities that have detected early cases quickly, and kept significant spread at bay, without mass testing⁵ — but given the experiences of NWSCF in Vermont, and Cummins Unit in Arkansas, containment seems much more likely in situations where mass testing is used early and often throughout the outbreak⁶. And there’s real reason to think this can help:

Deaths in Vermont’s Northwest State Correctional Facility, April 2020

Vermont performed mass testing immediately after the first case, kept testing over the following weeks, and isolated anyone who newly tested positive. The facility had no deaths, a complete reversal compared to both the model projections and the situation in the other facilities shown above.
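Repeated rounds matter because a single round misses a meaningful share of infections. Taking the roughly 65% PCR sensitivity cited in footnote 6, and assuming (as a simplification) that each round's result is independent of the others, the chance that an infected person is missed in every round shrinks geometrically with the number of rounds. The function names below are illustrative.

```python
import math


def miss_probability(sensitivity, rounds):
    """Chance an infected person tests negative in every round.

    Assumes test results are independent across rounds, which is a
    simplification; real false negatives may be correlated.
    """
    return (1.0 - sensitivity) ** rounds


def rounds_needed(sensitivity, target_detection):
    """Smallest number of rounds so detection probability meets the target."""
    return math.ceil(math.log(1.0 - target_detection) / math.log(1.0 - sensitivity))


if __name__ == "__main__":
    sensitivity = 0.65  # approximate figure from footnote 6
    for n in range(1, 5):
        print(f"{n} round(s): {miss_probability(sensitivity, n):.1%} chance of a miss")
    print("rounds for 95% detection:", rounds_needed(sensitivity, 0.95))
```

Under these assumptions, one round leaves a 35% chance of missing a given infected person, and three rounds are needed to push that below 5%.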

Our model is now learning from actual case counts and deaths confirmed in each facility (more on that in a subsequent post). We continuously revise it as we learn new information, and have recalibrated it to better fit the outbreaks that have occurred so far, for example by decreasing the overall time it takes for the disease to progress from the infectious period to recovery or death. Most of the differences we found have either been adjusted for in the current web version or are in the process of being addressed. For now, we hope that posting this exercise is helpful for criminal justice organizations on the front lines of this pandemic.

This analysis, of course, isn’t comprehensive. We’ll be continuing to run these analyses, so if there’s a jurisdiction that you’d like to see included in future posts please let us know at

Thank you to the broad constellation of collaborators — from individuals to nonprofits to prison staff — who have taken the time to offer feedback, contribute to the model, and help build a better understanding of how COVID-19 is spreading in prisons and jails.



  1. Specifically, the SEIR compartmental model.
  2. We’ve since seen some significant efforts at modeling outbreaks in jails and in ICE detention centers.
  3. More on ground truth data soon. The data used is a product of the UCLA Law Covid-19 Behind Bars Data Project and additional work by the Council of State Governments Justice Center and Recidiviz.
  4. We make a few more assumptions to make this exercise possible using publicly-available numbers. In particular, we use the facility capacity + staffing as the current population in the model.
  5. Pennsylvania’s SCI Phoenix location is a prototypical example of this.
  6. PCR testing (nasal swabs) has a sensitivity of ~65%, so multiple rounds are likely to be needed to have confidence that all infected individuals have been found and quarantined.
