The “COVID-19 Cases and Deaths in Criminal Justice Facilities Dataset” is the most comprehensive dataset on the spread of COVID-19 inside prisons in the USA. It contains daily information on testing, positive cases and deaths for both inmate and staff populations. The dataset is based entirely on data made publicly available by individual state and federal corrections organizations.
A constellation of groups including CovidPrisonData.com, UCLA COVID-19 Behind Bars Data Project and Recidiviz have been aggregating this data since the start of the outbreak. Recidiviz’s participation has included manual data collection, historical checks for areas where we’ve found gaps, data aggregation and data quality investments.
If you end up using the data in an interesting way or are interested in volunteering with these data efforts, please reach out at email@example.com.
This dataset is publicly available. If you use it, please cite this work as follows:
Kaplan, Jacob, Hoyos-Torres, Sebastian, Gur, Oren, Concannon, Connor, Littman, Aaron, Jones, Nick. Covid-19 in Prisons in the United States. Covid Prison Data, 2020. Retrieved from https://covidprisondata.com/data.html.
See also:
Dolovich, Sharon. (2020). UCLA Law Covid-19 Behind Bars Data Project. Retrieved from: https://bit.ly/2xyFfX6.
Saunders, Jessica. (2020). Covid Custody Project Working File. Raw unpublished data.
Recidiviz. (2020). COVID-19 : Prison / Jail Cases. Retrieved from: https://bit.ly/3dgo77t
We’ve updated this page and kicked off internal data quality efforts. To provide transparency around these efforts, we’ve made our backlog of known issues publicly available.
Major changes to the dataset include:
About the dataset
This dataset is aggregated from multiple data sources and updated nightly.
Each row includes a ‘collections’ field, which lists the data sources that contributed to that row’s data, and a ‘source’ field, which gives the original source of the data (typically a government website). You can also find the component data at the sources listed in the citation above.
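As a rough illustration of how the ‘collections’ field could be used, the sketch below counts how many rows each contributing data source appears in. Only the ‘collections’ and ‘source’ field names come from the description above. The sample rows, the ‘facility’ column, the comma-separated encoding of ‘collections’ and the helper names are all assumptions made for the example, so check them against the actual file before relying on this.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. Only the 'collections' and 'source' field names
# come from the dataset description; everything else here is made up.
SAMPLE_CSV = """facility,collections,source
Facility A,"collection_one,collection_two",https://example.gov/a
Facility B,collection_one,https://example.gov/b
"""


def split_collections(value, sep=","):
    """Split a 'collections' cell into a list of contributor names.

    Assumes a delimiter-separated string; change `sep` if the real
    encoding differs.
    """
    return [name.strip() for name in value.split(sep) if name.strip()]


def count_rows_per_collection(csv_text):
    """Count how many rows each contributing data source appears in."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts.update(split_collections(row["collections"]))
    return counts


counts = count_rows_per_collection(SAMPLE_CSV)
for name, rows in sorted(counts.items()):
    print(name, rows)
```

To run this against a downloaded copy, the same function would take the file's text in place of SAMPLE_CSV.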
We review facilities for data quality on an ongoing basis. Through this process, we’ve discovered missing data, improperly labeled data (e.g., active cases being reported as cumulative cases to date) and other issues. We track these issues in a public backlog.
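One simple way to spot the mislabeling described above is to look for a cumulative series that goes down, since a true cumulative count should never decrease. The sketch below is an illustrative check only, not the review process actually used for this dataset. The function name and the sample values are hypothetical.

```python
def find_cumulative_drops(counts):
    """Return (index, previous, current) for each point where the series falls.

    `counts` is a date-ordered list of reported values. None marks a day
    with no report and is skipped rather than treated as zero.
    """
    drops = []
    previous = None
    for index, current in enumerate(counts):
        if current is None:
            continue
        if previous is not None and current < previous:
            drops.append((index, previous, current))
        previous = current
    return drops


# Hypothetical daily values for one facility, not taken from the dataset.
reported = [3, 5, None, 9, 4, 6]
drops = find_cumulative_drops(reported)
print(drops)
```

A drop is only a flag for manual review. It may indicate active cases reported as cumulative, but it could also reflect a correction published by the original source.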
In addition to our prison dataset, we’ve also collected information for a subset of county jails. This data uses the same methodology as our prison dataset, although we have not yet made similar investments in data quality for the jail data.
This dataset brings together the work of several organizations and individuals who have worked tirelessly to preserve a record of COVID-19 outbreaks in prisons and jails, including: