Looking at John Snow’s cholera map from the XXIst Century

This repository contains the data, code and text for the practical chapter on reproducibility and open science in the book "Regional Research Frontiers".

Analysis components

  • Read in the data (R).
  • Visualize it in a couple of ways (R): raw data with streets, choropleth of aggregated counts by street or polygon, kernel density of points.
  • Exploratory analysis (PySAL): global and local Moran's I for aggregated counts.
  • Exploratory analysis (R): point patterns (tentative; may not fit within the space constraints).
  • Confirmatory analysis (PySAL): regression of the number of deaths by street segment as a function of the distance to several pumps. Hopefully the "bad pump" will be the only significant one ;-)
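
To make the exploratory step more concrete, here is a minimal sketch of the global Moran's I statistic in plain Python. It does not use PySAL's API and is not the code of this repository; the function name `morans_i`, the dict-of-dicts weights layout and the toy numbers are all assumptions made up for illustration, not values from the Snow data.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  -- list of observations, one per spatial unit
    weights -- dict mapping unit index i to {neighbor index j: weight w_ij}
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all spatial weights (S0)
    s0 = sum(w for nbrs in weights.values() for w in nbrs.values())

    # Weighted cross-products of deviations between neighbors
    num = sum(
        w * dev[i] * dev[j]
        for i, nbrs in weights.items()
        for j, w in nbrs.items()
    )

    # Sum of squared deviations
    den = sum(d * d for d in dev)

    return (n / s0) * num / den


# Hypothetical toy data: five street segments arranged in a chain,
# where each segment neighbors the one before and after it.
chain = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
clustered = [10, 8, 3, 1, 0]    # made-up counts that decay along the chain
alternating = [1, 0, 1, 0, 1]   # made-up counts that flip between neighbors

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
print(i_clustered, i_alternating)
```

A positive value indicates that similar counts sit next to each other, while a negative value indicates that neighbors tend to differ. In the chapter itself, PySAL would additionally provide inference (for example, permutation-based pseudo p-values) and the local version of the statistic.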

Outline of the chapter

  • Introduction: what reproducibility is and why it matters; the concept of workflow and its role in reproducibility/open science; what this chapter is about (stress on the process more than the results).
  • Background on the application: John Snow's story and data.
  • Analysis.
  • Conclusion of the chapter.