Issue #60 closed
Jan Galkowski
created an issue

This is a top-level index and introduction to all government climate data. A scraper might wander off and duplicate a bunch we've done, e.g., via https://www.climate.gov/maps-data or https://www.climate.gov/maps-data/datasets. So I think the wget options would need to be tailored with care.
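One way to picture the tailoring is as a filter applied to each candidate URL before it is fetched. The sketch below is only an illustration: `should_fetch` and `ALREADY_SCRAPED_PREFIXES` are hypothetical names, and the prefix list assumes that everything under `/maps-data` has already been captured. wget's `--exclude-directories` option covers a similar case on the command line.

```python
from urllib.parse import urlsplit

# Hypothetical: path prefixes assumed to be covered by earlier scrapes.
ALREADY_SCRAPED_PREFIXES = ("/maps-data",)


def should_fetch(url, host="www.climate.gov",
                 skip_prefixes=ALREADY_SCRAPED_PREFIXES):
    """Return True if the URL is on the target host and not already covered."""
    parts = urlsplit(url)
    if parts.hostname != host:
        # Stay on the one site; do not wander off to other hosts.
        return False
    path = parts.path or "/"
    # Match the prefix itself or anything below it, but not look-alike
    # paths such as /maps-data-extra.
    return not any(path == p or path.startswith(p + "/")
                   for p in skip_prefixes)
```

With these assumptions, `https://www.climate.gov/maps-data/datasets` is skipped while other pages on the same host are still fetched.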

I welcome comments!

Comments (6)

  1. Greg Kochanski

    This download got terminated when the disk on my home machine crashed. Inspection shows 13232 files; it looks reasonably complete.

    However, I've restarted it from the Finnish machines to see if we can get more.

  2. Greg Kochanski

    Just terminated the climate.gov download on pub04.finland... I didn't get much useful info from a lot of extra downloads, just more of an apparently infinite set of access paths to the same files.

    It all looked like this for the last day or two:

    https://toolkit.climate.gov/case-studies?f[0]=field_climate_stressor%3A18&f[1]=field_parent_topic%3A665&f[2]=field_parent_topic%3A116&f[3]=field_workflow_step%3A60

    294654 files totaling 16 gigabytes, and there sure as heck aren't nearly 300,000 distinct hand-written case studies!
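The URL above suggests where the blow-up comes from: every combination and ordering of `f[N]=...` facet filters is a distinct URL. A minimal sketch of collapsing such URLs to one key before fetching follows. It assumes, without confirmation from the site, that the order and index of the facets do not change the page returned. `facet_key` is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Matches query keys of the form f[0], f[1], ... as seen in the URL above.
FACET_KEY = re.compile(r"^f\[\d+\]$")


def facet_key(url):
    """Build an order-insensitive key for a faceted listing URL.

    Assumption: two URLs with the same set of facet values select the
    same content regardless of facet order or index.
    """
    parts = urlsplit(url)
    facets = set()
    other = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if FACET_KEY.match(key):
            facets.add(value)           # drop the index, keep the value
        else:
            other.append((key, value))  # keep non-facet parameters as-is
    return (parts.hostname, parts.path, frozenset(facets),
            tuple(sorted(other)))
```

A crawler could keep a set of these keys and skip any URL whose key has already been seen. That removes reorderings of the same filters, though not the growth in the number of distinct facet combinations.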

    Beginning to hash.
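The hashing step can be sketched as grouping files by content digest, so that files reached through different URLs but holding identical bytes end up in the same group. This is an illustrative sketch rather than the script actually used. The choice of SHA-256 and the names `file_digest` and `group_duplicates` are assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_duplicates(root):
    """Map digest -> list of paths, keeping only digests seen more than once."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

This only catches byte-for-byte duplicates. Pages that differ in some incidental detail, such as an embedded copy of the request URL, would hash differently and need a looser comparison.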
