University of Idaho Gridded Surface Meteorological Data (UofI METDATA)

Sakari Albert Maaranen created an issue

2016-12-17 Hetzner storage box in Germany

wget --wait=12 --random-wait --prefer-family=IPv4 --verbose=on \
 --dns-timeout=10 --connect-timeout=20 --read-timeout=120 \
 --tries=40 --timestamping=on --recursive --level=inf \
 --no-remove-listing --output-file=climate.nkn.uidaho.edu_METDATA.log \
 --follow-ftp --no-check-certificate \
 -H, \ \ \

  1. Sakari Albert Maaranen reporter

    Abatzoglou, John wrote 19 December 2016 at 22:09:

    Hello Sakari,

    Thank you for your assistance and support of climate science research and associated datasets. I have asked Dr. Katherine Hegewisch (cc’ed here) to see if she could look into checksums for these files. We do update these data daily, so it may be fine to constrain the focus to data prior to 2016.

    You have my permission to mirror these data.

    I am not as worried about my datasets as they are on servers hosted by US universities, rather than the federal gov’t. Our data have also been shared with the Google cloud through Earth Engine in the event that helps prioritize other datasets for the refuge efforts.

    Thanks for your efforts!
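
A minimal sketch of how a checksum manifest for a mirrored tree could be produced and re-checked later, in case it helps with the checksum question above. The function names `write_manifest` and `verify_manifest` are made up for illustration, and the sketch assumes GNU coreutils (`sha256sum`, `sort -z`, `xargs -0 -r`). Since the data are updated daily, it probably makes most sense to run it over a stable subset, such as the pre-2016 files.

```shell
# write_manifest DIR MANIFEST
# Hash every regular file under DIR, using paths relative to DIR.
# Keep MANIFEST outside DIR so the manifest is not hashed itself.
write_manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_manifest DIR MANIFEST
# Re-hash the files listed in MANIFEST; returns non-zero on any mismatch.
verify_manifest() {
  ( cd "$1" && sha256sum --quiet -c - ) < "$2"
}
```

Usage would be along the lines of `write_manifest /path/to/mirror metdata.sha256` on the source side and `verify_manifest /path/to/copy metdata.sha256` on the mirror, with both paths being placeholders.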


  2. Sakari Albert Maaranen reporter

    This backup script has already been running for several days and still has not finished. Something is probably wrong with the wget usage. Perhaps we should focus only on the files they have hashed above.
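
One possible way to narrow the crawl, sketched under assumptions rather than tested against the real server: stay below the start directory, accept only data files, and skip URLs with query strings, which are often dynamically generated pages. The function name `mirror_metdata`, the `example.org` URL, and the `*.nc` (NetCDF) pattern are all placeholders or guesses. With `DRY_RUN` set, the function only prints the command.

```shell
# Set DRY_RUN to any non-empty value to print the command instead of running it.
DRY_RUN=1

# mirror_metdata URL
#   --no-parent      never ascend above the start directory
#   --accept         keep only files matching the pattern (assumed: NetCDF)
#   --reject-regex   skip any URL containing "?" (query strings)
mirror_metdata() {
  ${DRY_RUN:+echo} wget \
    --recursive --level=inf \
    --no-parent \
    --accept '*.nc' \
    --reject-regex '[?]' \
    --timestamping \
    --wait=12 --random-wait \
    --tries=40 \
    "$1"
}

mirror_metdata "https://example.org/METDATA/"
```

Even with `--accept`, wget still fetches HTML index pages to discover links and deletes them afterwards, so some extra traffic is expected.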

  3. Sakari Albert Maaranen reporter

    Interrupted the download. The valuable data is probably there, but there must be loads of unnecessary extra files. Someone who understands this data set should investigate and perhaps drop the excess. Keep at least the hashed files.

  4. Sakari Albert Maaranen reporter

    The following directories and files were created as a part of this job. @Joos-gcv must have gzipped the transfer log when the transfer was still in progress, so it may be partial.

    0 Dec 31 21:03
    0 Dec 31 21:25
    8740817 Jan  3 13:05 climate.nkn.uidaho.edu_METDATA.log.gz
    0 Dec 31 20:48
    0 Dec 21 12:01
    0 Dec 18 19:41
    0 Dec 20 23:37

    All the above directories and files should now be moved together, away from datarefuge. I have a disk usage command running and will report size as soon as it completes.
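
A rough sketch for checking how far the gzipped transfer log actually got. `gzip -t` only confirms that the compressed stream is intact. It cannot tell whether wget was still writing when the file was compressed, so the sketch also prints the last request line. It assumes wget's verbose log marks each request with a line starting `--YYYY-MM-DD HH:MM:SS--`, and the function name `last_logged_request` is made up here.

```shell
# last_logged_request LOG.gz
# Fails if the gzip stream is damaged or truncated; otherwise prints the
# last line that looks like a wget request header ("--<timestamp>--  <url>").
last_logged_request() {
  gzip -t "$1" || return 1
  gzip -dc "$1" | grep '^--[0-9]' | tail -n 1
}
```

Comparing the timestamp it prints with the modification time of the `.gz` file and with the time the download was interrupted should show whether the log is partial.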

  5. Sakari Albert Maaranen reporter
    [sam@azi03 datarefuge]$ nice ionice du -s -c -b *
    8740817 climate.nkn.uidaho.edu_METDATA.log.gz
  6. Sakari Albert Maaranen reporter

    No worries. I wasn't expecting my initial download to take weeks... I started it before we had any azi## servers. Most likely, and again just a guess, the valuable parts had already been transferred and the process was churning through some dynamically generated pages. Anyway, need to inspect to make sure.

  7. Sakari Albert Maaranen reporter

    @Joos-gcv where have you moved the directory? I didn't ask you to move it. Doesn't matter as long as it is safe. We should keep it together with all directories mentioned here.

    I am copying the northwestknowledge directories to pub05:/var/local/sam/datarefuge/.

    Please make sure there is no overlapping work.

  8. Sakari Albert Maaranen reporter

    Also climate.nkn.uidaho.edu_METDATA.log.gz is now at pub05:/var/local/sam/datarefuge/

  9. Sakari Albert Maaranen reporter

    Never mind. Found it and moved it to pub05. Once the copying of *northwestknowledge* finishes, these data will all live in the same place.

  10. Sakari Albert Maaranen reporter
    sent 1847595663018 bytes  received 2715273 bytes  60901471.06 bytes/sec
    total size is 1847351284846  speedup is 1.00

    The command was:

    [sam@pub05 ~]$ ionice -c 2 -n 5 rsync -avub /media/datarefuge/*northwestknowledge* /var/local/sam/datarefuge
  11. Sakari Albert Maaranen reporter
    • changed status to resolved
    • edited description

    Not sure if this is complete, but it is a lot.

    [sam@pub05 ~]$ du --apparent-size --summarize --total -BG /var/local/sam/*
    1G      /var/local/sam/
    1G      /var/local/sam/climate.nkn.uidaho.edu_METDATA.log.gz
    1G      /var/local/sam/
    1G      /var/local/sam/
    1G      /var/local/sam/
    1108G   /var/local/sam/
    613G    /var/local/sam/
    1721G   total
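
As a quick sanity check, the `total size` reported by rsync in comment 10 can be converted to the unit du uses with `-BG` (GiB, rounded up). This is only a rough consistency check: the du total also covers the small entries and directory overhead under /var/local/sam/, and it says nothing about whether the original download was complete.

```shell
# "total size" from the rsync summary in comment 10, in bytes
rsync_total_bytes=1847351284846
gib=$((1024 * 1024 * 1024))

# du -BG rounds up to whole GiB, so use ceiling division
rounded_up_gib=$(( (rsync_total_bytes + gib - 1) / gib ))
echo "${rounded_up_gib}G"   # prints 1721G
```

The result matches the 1721G total shown by du above.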
