University of Idaho Gridded Surface Meteorological Data (UofI METDATA)

Issue #1 resolved
Sakari Maaranen created an issue

2016-12-17 Hetzner storage box in Germany

wget --wait=12 --random-wait --prefer-family=IPv4 --verbose \
 --dns-timeout=10 --connect-timeout=20 --read-timeout=120 \
 --tries=40 --timestamping=on --recursive --level=inf \
 --no-remove-listing --output-file=climate.nkn.uidaho.edu_METDATA.log \
 --follow-ftp --no-check-certificate \
 -H -Dclimate.nkn.uidaho.edu,northwestknowledge.net \
 http://climate.nkn.uidaho.edu/METDATA/ \
 https://www.northwestknowledge.net/metdata/data/ \
 http://thredds.northwestknowledge.net:8080/thredds/reacch_climate_MET_catalog.html

Comments (22)

  1. Sakari Maaranen reporter

    Abatzoglou, John (jabatzoglou@uidaho.edu) wrote on 19 December 2016 at 22:09:

    Hello Sakari,

    Thank you for your assistance and support of climate science research and associated datasets. I have asked Dr. Katherine Hegewisch (cc’ed here) to see if she could look into checksums for these files. We do update these data daily, so it may be fine to constrain the focus to data prior to 2016.

    You have my permission to mirror these data.

    I am not as worried about my datasets, as they are on servers hosted by US universities rather than the federal gov’t. Our data have also been shared with the Google cloud through Earth Engine, in the event that helps prioritize other datasets for the refuge efforts.

    Thanks for your efforts!

    John

  2. Jan Galkowski
    From: "Hegewisch, Katherine (khegewisch@uidaho.edu)" <khegewisch@uidaho.edu>
    To: "Abatzoglou, John (jabatzoglou@uidaho.edu)" <jabatzoglou@uidaho.edu>,
        "Sakari A. Maaranen" <sakari.maaranen@gmail.com>
    CC: John Baez <baez@math.ucr.edu>, "jan@westwood-statistical-studios.org"
        <jan@westwood-statistical-studios.org>, Scott Maxwell
        <marsroverdriver@gmail.com>, "dave.tanzer@gmail.com" <dave.tanzer@gmail.com>
    Subject: Re: Backup mirror of UofI METDATA
    Date: Thu, 22 Dec 2016 06:22:05 +0000
    
    Sakari and others,
    
    
    I have made the checksums for the UofI METDATA/gridMET files (1979-2015) as both md5sums and sha256sums.
    
    
    You can find these hash files here:
    
    https://www.northwestknowledge.net/metdata/data/hash.md5
    
    https://www.northwestknowledge.net/metdata/data/hash.sha256
    
    
    After you download the files, you can check the sums with:
    
    md5sum -c hash.md5
    
    sha256sum -c hash.sha256
    
    
    Please let me know if something is not ideal and we'll fix it!
    
    Thanks for suggesting we do this!
    
    
    Katherine
    
    ________________________________
    From: Abatzoglou, John (jabatzoglou@uidaho.edu)
    Sent: Monday, December 19, 2016 12:09:11 PM
    To: Sakari A. Maaranen; Hegewisch, Katherine (khegewisch@uidaho.edu)
    Cc: John Baez; jan@westwood-statistical-studios.org; Scott Maxwell; dave.tanzer@gmail.com
    Subject: Re: Backup mirror of UofI METDATA
    
    Hello Sakari,
    
    Thank you for your assistance and support of climate science research and associated datasets.  I have asked Dr. Katherine Hegewisch (cc'ed here) to see if she could look into checksums for these files.  We do update these data daily, so it may be fine to constrain the focus to data prior to 2016.
    
    You have my permission to mirror these data.
    
    I am not as worried about my datasets, as they are on servers hosted by US universities rather than the federal gov't. Our data have also been shared with the Google cloud through Earth Engine, in the event that helps prioritize other datasets for the refuge efforts.
    
    Thanks for your efforts!
    
    John
    
    
    On Dec 17, 2016, at 7:00 PM, Sakari A. Maaranen <sakari.maaranen@gmail.com> wrote:
    
    Dear John Abatzoglou,
    
  3. Sakari Maaranen reporter

    This backup script has already been running for several days and still has not finished. Something is probably wrong with the wget usage. Perhaps we should focus only on the files they have hashed above.
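
    One way to focus on the hashed files is to build a URL list from the published hash file instead of crawling recursively. The following is only a sketch: it assumes hash.sha256 uses the usual `sha256sum` layout (digest, then path) and that the paths are relative to the data directory URL given in Katherine's message. Neither assumption has been checked against the real file, and `hash_file_to_urls` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# ASSUMPTION: paths in the hash file are relative to this directory URL.
BASE_URL="https://www.northwestknowledge.net/metdata/data"

# Hypothetical helper: print one download URL per line of a
# sha256sum-style hash file ("<digest>  <path>").
# Paths containing spaces are not handled.
hash_file_to_urls() {
  # $1: path to the hash file
  awk -v base="$BASE_URL" 'NF >= 2 {
    path = $2
    sub(/^\*/, "", path)     # drop the binary-mode marker, if present
    sub(/^\.\//, "", path)   # drop a leading "./", if present
    print base "/" path
  }' "$1"
}

# Possible usage (needs network access, not run here):
#   wget https://www.northwestknowledge.net/metdata/data/hash.sha256
#   hash_file_to_urls hash.sha256 > urls.txt
#   wget --wait=12 --random-wait --timestamping \
#        --force-directories --no-host-directories --cut-dirs=2 \
#        --input-file=urls.txt
#   sha256sum -c --quiet hash.sha256
```

    With a fixed URL list, wget fetches a bounded set of files and cannot wander into dynamically generated pages. The `--cut-dirs=2` value assumes the two leading path components `metdata/data` should be stripped so that local paths match the ones in the hash file.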

  4. Sakari Maaranen reporter

    Interrupted the download. The valuable data is probably there, but there must also be loads of unnecessary extra files. Someone who understands this data set should investigate and perhaps drop the excess. Keep at least the hashed files.
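
    To see what the excess is before dropping anything, the mirror can be compared against the hash file. This is a rough sketch, assuming a `sha256sum`-style hash file whose paths are relative to the mirrored data directory; `list_unhashed` is a hypothetical helper, and it only lists candidates, it deletes nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print files under a mirror directory that the hash
# file does not mention. File names with spaces or newlines are not handled.
list_unhashed() (
  # The body runs in a subshell so the locale change stays local.
  # comm needs both inputs sorted in the same collation order.
  export LC_ALL=C
  dir=$1      # mirror directory
  hashes=$2   # sha256sum-style hash file
  comm -23 \
    <(cd "$dir" && find . -type f | sed 's|^\./||' | sort) \
    <(awk 'NF >= 2 { p = $2; sub(/^\*/, "", p); sub(/^\.\//, "", p); print p }' \
        "$hashes" | sort)
)

# Possible usage (paths are illustrative):
#   list_unhashed www.northwestknowledge.net/metdata/data hash.sha256 > extra_files.txt
```

    `comm -23` keeps only the lines unique to the first input, so the output is the set of downloaded files with no checksum entry.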

  5. Sakari Maaranen reporter

    The following directories and files were created as a part of this job. @Jan Galkowski must have gzipped the transfer log when the transfer was still in progress, so it may be partial.

    0 Dec 31 21:03 climate.nkn.uidaho.edu
    0 Dec 31 21:25 climate.northwestknowledge.net
    8740817 Jan  3 13:05 climate.nkn.uidaho.edu_METDATA.log.gz
    0 Dec 31 20:48 maca.northwestknowledge.net
    0 Dec 21 12:01 metdata.northwestknowledge.net
    0 Dec 18 19:41 thredds.northwestknowledge.net:8080
    0 Dec 20 23:37 www.northwestknowledge.net
    

    All the above directories and files should now be moved together, away from datarefuge. I have a disk usage command running and will report size as soon as it completes.

  6. Sakari Maaranen reporter
    [sam@azi03 datarefuge]$ nice ionice du -s -c -b *
    4207353 climate.nkn.uidaho.edu
    8740817 climate.nkn.uidaho.edu_METDATA.log.gz
    48418317        climate.northwestknowledge.net
    12541147        maca.northwestknowledge.net
    467007  metdata.northwestknowledge.net
    1189464254871   thredds.northwestknowledge.net:8080
    657825603504    www.northwestknowledge.net
    
  7. Jan Galkowski

    I am sorry about that. I was trying to ascertain which should be moved and was trying to make things as compact as possible. I had no idea there were active processes still accessing /media/datarefuge, as I thought such practice was deprecated. I should have checked with lsof. Apologies.

  8. Sakari Maaranen reporter

    No worries. I wasn't expecting my initial download to take weeks... I started it before we had any azi## servers. Most likely, and again this is just a guess, the valuable parts had already been transferred and the process was churning through some dynamically generated pages. Anyway, we need to inspect it to make sure.

  9. Sakari Maaranen reporter

    @Jan Galkowski where have you moved the climate.nkn.uidaho.edu directory? I didn't ask you to move it. Doesn't matter as long as it is safe. We should keep it together with all directories mentioned here.

    I am copying the northwestknowledge directories to pub05:/var/local/sam/datarefuge/.

    Please make sure there is no overlapping work.

  10. Sakari Maaranen reporter
    sent 1847595663018 bytes  received 2715273 bytes  60901471.06 bytes/sec
    total size is 1847351284846  speedup is 1.00
    

    The command was:

    [sam@pub05 ~]$ ionice -c 2 -n 5 rsync -avub /media/datarefuge/*northwestknowledge* /var/local/sam/datarefuge
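
    As a sanity check, the "total size" reported by rsync can be compared with the `du -s -c -b` byte counts from comment 6. The sketch below sums only the entries matched by the `*northwestknowledge*` glob; the listing is copied from that comment, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Byte counts copied from the `du -s -c -b` listing in comment 6.
du_listing='4207353 climate.nkn.uidaho.edu
8740817 climate.nkn.uidaho.edu_METDATA.log.gz
48418317 climate.northwestknowledge.net
12541147 maca.northwestknowledge.net
467007 metdata.northwestknowledge.net
1189464254871 thredds.northwestknowledge.net:8080
657825603504 www.northwestknowledge.net'

# Sum only the entries that the rsync glob *northwestknowledge* would match.
# %.0f avoids integer overflow in awk implementations with a 32-bit %d.
matched_total=$(printf '%s\n' "$du_listing" |
  awk '$2 ~ /northwestknowledge/ { sum += $1 } END { printf "%.0f\n", sum }')

echo "$matched_total"   # 1847351284846, equal to rsync's "total size"
```

    The match suggests that rsync saw the same four directories, byte for byte in apparent size, as the earlier `du` run. It does not verify file contents; that still needs the checksum files.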
    
  11. Sakari Maaranen reporter
    • changed status to resolved
    • edited description

    Not sure if this is complete, but it is a lot.

    [sam@pub05 ~]$ du --apparent-size --summarize --total -BG /var/local/sam/*
    1G      /var/local/sam/climate.nkn.uidaho.edu
    1G      /var/local/sam/climate.nkn.uidaho.edu_METDATA.log.gz
    1G      /var/local/sam/climate.northwestknowledge.net
    1G      /var/local/sam/maca.northwestknowledge.net
    1G      /var/local/sam/metdata.northwestknowledge.net
    1108G   /var/local/sam/thredds.northwestknowledge.net:8080
    613G    /var/local/sam/www.northwestknowledge.net
    1721G   total
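
    The `1G` entries do not mean those items are close to a gigabyte: with `-BG`, GNU du rounds every size up to the next whole GiB. A small sketch of that rounding, assuming the byte counts from the `du -b` listing in comment 6 also apply to the copy on pub05 (`bytes_to_gib_ceil` is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Round a byte count up to whole GiB, mirroring how du -BG reports sizes.
bytes_to_gib_ceil() {
  local bytes=$1 gib=$((1024 * 1024 * 1024))
  echo $(( (bytes + gib - 1) / gib ))
}

bytes_to_gib_ceil 467007          # 1     (metdata.northwestknowledge.net)
bytes_to_gib_ceil 1189464254871   # 1108  (thredds.northwestknowledge.net:8080)
bytes_to_gib_ceil 657825603504    # 613   (www.northwestknowledge.net)
bytes_to_gib_ceil 1847364233016   # 1721  (sum of all seven byte counts)
```

    This also explains why the total is 1721G rather than the sum of the rounded lines (1726G): the total is rounded once, from the summed bytes.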
    