Issue #39 wontfix
Jan Galkowski
created an issue

https://rawdata.oceanobservatories.org/files/

Copy to

/home/maxwell/local_data/jan_spillover/

Priority set because this is a one-of-a-kind resource, important for both climate research and national security (whether people realize it or not), and because, if the OOI is shut down, these data are the basis for a case arguing why it should be reinstituted.

See also: http://oceanobservatories.org/

Look, 90% of the radiative forcing from GHGs goes into the oceans, along with 30% of the excess CO2 in the atmosphere. It's important to dig into what goes on there.

Comments (12)

  1. Jan Galkowski reporter

    Using

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait -e robots=off --prefer-family=IPv4  --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing  --follow-ftp -nv --output-file=rawdata-oceanobservatories-org-files.log --no-check-certificate  https://rawdata.oceanobservatories.org/files/
    

    The -e robots=off setting is there because the site's robots.txt is restrictive.

  2. Jan Galkowski reporter

    So I have been downloading Ocean Observatories data for over a week. I noticed today that I was beginning to see:

    2017-01-20 02:46:29 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2000_UTC.dat:
    2017-01-20 02:46:34 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2100_UTC.dat:
    2017-01-20 02:46:41 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2200_UTC.dat:
    2017-01-20 02:46:45 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2300_UTC.dat:
    2017-01-20 02:46:48 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0000_UTC.dat:
    2017-01-20 02:46:52 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0100_UTC.dat:
    2017-01-20 02:47:00 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0200_UTC.dat:
    2017-01-20 02:47:05 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0300_UTC.dat:
    2017-01-20 02:47:09 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0400_UTC.dat:
    2017-01-20 02:47:17 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0500_UTC.dat:
    2017-01-20 02:47:24 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0600_UTC.dat:
    2017-01-20 02:47:30 ERROR 404: Not Found.
    https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150225T0700_UTC.dat:
    2017-01-20 02:47:37 ERROR 404: Not Found.
    

    Checking the master site I found:

    ANNOUNCEMENT: OOI Cyberinfrastructure offline for maintenance 1/19-1/22
    The OOI Cyberinfrastructure (CI) online interface will be shut down from Thurs., Jan. 19, 9:00 AM ET to Sun., Jan. 22, 6:00 PM ET while a system upgrade takes place. This shutdown will impact all OOI CI services and tools, including the OOI data portal, Alfresco, Confluence, Redmine, and development environments. WebEx and oceanobservatories.org will still be available.
    
    The OOI operations team is making arrangements to ensure that all data will be collected and backed up during the scheduled maintenance.
    
    We appreciate your patience and understanding while this update takes place.
    
    If you have any questions or concerns, please contact the OOI Help Desk.
    

    I was downloading using:

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait -e robots=off --prefer-family=IPv4 --tries=40 --timestamping=on --mirror --recursive --level=8 --no-remove-listing --follow-ftp -nv --append-output=rawdata-oceanobservatories-org-files.log --no-check-certificate https://rawdata.oceanobservatories.org/files/
    

    and sending the output directly to pub04, under /var/local/jan/rawdata.oceanobservatories.org.files. I guess I'll need to resume once the site is back up, and include -N -c in the wget options.
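
    Before resuming, it may help to know which files were refused during the outage. The following is a minimal sketch, not something that was run as part of this issue: it assumes the -nv log format shown above, where each URL line ending in ":" is followed by its own "ERROR 404" line. The function name urls_with_404 is made up for illustration.

```python
import re

# Matches the error lines seen in the log excerpt above, e.g.
# "2017-01-20 02:46:34 ERROR 404: Not Found."
ERROR_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ERROR 404: Not Found\.$"
)


def urls_with_404(lines):
    """Return URLs whose next non-blank log line is a 404 error.

    Assumption: wget -nv prints "<url>:" and then the error on the
    following line, as in the excerpt quoted in this comment.
    """
    failed = []
    pending_url = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(("http://", "https://")) and line.endswith(":"):
            pending_url = line[:-1]  # drop the trailing colon
        elif ERROR_RE.match(line) and pending_url is not None:
            failed.append(pending_url)
            pending_url = None
        elif line:
            # Any other non-blank line breaks the URL/error pairing.
            pending_url = None
    return failed


# Small sample copied from the log excerpt above.
sample = """\
2017-01-20 02:46:29 ERROR 404: Not Found.
https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2000_UTC.dat:
2017-01-20 02:46:34 ERROR 404: Not Found.
https://rawdata.oceanobservatories.org/files/CE04OSBP/LJ01C/ADCPSI103_10.33.12.5_2101_20150224T2100_UTC.dat:
2017-01-20 02:46:41 ERROR 404: Not Found.
""".splitlines()

failed_urls = urls_with_404(sample)
for url in failed_urls:
    print(url)
```

    The resulting list could be written to a file and handed back to wget with --input-file once the site is up again, so that only the refused files are retried.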

  3. Jan Galkowski reporter
    • changed status to open

    Site is back up and a little reorganized. I thought it prudent to save and store what was downloaded before it went down, and to start up a high-speed httrack copy. Note that only https://rawdata.oceanobservatories.org/files/ is being copied.

  4. Jan Galkowski reporter

    Just found out that these data (https://rawdata.oceanobservatories.org/files/) are already mirrored at Rutgers by their owners for safekeeping. Description: http://oceanobservatories.org/data/raw-data/#q1

    We have:

    1904006781805   ./rawdata.oceanobservatories.org.files_o
    57301099        ./rawdata.oceanobservatories.org
    703226597149    ./rawdata.oceanobservatories.org.files
    

    Question: Do we toss? Or keep?
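
    For a sense of scale, here is a minimal sketch that adds up the three figures above and prints them in decimal units. It assumes the numbers are byte counts (for example from du -b); the listing itself does not say which unit was used. The helper name human is made up for illustration.

```python
# Figures copied from the listing above.
# Assumption: they are byte counts (e.g. from `du -b`).
SIZES = {
    "./rawdata.oceanobservatories.org.files_o": 1904006781805,
    "./rawdata.oceanobservatories.org": 57301099,
    "./rawdata.oceanobservatories.org.files": 703226597149,
}


def human(n):
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["B", "kB", "MB", "GB", "TB", "PB"]
    value = float(n)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


total = sum(SIZES.values())

for path, size in SIZES.items():
    print(f"{human(size):>10}  {path}")
print(f"{human(total):>10}  total")
```

    Under that assumption the three trees together come to roughly 2.6 TB, most of it in the _o directory.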

  5. Sakari Maaranen

    You are the data expert, @Joos-gcv, and you say this is already mirrored and too large for us. I say we toss this. It's more than double our capacity, and it would only occupy our space without providing a valuable copy. Make sure it is all deleted from all servers so that it no longer occupies our space. Leave it to others.

    Cc: @John Baez @Greg Kochanski
