Issue #30 closed
Jan Galkowski
created an issue

These data are accessible from several sites, and are of varying vintage, including a reprocessing of data needed to fix a bug in the ARGO data handling system.

Some of the sources I have may overlap, so I am depositing them in a single large "ARGO" directory.

The original sources are: (This is an archive of, I believe, uncorrected data.)

I know of ARGO because of my interaction with and support of Woods Hole Oceanographic Institution (WHOI), which helps maintain the ARGO floats, builds and launches some of them, and collects some of the data.

Late addition

Comments (27)

  1. Jan Galkowski reporter

    The downloads are being done using:

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait --prefer-family=IPv4 --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing --follow-ftp -nv --output-file=www-nodc-noaa-gov-argo.log --no-check-certificate

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait --prefer-family=IPv4 --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing --follow-ftp -nv --output-file=ftp-nodc-noaa-gov-nodc-archive.log --no-check-certificate http

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait --prefer-family=IPv4 --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing --follow-ftp -nv --output-file=www-nodc-noaa-gov-argo-accessData.log --no-check-certificate

    wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=5 --random-wait --prefer-family=IPv4 --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing --follow-ftp -nv --output-file=ftp-nodc-noaa-gov-pub-data-nodc-argo.log --no-check-certificate
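
The four invocations above share one option list and differ only in the log file and the source. A minimal sketch of how that could be wrapped, assuming a POSIX shell; `mirror` and the `WGET` override are hypothetical names introduced here for illustration, and the source URL is a placeholder because the thread does not show it:

```shell
# Sketch only: reuse the option list from the commands above.
# "mirror" is a hypothetical helper, not something used in this thread.
# Set WGET=echo to print the arguments instead of contacting a server.
mirror() {
    log_file=$1   # e.g. www-nodc-noaa-gov-argo.log
    url=$2        # source URL; placeholder, supply the real one
    "${WGET:-wget}" \
        --dns-timeout=10 --connect-timeout=20 --read-timeout=120 \
        --wait=5 --random-wait --prefer-family=IPv4 --tries=40 \
        --timestamping=on --recursive --level=8 \
        --no-remove-listing --follow-ftp -nv \
        --output-file="$log_file" --no-check-certificate \
        "$url"
}
```

Running `WGET=echo mirror some.log "$SOURCE_URL"` first gives a dry run that can be checked against the commands above before any transfer starts.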

  2. Jan Galkowski reporter

    is completed, on azi02. is completed on azi02.

    lftp of on azi02 and of on azi02 still in progress. Plenty of room for now on azi02, /home/jan/local_data/. These are going to its ARGO subdirectory.

    No checksums yet generated. I'll wait for all the pieces to finish.

  3. Jan Galkowski reporter

    Progress continues slowly. Now have 2435 GB. That currently breaks out as:

    [jan@azi02 ARGO]$ find . -maxdepth 1 -type d -exec nice ionice du -s -b -c --apparent-size -BG {} \;
    2435G   .
    2435G   total
    253G    ./argo.accessData
    253G    total
    1G      ./data.nodc.argo
    1G      total
    1791G   ./nodc.archive
    1791G   total
    392G    ./
    392G    total
    [jan@azi02 ARGO]$
  4. Jan Galkowski reporter

    This has been organized to It's not clear what happened to the archives. I am attempting to get a sizing of the new section, now.

  5. Jan Galkowski reporter

    The permanent home for the ARGO data will be pub04, and 3 TB has been allocated for the purpose. After the copy completes, I will rsync the data to /var/local/jan/ARGO/ on pub04, and then finish up the SHA sums and so on.

  6. Jan Galkowski reporter

    ARGO status:

    [jan@azi02 ARGO]$ sudo find . -maxdepth 1 -type d -exec nice ionice du -s -b -c --apparent-size -BG {} \;
    2278G   .
    2278G   total
    1803G   ./nodc.archive
    1803G   total
    393G    ./
    393G    total
    83G     ./phod-ARGO_FTP-argo
    83G     total
  7. Jan Galkowski reporter

    SHA sums done. It took a full day!

    [jan@azi02 ARGO]$ ls -lt
    total 3451584
    -rw-rw-r-- 1 jan jan 2072579652 Apr 14 02:00 ARGO.sha512.txt
    -rw-rw-r-- 1 jan jan 1461829572 Apr 13 17:20 ARGO.sha256.txt
    drwxrwxr-x 5 jan jan       4096 Apr 13 07:37 2017-04-13T0736
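
The thread does not show the command that produced the SHA files, so the following is only a sketch of one common way to build and re-check such a manifest, assuming GNU findutils and coreutils. `make_manifest` and `verify_manifest` are hypothetical helper names; the manifest is written outside the tree being hashed so it does not end up hashing itself.

```shell
# Sketch only: hash every regular file under a directory in a stable order.
# make_manifest and verify_manifest are hypothetical names for illustration.
make_manifest() {
    dir=$1        # tree to hash, e.g. a dated snapshot directory
    manifest=$2   # output file; keep it outside "$dir"
    # -print0 / -z / -0 keep file names with spaces intact;
    # -r stops xargs from running sha256sum when there are no files.
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Re-check a tree against a manifest; returns non-zero on any mismatch.
verify_manifest() {
    dir=$1
    manifest=$2
    # The manifest is fed on stdin so a relative path still works after cd.
    ( cd "$dir" && sha256sum --quiet -c - ) < "$manifest"
}
```

Because the paths in the manifest are relative to the hashed directory, the same file can be used to check a copy of the tree on another host after a transfer.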
  8. Jan Galkowski reporter

    rsync completed:

    4/16/2017 12:29:07 AM
    [jan@pub04 ARGO]$ nice ionice find . -type f -print | wc -l
    [jan@azi02 ARGO]$  nice ionice find . -type f -print | wc -l
    [jan@azi02 ARGO]$
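
Matching file counts on both hosts show that nothing obvious is missing, but not that the same files are present. A minimal sketch of a stricter check, assuming both trees are reachable as local paths (for example over a mount) and that no file name contains a newline; `list_files` and `compare_listings` are hypothetical helper names:

```shell
# Sketch only: compare the relative file listings of two trees.
# list_files and compare_listings are hypothetical names for illustration.
list_files() {
    # Paths are printed relative to the tree root, in a locale-independent order.
    ( cd "$1" && find . -type f -print | LC_ALL=C sort )
}

compare_listings() {
    left=$(mktemp) && right=$(mktemp) || return 2
    list_files "$1" > "$left"
    list_files "$2" > "$right"
    # diff exits 0 when the listings are identical and 1 when they differ.
    diff "$left" "$right"
    status=$?
    rm -f "$left" "$right"
    return "$status"
}
```

A listing comparison only covers names; content is still best confirmed by re-checking the SHA sums on the destination.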