NOAA ESRL CT2015 Carbon Tracker Web site

Issue #13 resolved
Jan Galkowski
created an issue

https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/

Replicate

https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/

to

/media/jan-one/esrl.noaa.gov.gmd.ccgg.carbontracker-web

using

wget --dns-timeout=10 --connect-timeout=20 --read-timeout=120 --wait=12 --random-wait --prefer-family=IPv4  --tries=40 --timestamping=on --recursive --level=8 --no-remove-listing  --output-file=esrl-noaa-gmd-ccgg-carbontracker-gov-errors.log --no-check-certificate  http://www.esrl.noaa.gov/gmd/ccgg/carbontracker/

Comments (16)

  1. Jan Galkowski reporter

    slash_problem_2016-12-21_190252.jpg

    This data has been transferred to /media/jan-one/esrl.noaa.gov.gmd.ccgg.carbontracker-web. There are some discrepancies, recorded in the logs. During the copy from the remote side, for example, there are a number of instances of:

    . 2016-12-17 16:02:42.490 Connecting to 140.172.200.31:53043 ...
    < 2016-12-17 16:02:42.490 550 Failed to open file.
    . 2016-12-17 16:02:42.490 Copying files from remote side failed.
    * 2016-12-17 16:02:42.490 (ExtException) **Copying files from remote side failed.**
    * 2016-12-17 16:02:42.490 Failed to open file.
    . 2016-12-17 16:02:42.492 Asking user:
    . 2016-12-17 16:02:42.492 Error transferring file '/data/aer/doc/presentations/blah.pdf'. ("Copying files from remote side failed.","Failed to open file.")
    

    These appear to all be *.odg files, and I do not know what's special about them.

    Then, on the copy from my local facilities to /media/jan-one/esrl.noaa.gov.gmd.ccgg.carbontracker-web, the logs report errors of this type:

    . 2016-12-21 11:46:07.854 Copying "W:\migrate2\esrl-noaa-gmd-ccgg-carbontracker-gov\www.esrl.noaa.gov\gmd\dv\data\index.html@parameter_name=O18%2FO16+in+Carbon+Dioxide&site=ZEP" to remote directory started.
    . 2016-12-21 11:46:07.855 Binary transfer mode selected.
    . 2016-12-21 11:46:07.855 Opening remote file.
    > 2016-12-21 11:46:07.855 Type: SSH_FXP_OPEN, Size: 171, Number: 38216707
    < 2016-12-21 11:46:07.970 Type: SSH_FXP_STATUS, Size: 29, Number: 38216707
    < 2016-12-21 11:46:07.970 Status code: 2, Message: 38216707, Server: No such file, Language:  
    > 2016-12-21 11:46:07.970 Type: SSH_FXP_LSTAT, Size: 155, Number: 38216967
    < 2016-12-21 11:46:08.083 Type: SSH_FXP_STATUS, Size: 29, Number: 38216967
    < 2016-12-21 11:46:08.083 Status code: 2, Message: 38216967, Server: No such file, Language:  
    * 2016-12-21 11:46:08.083 (ETerminal) No such file or directory.
    * 2016-12-21 11:46:08.083 Error code: 2
    * 2016-12-21 11:46:08.083 Error message from server: No such file
    * 2016-12-21 11:46:08.083 (EScpSkipFile) Cannot create remote file '/media/jan-one/esrl.noaa.gov.gmd.ccgg.carbontracker-web/www.esrl.noaa.gov/gmd/dv/data/index.html@parameter_name=O18/O16+in+Carbon+Dioxide&site=ZEP'.
    * 2016-12-21 11:46:08.083 No such file or directory.
    * 2016-12-21 11:46:08.083 Error code: 2
    * 2016-12-21 11:46:08.083 Error message from server: No such file
    . 2016-12-21 11:46:08.099 File: 'W:\migrate2\esrl-noaa-gmd-ccgg-carbontracker-gov\www.esrl.noaa.gov\gmd\dv\data\index.html@parameter_name=O18%2FO16+in+Carbon+Dioxide&type=Flask' [2016-12-19T02:38:39.309Z] [99187]
    

    The problem with the latter files is that the filename contains an embedded slash or, at least, a "%2F", that is, a URL-encoded slash. When WinSCP attempts to move the file, the "%2F" somehow gets mapped to an unquoted slash, so the destination appears to be a directory path which doesn't exist, and the copy fails.

    At the risk of being inconsistent with the source datasets, I am renaming all such files to replace the "%2F" with a hyphen.

    I have encountered something like this before in logs when the filename contained an embedded asterisk.

    We cannot be expected to fix these. These are horrible choices of filenames. They should standardize.
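The renaming described in this comment, replacing "%2F" with a hyphen, could be scripted rather than done by hand. The following is only a sketch in Python, not the procedure actually used here: the function name `rename_encoded_slashes`, the case-insensitive match on "%2F", and the decision to skip a file when the new name already exists are all assumptions. It returns the old and new paths so that the mapping back to the source names can be kept with the logs.

```python
import os
import re

# Matches a URL-encoded slash in either case ("%2F" or "%2f").
ENCODED_SLASH = re.compile(r"%2F", re.IGNORECASE)


def rename_encoded_slashes(root, replacement="-"):
    """Rename files under root whose names contain "%2F".

    Returns a list of (old_path, new_path) pairs for the files renamed.
    """
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            new_name = ENCODED_SLASH.sub(replacement, name)
            if new_name == name:
                continue  # nothing to replace in this name
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.exists(new_path):
                # Assumption: never overwrite; leave the file for manual review.
                continue
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

Run against the local copy (for example `rename_encoded_slashes(r"W:\migrate2\esrl-noaa-gmd-ccgg-carbontracker-gov")`) before the transfer, and save the returned pairs so the departure from the source filenames stays documented.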

  2. Jan Galkowski reporter

    Ach! Tried to do a SHA256 recursively on the files in the Windows side of the world. No luck. Accordingly, it doesn't make much sense to do it on the Linux side. I am copying off and saving the data I retrieved from the .gov site as long as I can, in case someone has a better idea here.

    Clearly, I can't keep doing this kind of thing indefinitely. I'll run out of space and time.
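The comment does not say why the recursive SHA256 failed on the Windows side, so the following is not a fix for that failure. It is one possible sketch, in Python, of a recursive SHA-256 pass that behaves the same on Windows and Linux. The names `sha256_of_file` and `build_manifest` are hypothetical, and recording unreadable files as `None` instead of stopping is an assumption.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its SHA-256 digest.

    Relative paths use forward slashes on every platform so that a
    manifest made on Windows can be compared with one made on Linux.
    Files that cannot be read are recorded with a value of None.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            try:
                manifest[rel_path] = sha256_of_file(full_path)
            except OSError:
                manifest[rel_path] = None
    return manifest
```

Two manifests built this way, one per side, are plain dictionaries, so missing files and mismatched digests can be found by comparing keys and values.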

  3. John Baez

    Added to our wiki with these comments:

    Copied and placed by Jan at /media/jan-one/esrl.noaa.gov.gmd.ccgg.carbontracker-web "Ach! Tried to do a SHA256 recursively on the files in the Windows side of the world. No luck." "Completed, with discrepancies logged, and no SHA256 calculated."

  4. Jan Galkowski reporter

    Hi, John.

    Not exactly.

    There used to be one place for files, and it was called datarefuge, after the project.

    But since there has been a need for more and more storage, there are now several places, and I've tried to make the disposition of files as specific as possible, so there's documentation of where everything is.

    For example, presently:

    Filesystem                           1K-blocks       Used  Available Use% Mounted on
    /dev/mapper/vg_pool-lv_data_jan     2113646512  962625624 1043630324  48% /home/jan/local_data
    //u141128.your-storagebox.de/backup 5242880000 2032586970 3210293030  39% /media/datarefuge
    //u141408.your-storagebox.de/backup 2097152000 1089054227 1008097774  52% /media/jan-one
    //u141468.your-storagebox.de/backup  524288000   27259721  497028280   6% /media/borislav-one
    //u141477.your-storagebox.de/backup 2097152000 1021692071 1075459929  49% /media/maxwell-one

    You'll see datarefuge in the second line.

    • Jan
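A listing like the one above can also be produced from a script, which may help keep the record of where everything is up to date. This is a rough sketch in Python using the standard `shutil.disk_usage` call. The function name `usage_report` is hypothetical, and the percentage here is simply used divided by total, so it can differ by a point or so from the Use% column that `df` prints.

```python
import shutil


def usage_report(mount_points):
    """Return (mount, used_bytes, free_bytes, percent_used) per mount point.

    Mount points that cannot be queried are reported with None values.
    """
    rows = []
    for mount in mount_points:
        try:
            usage = shutil.disk_usage(mount)
        except OSError:
            rows.append((mount, None, None, None))
            continue
        percent = round(100 * usage.used / usage.total) if usage.total else 0
        rows.append((mount, usage.used, usage.free, percent))
    return rows
```

For example, `usage_report(["/media/datarefuge", "/media/jan-one", "/media/borislav-one", "/media/maxwell-one"])` would cover the storage boxes listed in the table, with sizes in bytes, not 1K blocks.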