Files (without extension) don't seem to be uploading to S3
Hi,
I just received an upload of a few GB of data (~50 files, up to about 142 MB each). These files are all sitting in my uploads folder, but none of them have been uploaded to S3.
The incrontab looks like:
sudo incrontab -l
/home/ECMWF/home/ECMWF/uploads IN_CREATE,IN_DELETE,IN_CLOSE_WRITE /usr/local/bin/movetos3.sh /home/ECMWF/home/ECMWF/uploads $# telluslabs-ftp/ECMWF/uploads/ $%
The dir looks like:
ls -l
total 8579608
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:17 D1D10310000103100011
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:18 D1D10310000103103001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:18 D1D10310000103106001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:19 D1D10310000103109001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:19 D1D10310000103112001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:19 D1D10310000103115001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:20 D1D10310000103118001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:20 D1D10310000103121001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:21 D1D10310000110100001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:21 D1D10310000110103001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:21 D1D10310000110106001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:22 D1D10310000110109001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:22 D1D10310000110112001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:23 D1D10310000110115001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:23 D1D10310000110118001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:23 D1D10310000110121001
-rw-r--r-- 1 ECMWF ECMWF 142640760 Oct 31 11:27 D1D10310000110200001
-rw-r--r-- 1 ECMWF ECMWF 123189720 Oct 31 11:27 D1D10310000110203001
...
Looking at the log file, I get the following:
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103100011.tmp does not exist.
2017-10-31 11:17:56 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103100011.tmp to s3
2017-10-31 11:17:58 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103103001.tmp
2017-10-31 11:18:17 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103103001.tmp
2017-10-31 11:18:17 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103103001.tmp to s3...
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103103001.tmp does not exist.
2017-10-31 11:18:17 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103103001.tmp to s3
2017-10-31 11:18:20 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103106001.tmp
2017-10-31 11:18:42 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103106001.tmp
2017-10-31 11:18:42 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103106001.tmp to s3...
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103106001.tmp does not exist.
2017-10-31 11:18:42 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103106001.tmp to s3
2017-10-31 11:18:44 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103109001.tmp
2017-10-31 11:19:04 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103109001.tmp
2017-10-31 11:19:04 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103109001.tmp to s3...
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103109001.tmp does not exist.
2017-10-31 11:19:04 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103109001.tmp to s3
2017-10-31 11:19:07 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103112001.tmp
2017-10-31 11:19:30 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103112001.tmp
2017-10-31 11:19:30 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103112001.tmp to s3...
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103112001.tmp does not exist.
2017-10-31 11:19:30 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103112001.tmp to s3
2017-10-31 11:19:33 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103115001.tmp
2017-10-31 11:19:53 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103115001.tmp
2017-10-31 11:19:53 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103115001.tmp to s3...
The user-provided path /home/ECMWF/home/ECMWF/uploads/D1D10310000103115001.tmp does not exist.
2017-10-31 11:19:53 - Failed to move file /home/ECMWF/home/ECMWF/uploads/D1D10310000103115001.tmp to s3
2017-10-31 11:19:55 - Received event IN_CREATE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103118001.tmp
2017-10-31 11:20:18 - Received event IN_CLOSE_WRITE on file system object /home/ECMWF/home/ECMWF/uploads/D1D10310000103118001.tmp
2017-10-31 11:20:18 - Moving file /home/ECMWF/home/ECMWF/uploads/D1D10310000103118001.tmp to s3...
...
Comments (10)
-
-
reporter Hi Robert,
No, unfortunately, I don't know (or have control of) the SFTP client being used...
Is there a way to kick off the upload of everything on the folder? I tried moving all the files to a tmp directory, then moving them back - but I got the same errors back (this is why I wonder if it is a property of the SFTP client...)
-
Can you try doing the following?
cd /home/ECMWF/home/ECMWF/uploads/
find . -type f -exec touch {} \;
This should touch all the files in the uploads/ directory. Sometimes, touch is all that's necessary to trigger a file to upload. (These files are kind of large, so you can either wait, or tail the /var/log/movetos3/movetos3.log file to monitor progress.)
If that doesn't work, let me know and I can try to troubleshoot further.
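As an alternative to watching the log scroll by, the lines can be counted. This is only a minimal sketch, assuming the log keeps the message wording shown in the excerpt in this issue ("Moving file ... to s3..." and "Failed to move file ... to s3"). The function name `summarize_moves` is hypothetical and not part of SFTP Gateway.

```shell
#!/bin/sh
# Hypothetical helper: summarize move attempts and failures in a movetos3 log.
# Assumes the message wording seen in the log excerpt in this issue.
summarize_moves() {
    log="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    attempted=$(grep -c 'Moving file .* to s3' "$log" || true)
    failed=$(grep -c 'Failed to move file .* to s3' "$log" || true)
    echo "attempted=$attempted failed=$failed"
}

# Example call (log path taken from the comment above):
# summarize_moves /var/log/movetos3/movetos3.log
```

If the failed count keeps pace with the attempted count after running touch, the files are probably still not being picked up.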
(A note about moving the files to a tmp directory: the 'mv' command doesn't trigger the IN_CLOSE_WRITE event that we're looking for. Something has to be written to the file, and in many cases, touch will work.)
-
reporter OK - this seems to have triggered the move... Now it seems that I am having memory issues (I am on a t2.small instance). I will increase the instance memory and retry.
-
reporter So, increasing the instance size definitely helped.
What I am concerned about is how to prevent this going forward... I will be having these payloads dumped in daily. They are pretty large and a timely move to S3 is essential for us.
Any suggestions?
-
Also, with the file sizes you're working with, you might need to increase your volume size eventually. Here's a wiki page that explains how to do this: https://bitbucket.org/thorntechnologies/sftpgateway-public/wiki/Resizing%20an%20EC2%20Instance%20Volume
-
As for preventing this issue going forward...
You could have a cron job that touches all the files.
sudo crontab -e
* * * * * cd /home/ECMWF/home/ECMWF/uploads && find . -type f ! -name '*.tmp' ! -name '*.filepart' -exec touch {} \;
This runs every minute, and touches all the files in the ECMWF user's uploads directory.
The thing to be careful about: if the user is in the middle of uploading a large file with the resume/transfer feature, you want to avoid running touch on any .tmp or .filepart file (which will prematurely upload it to S3). So the find command excludes these file extensions.
On our roadmap, we're trying to figure out a way for SFTP Gateway to work with SFTP clients with the resume/transfer feature. I can notify you once this feature is available. But in the meantime, try using the cron job and let me know if you run into any issues.
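To try the exclusion logic by hand before putting it in cron, the same find/touch command can be wrapped in a small function. This is a sketch only: the name `touch_finished_uploads` is made up for illustration, and it assumes that in-progress uploads always end in .tmp or .filepart, as described above.

```shell
#!/bin/sh
# Hypothetical wrapper around the find/touch command from the cron entry.
# Touches every regular file under the given directory except names ending
# in .tmp or .filepart (assumed to be in-progress uploads).
touch_finished_uploads() {
    dir="$1"
    # The patterns are quoted so the shell passes them to find unexpanded.
    find "$dir" -type f ! -name '*.tmp' ! -name '*.filepart' -exec touch {} \;
}

# Example call (directory taken from the incrontab entry in this issue):
# touch_finished_uploads /home/ECMWF/home/ECMWF/uploads
```

Quoting the patterns matters: if a file matching *.tmp happens to exist in the current directory, an unquoted pattern is expanded by the shell before find ever sees it.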
Thanks!
-
reporter Thank you! I will try your proposed solution
-
Just as an FYI, you might want to add -mtime +1 (won't touch anything modified within the past day), just in case someone's in the middle of uploading a file. And then just run it once at midnight.
sudo crontab -e
0 0 * * * cd /home/ECMWF/home/ECMWF/uploads && find . -mtime +1 -type f ! -name '*.tmp' ! -name '*.filepart' -exec touch {} \;
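One detail worth checking before relying on `-mtime +1`: GNU find counts a file's age in whole 24-hour periods and drops the fraction, so `+1` only matches files that are at least two full days old, while `+0` matches files that are at least 24 hours old. The sketch below shows this on a scratch file. It assumes GNU `touch -d` and GNU `find`, and the helper name `matches_mtime` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "yes" if FILE matches find's -mtime TEST, else "no".
matches_mtime() {
    file="$1"
    mtime_test="$2"
    if [ -n "$(find "$file" -maxdepth 0 -mtime "$mtime_test")" ]; then
        echo yes
    else
        echo no
    fi
}

scratch=$(mktemp)
touch -d '36 hours ago' "$scratch"   # GNU touch date syntax (assumption)
matches_mtime "$scratch" +0          # 36h old: more than zero whole days
matches_mtime "$scratch" +1          # 36h old: not more than one whole day
rm -f "$scratch"
```

If files should be picked up once they are a day old, `-mtime +0` (or `-mmin +1440` with GNU find) may be closer to the intent.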
-
- changed status to resolved
Hi Ophir,
The SFTP client is probably using some kind of resume/transfer feature where it streams bits into a temp file. This feature interferes with how SFTP Gateway operates, since it's renaming the file with an extension (usually .tmp or .filepart).
Do you happen to know what SFTP client is being used? If so, I can try to figure out if there's a way to disable the resume/transfer setting.
Thanks!
Robert