Issue #69 resolved

Files not deleted during sync

Tim Sylvester
created an issue

JetS3t synchronize incorrectly concludes that a file is already synchronized when in fact it exists only on remote storage.

The following shell script demonstrates the issue:

{{{
#!/bin/bash

# arbitrary bucket name
bucket=n2nixwzkotyy

mkdir files
echo 123 > files/a
jets3t/bin/synchronize.sh --properties jets3t.synchronize.properties UP $bucket files
echo 234 > files/a
jets3t/bin/synchronize.sh --properties jets3t.synchronize.properties UP $bucket files
jets3t/bin/synchronize.sh --properties jets3t.synchronize.properties UP $bucket files
rm files/a
jets3t/bin/synchronize.sh --properties jets3t.synchronize.properties UP $bucket files
rmdir files
}}}

(The properties file contains only accesskey and secretkey)

The four synchronize runs should, in order, upload the file, update it, find nothing to do, and finally delete it from the bucket. The actual output is:

{{{
UP Local [files] => S3[n2nizwzkotyy]
N files/
N files/a
New files: 2, Updated: 0, Reverted: 0, Deleted: 0, Unchanged: 0

UP Local [files] => S3[n2nizwzkotyy]
- files/
U files/a
New files: 0, Updated: 1, Reverted: 0, Deleted: 0, Unchanged: 2

UP Local [files] => S3[n2nizwzkotyy]
- files/
- files/a
New files: 0, Updated: 0, Reverted: 0, Deleted: 0, Unchanged: 2

UP Local [files] => S3[n2nizwzkotyy]
- files/
New files: 0, Updated: 0, Reverted: 0, Deleted: 0, Unchanged: 2
}}}

As you can see, the last run decides that no action should be taken, even though the previously synchronized file has been deleted on the local filesystem. It reports "Deleted: 0" and still counts two unchanged items, although only files/ is listed.

My investigation has led me to suspect the method org.jets3t.service.utils.FileComparer.buildDiscrepancyLists(), specifically:

{{{
...
for (String localPath: splitFilePathIntoDirPaths(keyPath, storageObject.isDirectoryPlaceholder())) {
    // Check whether local file is already on server
    if (filesMap.containsKey(localPath)) {
        // File has been backed up in the past, is it still up-to-date?
        File file = filesMap.get(localPath);

        if (file.isDirectory()) {
            alreadySynchronisedKeys.add(keyPath);
            ...
}}}

After splitting a file path into partial paths, this seems to match the partial path against a directory and therefore decide that the full path (which is a file not a directory) is already synchronized.

More concretely, "files/a" is split into "files" and "files/a", the local filesMap does contain "files" and the matching Java File object is indeed a directory, so "files/a" is added to the set of "already synchronized" keys.

If a file is multiple levels down in the directory hierarchy, this results in the file key being added to the "already synchronized" set multiple times, as each parent directory level is detected as a directory.
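The suspected behaviour can be reproduced in isolation with a small sketch. This is not the JetS3t code: `splitIntoDirPaths` is a hypothetical stand-in for `splitFilePathIntoDirPaths` (it ignores the directory-placeholder argument), the local `filesMap` is modelled as a plain map from path to an "is directory" flag, and a list is used instead of a set so that repeated additions can be counted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartialPathMatchSketch {

    // Hypothetical stand-in for splitFilePathIntoDirPaths(): returns every
    // cumulative prefix of the key, e.g. "files/a" -> ["files", "files/a"].
    static List<String> splitIntoDirPaths(String keyPath) {
        List<String> paths = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (String part : keyPath.split("/")) {
            if (prefix.length() > 0) {
                prefix.append('/');
            }
            prefix.append(part);
            paths.add(prefix.toString());
        }
        return paths;
    }

    // Mimics the suspected check: whenever a partial path maps to a local
    // directory, the FULL key is recorded as already synchronized.
    // localIsDirectory is an assumed model of filesMap (path -> isDirectory).
    static List<String> suspectedAlreadySynchronised(
            String keyPath, Map<String, Boolean> localIsDirectory) {
        List<String> alreadySynchronised = new ArrayList<>();
        for (String localPath : splitIntoDirPaths(keyPath)) {
            Boolean isDirectory = localIsDirectory.get(localPath);
            if (isDirectory != null && isDirectory) {
                alreadySynchronised.add(keyPath);
            }
        }
        return alreadySynchronised;
    }

    public static void main(String[] args) {
        // Local state after "rm files/a": only the directory remains.
        Map<String, Boolean> local = new HashMap<>();
        local.put("files", true);

        // The remote key "files/a" is still reported as synchronized.
        System.out.println(suspectedAlreadySynchronised("files/a", local));

        // A deeper key is added once per parent directory level.
        local.put("files/x", true);
        local.put("files/x/y", true);
        System.out.println(
                suspectedAlreadySynchronised("files/x/y/b", local).size());
    }
}
```

Under these assumptions the first call returns a list containing "files/a" even though no local entry for "files/a" exists, and the second call records "files/x/y/b" three times, once for each parent directory.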
