Option to ignore duplicates in pilerimport
Hi, I'm using pilerimport to load a lot of really old e-mail backups from several old systems (pst, evolution mbox, thunderbird, evolution maildir, google mail), and I have migrated the mailbox in the past.
So now, when I use pilerimport on these backups, I get a lot of duplicates.
Can I have an option in pilerimport to ignore duplicates? Or, failing that, how can I delete the duplicates from storage?
Comments (8)
-
reporter Hi,
The FAQ is not clear to me on this point: it says piler stores duplicates for e-mails without a message-id, and I assumed it does so for all e-mails.
The attachment shows my current situation in the GUI. Are these real duplicates? Are they consuming storage? Are they all without a message-id? How can I locate one to check?
-
repo owner - changed status to resolved
The piler daemon requires a unique message-id, otherwise it discards the email as a duplicate. A duplicate email is discarded, not stored. Pilerimport, on the other hand, assumes that you want to import an email no matter what, even if it has no message-id. I hope that clarifies it.
-
reporter Crystal clear! Thanks! But what is the meaning of "duplicated message" (see image)? Are they real, stored duplicates (with no message-id), or is it report-only (the import was attempted but rejected as a duplicate)?
-
repo owner Just as the name suggests: number of duplicated messages that hit the piler daemon so far. They are counters for statistics. Since piler deduplicates messages, and stores everything in 1 copy, duplicates are discarded.
-
reporter - changed status to closed
Pilerimport detects and ignores duplicates; it just prints a warning that the email being imported is a duplicate. Problems usually arise when the email has no message-id. In that case, when you rerun the import process, pilerimport archives the email again and assigns it a bogus (internal) message-id to prevent duplicate detection from discarding it. So before the import it's best to ensure that all your emails have a unique message-id. If one doesn't, make one.
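For the "make one" step, here is a minimal sketch in Python (standard library only) of how a missing message-id could be added to a single-message file before running the import. It is not part of piler: the function name `ensure_message_id` and the `import.invalid` domain are made up for illustration, and deriving the id from a hash of the message bytes is an assumption, chosen so that the same message always gets the same id when the preparation step is rerun.

```python
import hashlib
from email import policy
from email.parser import BytesParser


def ensure_message_id(raw: bytes, domain: str = "import.invalid") -> bytes:
    """Return the message unchanged if it has a Message-ID, else prepend one.

    Hypothetical helper: the id is a SHA-256 digest of the original bytes,
    so identical messages always receive identical ids.
    """
    # Parse only the header block; the body is left untouched.
    headers = BytesParser(policy=policy.compat32).parsebytes(raw, headersonly=True)
    existing = headers.get("Message-ID")
    if existing is not None and str(existing).strip():
        return raw

    # Reuse the line ending already used by the message.
    first_nl = raw.find(b"\n")
    eol = b"\r\n" if first_nl > 0 and raw[first_nl - 1:first_nl] == b"\r" else b"\n"

    digest = hashlib.sha256(raw).hexdigest()[:32]
    header = f"Message-ID: <{digest}@{domain}>".encode("ascii")
    return header + eol + raw
```

This works on one raw message at a time (for example a maildir file or an .eml file); an mbox file would need to be split into individual messages first. Run it on copies of the backups, and check the result on a few messages before importing the whole set.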