Issues with duplicates when importing mailboxes.

Issue #1125 closed
HgoAlv created an issue

We've been facing a couple of issues that may be by design, but it would be nice to have a workaround for them:

When importing pst files, emails already imported for another user (from a previous pst file) are not imported due to a duplication error (which looks like the correct behavior). The problem for us is that when a user leaves the company, we import all of that user's email into a new folder (named after the user), and the duplicated emails are not referenced in any way in the newly imported user's folder. Would it be reasonable to add at least a reference to the duplicated, non-imported emails in the new user's folder (for example, by letting an email be in 2 folders at once)?

Related to the previous issue: since we import the email of every user leaving the company and we can't keep it forever, we delete old users' email on a 4 year cycle. Because the metadata for deleted emails is kept in the metadata table, importing the mailbox of another departing user skips any emails that duplicate ones already deleted in a previous user's cleanup. In effect, parts of the new mailbox are not imported.

Comments (9)

  1. Janos SUTO repo owner

    I need some more clarification. So you import emails for Alice. Then you import emails for Bob. And if Bob also has the same email as Alice, then it’s not imported because piler stores single instances only.

    However I don’t understand that when Bob leaves what emails do you import for him? Shouldn’t you export his emails instead? Anyway I’d like to see the headers for such missing email. Because if both Alice and Bob received the same message, then both of them should be in the To/Cc mail header fields. And if so, then Bob should be able to see that email even if it wasn’t imported from his folder. One exception might be if the email was sent to a mailing list. In this case you may export emails for this mailing list as well for Bob.

  2. HgoAlv reporter

    I need some more clarification. So you import emails for Alice. Then you import emails for Bob. And if Bob also has the same email
    as Alice, then it’s not imported because piler stores single instances only.

    However I don’t understand that when Bob leaves what emails do you import for him? Shouldn’t you export his emails instead?

    Ok so maybe we are not using this solution as a proper mail archiving tool. Let me explain in detail our workflow. Usually, every
    user keeps their email on a live mailbox as long as they are in the company. If the online mailbox grows past a certain limit we
    archive the older emails in piler, and everything is ok with this. But when a user leaves the company, we move all of the live email to
    piler and keep it there for a period (usually 4 years). Past that period, we delete that user's email. The problem is that if, say, user
    Bob left the company 5 years ago (and thus we've already deleted his email), and user Alice leaves the company today and we try to
    import her emails, then all the emails in common with Bob won't import as they are already in the metadata table and marked as deleted.
    I hope this clarifies the issue. Again, maybe this is a problem on our side for not using piler as a proper archiving tool.

    Anyway I’d like to see the headers for such missing email. Because if both Alice and Bob received the same message, then both of
    them should be in the To/Cc mail header fields. And if so, then Bob should be able to see that email even if it wasn’t imported
    from his folder. One exception might be if the email was sent to a mailing list. In this case you may export emails for this mailing
    list as well for Bob.

    Ok. So we did the following to test this: we sent a mail from one account to two different recipients. Then we imported that
    mail from each of those two mailboxes into piler. User A's email imported into Folder1 correctly. We then tried to import the same email from
    user B's mailbox into Folder2, and it did not import due to duplicate detection. Everything was working as expected up to this point. The problem
    is that when user A logs into piler he sees the email as expected, but user B does not see that same email (even though he is in the recipient
    list of the email). We can send the headers for those emails if you need them, but preferably not to an open mailing list.

  3. Janos SUTO repo owner

    Now I understand it. Your workflow is somewhat odd, and the problem is that piler looks perplexed when you try to import an already deleted email, because it indeed keeps the email history in the metadata table.

    If you want to keep using it this way, then you should actually delete any deleted=1 records from the metadata table, as well as from the rcpt table, and finally from the attachment table.

    The other option could be to continuously archive all necessary emails, and delete only what really needs to be deleted.
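
    The cleanup suggested above could look roughly like the sketch below. It is an illustration only: it runs against an in-memory SQLite database with a heavily simplified, hypothetical schema. The column names (`id`, `piler_id`, `deleted`) and the way `rcpt` and `attachment` link back to `metadata` are assumptions, not something stated in this thread, so compare it with your actual piler schema and back up the database before adapting anything.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, HYPOTHETICAL schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE metadata (id INTEGER PRIMARY KEY, piler_id TEXT, deleted INTEGER DEFAULT 0);
        CREATE TABLE rcpt (id INTEGER, "to" TEXT);
        CREATE TABLE attachment (piler_id TEXT, name TEXT);
        INSERT INTO metadata VALUES (1, 'p1', 1), (2, 'p2', 0);
        INSERT INTO rcpt VALUES (1, 'alice@example.com'), (1, 'bob@example.com'),
                                (2, 'alice@example.com');
        INSERT INTO attachment VALUES ('p1', 'a.pdf'), ('p2', 'b.pdf');
        """
    )
    return conn


def purge_deleted(conn):
    """Remove every trace of messages flagged deleted=1.

    Returns the number of metadata rows removed.
    """
    cur = conn.cursor()
    # Dependent rows go first: once the metadata rows are gone, there is
    # nothing left to tell us which rcpt/attachment rows belonged to them.
    cur.execute(
        "DELETE FROM attachment WHERE piler_id IN "
        "(SELECT piler_id FROM metadata WHERE deleted = 1)"
    )
    cur.execute(
        "DELETE FROM rcpt WHERE id IN (SELECT id FROM metadata WHERE deleted = 1)"
    )
    cur.execute("DELETE FROM metadata WHERE deleted = 1")
    removed = cur.rowcount
    conn.commit()
    return removed
```

    Note that the sketch deletes from the dependent tables before `metadata`, the reverse of the order listed above, because the `deleted` flag is assumed to live only in `metadata`. After such a purge, a later import of the same message should no longer be seen as a duplicate of an already deleted one.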

  4. HgoAlv reporter

    Ok, I understand that, but there's one thing I still don't see, maybe I am wrong.
    Say user A leaves the company today and we import his mailbox which has email E1 (and another user B was CCd in that email, thus having a duplicate of it).
    Say the retention period is 4 years. Three and a half years later user B leaves the company and we import his mailbox, also with a retention of 4 years.
    Email E1 is not imported because it is a duplicate, and 6 months later it gets deleted because of the retention period for user A.
    Is this ok? Shouldn't the retention period be updated so it is again 4 years in the future?

  5. Janos SUTO repo owner

    It works just as you described. The retention value is calculated and recorded at archiving time, and it doesn’t change after that. However, you are free to manually update the retention value of any email for which you want to prevent an early removal. Normally a company uses a predefined retention policy for emails. I understand that your use case is a bit different, so your best bet is to manually fix the retention when necessary.
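
    To make the timeline concrete: if E1 is archived for user A at time t0 with a 4 year retention, it expires at t0 + 4 years. Importing user B's mailbox at t0 + 3.5 years does not change that, so E1 disappears 6 months later unless its retention is pushed forward. A minimal sketch of such a manual fix follows. It assumes, hypothetically, that the retention value is a Unix timestamp in a `retained` column of `metadata` and that messages can be addressed by a `message_id` column; neither detail is confirmed in this thread, and the demo uses SQLite rather than piler's real database.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 4 * 365  # roughly 4 years, ignoring leap days


def retention_deadline(import_time, days=RETENTION_DAYS):
    """Return the expiry time as a Unix timestamp, counted from import_time."""
    return int((import_time + timedelta(days=days)).timestamp())


def extend_retention(conn, message_id, import_time):
    """Push the retention of one message forward, never backward.

    Returns 1 if the row was updated, 0 if it already expires later
    than the new deadline (or does not exist).
    """
    target = retention_deadline(import_time)
    cur = conn.execute(
        "UPDATE metadata SET retained = ? WHERE message_id = ? AND retained < ?",
        (target, message_id, target),
    )
    conn.commit()
    return cur.rowcount
```

    Calling `extend_retention` for each message that was skipped as a duplicate during user B's import would give those messages a fresh 4 year window, while the `retained < ?` guard avoids shortening the retention of anything that already expires later.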

  6. HgoAlv reporter

    Ok, I understand that we are trying to use Piler in a way it was not designed for. We'll try to work around our difficulties as you describe.

    Thanks.
