pilerimport: not informing about failed imports

Issue #63 resolved
Peter Molnar created an issue

I have a strange issue, and the pilerimport failure is only part of it. I have tons of .pst files that have to be extracted and imported. The weird part comes from readpst, and I am not sure you can help with that. When I extract messages from a given pst file, in most cases the resulting email file contains Status: RO in the first or second line. When I import such a file with pilerimport, it claims the file was processed, but the maillog shows the following entry and I can't find the email: Jan 21 14:02:35 testpc pilerimport[14947]: 4000000050fd3c75250bc99cdf9d42be8a1e: found message_id:null(4) null=0

My questions: Could pilerimport show when there was a problem? Do you have any idea what went wrong in readpst? I know this is not your responsibility, but I haven't found anything on the net...

Comments (11)

  1. Janos SUTO repo owner

    Having a "Status: RO" in the 1st or 2nd line looks strange to me. Please check the mail header, and verify that it's a valid message with a correct header (message-id and the usual fields) and a body.

    pilerimport reports a 'null' message-id if there's none or it couldn't extract the info properly.
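The header check suggested above can be sketched in Python. This is only an illustration using the standard library `email` package, not piler's own parser; the helper name `has_message_id` is hypothetical, and piler's extraction logic may accept or reject messages differently.

```python
from email import message_from_bytes


def has_message_id(raw: bytes) -> bool:
    """Return True if the raw message carries a non-empty Message-ID header.

    Hypothetical helper for pre-checking extracted files; it does not
    reproduce how piler itself parses headers.
    """
    msg = message_from_bytes(raw)
    # Header lookup is case-insensitive; get() returns None when absent.
    value = msg.get("Message-ID")
    return value is not None and str(value).strip() != ""


# A "Status: RO" first line is still parsed as an ordinary header here.
with_id = b"Status: RO\r\nMessage-ID: <abc@example.com>\r\nSubject: hi\r\n\r\nbody\r\n"
without_id = b"Status: RO\r\nSubject: hi\r\n\r\nbody\r\n"
print(has_message_id(with_id), has_message_id(without_id))
```

Running this over the extracted files before calling pilerimport would list the candidates likely to be logged with a null message-id.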

  2. Janos SUTO repo owner

    Anyway, I can modify the imap part to print an error message if the import process itself gives an error. I wonder whether to keep the downloaded message that failed. Perhaps it's a good idea, since a message like 'error importing INBOX-3267' gives very little clue about what went wrong.

  3. Peter Molnar reporter

    I think I have narrowed down the problem. Messages exported from Sent Items typically cause the problem above. I tried not only readpst but also Save as... in Outlook and other mail extractors, with the same result. I think these mails have no message-id. Do you have any idea how I can import them? I think they are fine when they were also sent to another mailbox that is imported, but when an email left our system, for example when it was sent only to an outside recipient, it cannot be imported.

  4. Janos SUTO repo owner

    There's a piler.conf archive_emails_not_having_message_id. By default it's set to 0, meaning that a message without a message-id won't be archived. However, if you set it to 1, then piler assigns the piler_id as the message-id, so you can import them.

  5. Janos SUTO repo owner

    Err, I wanted to say there's a piler.conf variable called archive_emails_not_having_message_id ... :-)

  6. Peter Molnar reporter

    Yes, I think that was the problem. Only one question: if I import these particular emails and the system assigns a generated message-id to them, does that mean they will be duplicated when I import them again?

  7. Janos SUTO repo owner

    If you import a message without a message-id twice, then it will be imported (and stored and archived) twice, since there's no way to tell that it's the same. Perhaps the importing routine could check the sha256 digest of the whole message, and that way drop the duplicates created by importing the very same message more than once.

    However, adding another index may increase the table size and slow down the insert query. If you really need this, then we'll figure something out.
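The digest idea from this comment can be sketched as follows. This is a minimal illustration, not piler's implementation: the function names are hypothetical, and an in-memory set stands in for the extra database index mentioned above.

```python
import hashlib


def message_digest(raw: bytes) -> str:
    """SHA-256 hex digest of the whole raw message, headers and body."""
    return hashlib.sha256(raw).hexdigest()


def drop_duplicates(messages):
    """Keep the first copy of each byte-identical message.

    `seen` plays the role of the digest index; a real archive would
    have to persist it, which is the table-size cost discussed above.
    """
    seen = set()
    unique = []
    for raw in messages:
        digest = message_digest(raw)
        if digest in seen:
            continue  # identical bytes were already imported
        seen.add(digest)
        unique.append(raw)
    return unique
```

Note that this only catches byte-identical copies. If two exports of the same mail differ in any header (for example an added Status line), their digests differ and both would still be imported.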

  8. Janos SUTO repo owner

    I modified piler to issue a warning and to keep the message that could not be imported via imap.

    Please check it, and reopen this issue if it doesn't meet your expectations.
