Custom date header possible?

Issue #629 resolved
Marnus van Wyk created an issue

Morning

I've built a little Outlook plugin to slowly (invisibly) spool email from mounted PSTs in Outlook, from over 450 mailboxes, into mailpiler via SMTP while my users are on the LAN. I'm dealing with about 800 GB of PSTs. It will take about 4 weeks, and that is fine (10 years of email in 4 weeks = bonus); eventually it will all get there.

I'll put the code on Bitbucket when I've solved the final problem below; it is pretty useful.

I decided to spool the Outlook items into new SmtpClient messages and send them to mailpiler, and the problem is that I end up with mail from 2010 archive files recorded as sent in 2015. There is just no way to override the Date: header set by SmtpClient, so my options are: a custom SMTP client (which makes the add-in untrusted in Outlook, so it doesn't work well), or changing it to an EML export -> smbd share, which splits the solution into two different parts, so something else might go wrong and I won't know whether a specific PST is fully archived.

My suggestion is simple, and I think it might be useful for others too.

Set an X-Received-Date or X-ToPiler-Date header, with the same format as Date:. If it is present in the mail header, ignore the Date: header when archiving the email and use the other one instead.

This is thus removed:

    message.Headers.Set("Date", aMail.ReceivedTime.ToString("ddd, dd MMM yyyy HH:mm:ss Z",CultureInfo.InvariantCulture)); //overwritten by smtpclient

and done like this instead:

    message.Headers.Set("X-Archive-Date", aMail.ReceivedTime.ToString("ddd, dd MMM yyyy HH:mm:ss Z", CultureInfo.InvariantCulture));

Here they say that Date: will be overwritten:

https://msdn.microsoft.com/en-us/library/system.net.mail.mailmessage.headers(v=vs.100).aspx

Any suggestions/thoughts?

As always, thank you very much for your continued work on mailpiler!

Comments (7)

  1. Marnus van Wyk reporter

    Update on saving EML files: this is a total pain. It seems I'll have to go extremely wide and far to extract the mail item properties, many of which are null, and then literally build an EML file from scratch, which means the archived email is no longer the original either.

    A custom date header would be much, much appreciated.

        X-Sender: bla@bla.com
        X-Receiver: alb@alb.com
        Date: 5 Nov 2015 10:07:05 +0200
        X-Archive-Date: Tue, 30 Nov 2010 09:47:57 +0200
        Message-ID: ACA59B3DCD84BF42A7A1981A314217BA01E1158E@ivolml01.ivolve.lan
        MIME-Version: 1.0
        From: bla@bla.com
        To: alb@alb.com
        Subject: FW: iVolve - QL053159 - IVOLVE00165 - A-P - ThinkPad T410 Core i7
        Content-Type: multipart/mixed;

  2. Janos SUTO repo owner

    Hello Marnus, piler can acquire emails via 2 methods: the piler daemon by smtp, and the pilerimport utility by pop3, imap, eml, ... The piler daemon expects to receive emails in real time (so to speak), and it has a built-in check that overrides the Date: value if it is out of sync with the current time, i.e. outside -7 ... +1 days. Unfortunately, in your situation (sending old emails via smtp) this messes with the dates.

    Another user sent me a patch long ago to use the date in the Received: lines (but I was so lazy, and I even forgot about it, shame on me). It's probably simpler if we just loosen the date check so it applies to future dates only, i.e. override only bogus dates set more than 1 day ahead.

    So I suggest trying this, and if it works you just saved some work for checking an unusual date. To do so, edit src/parser.c; around line 400 you should find something similar to this:

                  /* allow -1 week ... +1 day drift in the parsed Date: value */
                  if(sdata->now - sdata->sent > 604800 || sdata->sent - sdata->now > 86400) sdata->sent = sdata->now;
    

    Change it to the following:

                  /* override only Date: values more than +1 day ahead of the current time */
                  if(sdata->sent - sdata->now > 86400) sdata->sent = sdata->now;
    

    Then recompile piler, and test it. Btw, this Outlook plugin seems promising.
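
For readers who want to see the effect of that change in isolation, here is a minimal sketch of the two rules as standalone C functions. The function names, the `drift_demo` helper and the constants are made up for illustration and are not part of piler; only the two conditions are taken from the snippets above.

```c
#include <assert.h>
#include <time.h>

#define MAX_PAST_DRIFT   604800  /* 1 week in seconds */
#define MAX_FUTURE_DRIFT 86400   /* 1 day in seconds */

/* Original rule: fall back to "now" when the parsed date is more than
   a week old or more than a day ahead. */
time_t clamp_sent_strict(time_t sent, time_t now){
   if(now - sent > MAX_PAST_DRIFT || sent - now > MAX_FUTURE_DRIFT) return now;
   return sent;
}

/* Relaxed rule: only dates more than a day ahead fall back to "now",
   so old (archived) mail keeps its original date. */
time_t clamp_sent_relaxed(time_t sent, time_t now){
   if(sent - now > MAX_FUTURE_DRIFT) return now;
   return sent;
}

/* Hypothetical usage: compare both rules on an old and a future date. */
void drift_demo(void){
   time_t now = 1446710825;   /* example "current time" as a Unix timestamp */
   time_t five_years_ago = now - 5L * 365 * 86400;
   time_t two_days_ahead = now + 2L * 86400;

   assert(clamp_sent_strict(five_years_ago, now) == now);             /* date lost */
   assert(clamp_sent_relaxed(five_years_ago, now) == five_years_ago); /* date kept */
   assert(clamp_sent_relaxed(two_days_ahead, now) == now);            /* still rejected */
}
```

With the strict rule a message from five years ago is stamped with the current time, which is the behaviour reported in this issue; with the relaxed rule it keeps its own date, while dates from the future are still replaced.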

  3. Marnus van Wyk reporter

    I was just peeking in parser.c actually. I think I can manage the changes and I'll let you know, but you can mark this as solved anyway, since it is probably very specific.

    The plugin is nearly done. 700 lines of VSTO hell. Here are the comments from the top of the source, which give some detail.

        //The Idea : Find a way to get all mounted active, used and abused PST's into Piler, while people are continuing their normal work, without them noticing anything at all.
    
        //The loop timer is very dependent on organisational size. Suggestions are -> 100 mailboxes -> 10 seconds, 200 -> 15, 300 -> 20, 400 -> 25
        //For 400 users for instance at 25 seconds per email per mail user, it means 16 emails A SECOND will be sent to the archiver smtp. Make sure it can handle that, or increase timer.
        //My calcs for about 700Gb of PST's says it will take 5 weeks. The PST's are not going anywhere, and eventually everything will be on there.
        //This runs in the background without the user noticing. There is a slight 2 second start up lag as the addin is switched on, but otherwise it is very hard to notice on my i5 laptop.
        //It will detect and safely disable itself when the machine is taken off the net, it will only run if the network roundtrip time for the smtp receiver is less than MAXIMUM_ROUNDTRIP. If you have
        //huge pipes, but higher latency you definitely want to increase the time, but for us, i didn't want people on Mobile 3G $$$ expensive accounts archiving email at all when at home.
    
        //What it does : Finds all mounted PST's, waits TIMERINTERVAL, picks one, enumerates a folder/child, gets a count, selects the first email that is not categorised as "Archived" and sends
        //it to the smtp for piler, mark it, wait TIMERINTERVAL, filter, send, mark it, wait, filter (surprisingly, this is very quick!), send, mark it, wait, etc etc, when it gets to zero count on the restrict, 
        //the enumeration runs to the next folder, and the process restarts.
        //when it is done with all the folders in the pst, it then increments the pst counter by 1 and the whole thing restarts again with enumeration and filter.
        //If the outlook opens and closes, it will start at pst 0 again, and it will just filter filter filter and then move on to the next pst again.
        //How outlook decides the order of the PST's i have not figured out, it is never the same ordered list. It doesn't matter, as long as it archives whatever is in there until it is done.
        //It needs to go into every PST over and over again anyway to check that the user didn't add or move an email to any PST, and it needs to see any new surprise PST's.
    
        //There is no simple way to deal with embedded attachments on mapi emails, the available methods suck ass, and you HAVE TO save the attachment first, so they are then reattached
        //as normal attachments into the new mail message.
    
        //my take on current OST email is that they will eventually be archived into PST's and picked up, every mail will be in the piler since journalling plus a gap
        //that is between ost and pst, those will go into pst eventually (as mailbox fills up the user will move them) and, piler will reject duplicates, AND this plugin marks 
        //previously handled emails as well.
        //So the piler won't fall over dealing with duplicates over and over, and any new email in an existing PST will be detected, and you're journalling anyway.
        //Thus, at some point you can disable the plugin very safely if you had journalling running long enough.
    
        //if a lot of the try catch blocks don't make sense, wait till you try it yourself, outlook interop is EXTREMELY crap.
        //For example you have to deal with blank to fields, to fields with "undisclosed-recipients;",to fields that are X400, to fields with X400 names that don't resolve any more
        //to fields on X400 addresses that don't have smtp addresses, X400 to fields from old discontinued domains. to fields with names, to fields with groups, to fields that are just null, etc etc...
        //This plugin does, however, try extremely hard to get an answer close to the original outlook item
    
        //since users tend to move mail everywhere they please, i run through ALL folders, including deleted items and drafts. Drafts are a special kind of hell so i default
        //them to the current profile logged in user smtp address (it should be his PST after all) if either or both the to and from fields are blank. (Yes it can happen)
    
        //To Do : 
        //Some users might have private email accounts pointing to their private PST's on their exchange configured outlook as additional accounts. 
        //Currently these will be picked up and archived (and we don't want that). Will add a restrict filter expansion that filters to DEFAULT_DOMAIN only (to and/or from) fields.
    
       // think about marking a PST that is fully archived as ARCHIVED, and maybe even mark it READ ONLY with a password? so we can safely ignore it, and make sure a user
       //can't add more to it.
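
The load estimate in those comments (400 users at 25 seconds each giving 16 emails a second) can be checked with a small calculation. This is only a sketch: `messages_per_second` is a hypothetical helper, and it assumes every mailbox sends exactly one message per timer interval.

```c
#include <assert.h>

/* Messages per second arriving at the archiver when every mailbox
   sends one message per timer interval (assumption for this sketch). */
double messages_per_second(unsigned mailboxes, unsigned interval_seconds){
   assert(interval_seconds > 0);   /* a zero interval makes no sense */
   return (double)mailboxes / (double)interval_seconds;
}
```

With the suggested timers, 100 mailboxes at 10 seconds gives 10 messages per second, 200 at 15 seconds about 13.3, 300 at 20 seconds gives 15, and 400 at 25 seconds gives 16, so the suggested intervals keep the load on the archiver in roughly the same range as the organisation grows.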
    

    Have a good day and a great weekend.

  4. Marnus van Wyk reporter

    I've modified a copy of parser.c to include the following section above the Date: parsing. My C++/C skills are "very scary" when modifying other people's code, so I just copied the Date: branch and added my own date header above it with exactly the same code. It needs to work exactly the same way anyway (pretend it is "date:").

    Working fine to spool historic emails in as a previous date via SMTP!

    new_header1.png

    The C# code for the header looks like this:

    message.Headers.Set("X-Piler-Original-Date", aMail.ReceivedTime.ToString("ddd, dd MMM yyyy HH:mm:ss zzz", CultureInfo.InvariantCulture));

    The new parser.c looks like this (I will fix the spacing :( ): new_parser.c.PNG

    So if the new custom header does not exist, sdata->sent == 0 and life goes on as usual. If that header exists, the date is set from it, and the Date: else-if branch is then skipped because the date is already set.

    I did take your "drift" suggestion into account and removed the "too far in the past" test.
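
Since the actual change is only visible in the screenshot, here is a rough sketch of the precedence logic described above. It is not the real patch: `struct date_state`, `handle_header_line`, `has_prefix_ci`, `precedence_demo` and the stub parser are all hypothetical names, the stub reads a plain Unix timestamp instead of a real date string, and the sketch lets the custom header win regardless of header order, which is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <time.h>

struct date_state {
   time_t sent;         /* 0 means "no date seen yet" */
   int from_override;   /* 1 once the custom header has set the date */
};

typedef time_t (*date_parser)(const char *value);

/* Stand-in for a real date parser: reads a Unix timestamp. */
time_t parse_epoch_stub(const char *value){
   return (time_t)strtol(value, NULL, 10);
}

/* Case-insensitive check that line starts with prefix. */
static int has_prefix_ci(const char *line, const char *prefix){
   for(; *prefix; line++, prefix++){
      if(tolower((unsigned char)*line) != tolower((unsigned char)*prefix)) return 0;
   }
   return 1;
}

/* Feed one header line; the custom header takes precedence over Date:. */
void handle_header_line(struct date_state *st, const char *line, date_parser parse){
   static const char override_hdr[] = "X-Piler-Original-Date:";
   static const char date_hdr[] = "Date:";

   if(has_prefix_ci(line, override_hdr)){
      time_t t = parse(line + sizeof(override_hdr) - 1);
      if(t != 0){ st->sent = t; st->from_override = 1; }
   }
   else if(has_prefix_ci(line, date_hdr) && !st->from_override){
      st->sent = parse(line + sizeof(date_hdr) - 1);
   }
}

/* Hypothetical usage: Date: is used until the custom header shows up. */
void precedence_demo(void){
   struct date_state st = {0, 0};

   handle_header_line(&st, "Date: 1446710825", parse_epoch_stub);
   assert(st.sent == 1446710825);
   handle_header_line(&st, "X-Piler-Original-Date: 1291103277", parse_epoch_stub);
   assert(st.sent == 1291103277);
   handle_header_line(&st, "Date: 1446710825", parse_epoch_stub);
   assert(st.sent == 1291103277);   /* a later Date: no longer overrides it */
}
```

A message without the custom header goes through the Date: branch only, so normal real-time traffic would be unaffected in this sketch.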

    It is a shame that .NET does not allow overriding the Date: header at all, but at least it is possible now to spool in email as any date using a specific header.

    Thank you for your time and valuable suggestions!

  5. Janos SUTO repo owner

    I think I'll fix the parser to drop the "too far in the past" test and only check that the date is not more than 2 days newer than the current time. That should be a permanent solution, and it also allows me to get rid of some related hacks.
