add displayName and CN to user email addresses for ldap/sso users

Issue #130 closed
datapharmer created an issue

In an Exchange environment, internal emails often show the To field as the display name registered in Outlook instead of the full email address, e.g. "John Smith" instead of "johnsmith@domain.com".

In the case of SSO or LDAP authentication, where a user's email addresses are automatically registered to them, this means that messages addressed to their display name do not appear in their search results (but do appear to an auditor). By adding the displayName and/or CN attribute from LDAP/Active Directory to the list of their email addresses, they would be able to view these messages as expected.
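The request above could be sketched roughly as follows. This is only an illustration, not piler's actual code: the function name `identities_for_user` and the dict-shaped LDAP entry are assumptions made for the example. Only the attribute names `mail`, `displayName` and `cn` come from LDAP/Active Directory itself.

```python
def identities_for_user(entry):
    """Return the values a user could be matched on in search.

    `entry` is assumed to be a dict mapping LDAP attribute names to
    lists of string values (a hypothetical shape for this sketch).
    """
    identities = []
    # mail holds the real addresses; displayName and cn are the
    # extra values this issue proposes to add.
    for attr in ("mail", "displayName", "cn"):
        for value in entry.get(attr, []):
            value = value.strip().lower()
            # Skip empty values and duplicates (displayName often equals cn).
            if value and value not in identities:
                identities.append(value)
    return identities


entry = {
    "mail": ["johnsmith@domain.com"],
    "displayName": ["John Smith"],
    "cn": ["John Smith"],
}
print(identities_for_user(entry))
```

Lowercasing and de-duplicating keeps the list short when displayName and cn hold the same value, which is common in Active Directory.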

Comments (16)

  1. Janos SUTO repo owner

    What about the journaling headers? They should contain email addresses (e.g. Recipient: email@address) even for internal emails. Anyway, the parser keeps only email addresses and omits the rest when building the recipient email list.
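A minimal sketch of that behaviour, assuming a simple regular expression stands in for the real parser (piler's parser is written in C and is not reproduced here; `EMAIL_RE` and `recipient_addresses` are hypothetical names):

```python
import re

# Simplified address pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def recipient_addresses(header_value):
    """Keep only the email addresses found in a header value."""
    return [match.lower() for match in EMAIL_RE.findall(header_value)]


# A To: field holding only a display name contributes no address,
# while a journal-style Recipient: value does.
print(recipient_addresses("John Smith"))
print(recipient_addresses("johnsmith@domain.com"))
```

Under this assumption, a message whose To: field is just "John Smith" is only findable by address if the journal headers supply the recipient.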

    To verify this, please check the v_messages view in the piler database:

    select * from v_messages where piler_id='40000xxxxxx';

    where 40000xxxxxx is the piler id you can see in the mail logs, and check whether johnsmith@domain.com is there.

  2. datapharmer reporter

    OK, so I looked at these more closely and they appear to be garbage emails: some weird artifacts from the import process (I'm not yet sure whether they come from processing the PST journals in readpst or from pilerimport). The important information shows up for the user, but the auditor gets some additional garbage messages too.

    They look like this:

    From: alerts@binarycanary.com alerts@binarycanary.com
    Subject: BinaryCanary.com PASS: The website has recovered successfully!
    To: User Name
    Date: Wed, 24 Jul 2013 10:57:57 +0000
    Message-Id: ef394472-3379-470d-80d0-405069764697@journal.report.generator
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="--boundary-LibPST-iamunique-2024623150_-_-"

    Sender: alerts@binarycanary.com
    Subject: BinaryCanary.com PASS: The website has recovered successfully!
    Message-Id: 615716099.407953.1374663065453.JavaMail.cfuser@127.0.0.1
    Recipient: user@domain.com

    ----boundary-LibPST-iamun

    And the database comes back empty when you search for it:

    mysql> select * from v_messages where piler_id = '4000000052153bd23a27b00c0099edffd9c3';
    Empty set (0.02 sec)

    Maybe the best solution is just for me to create an archive rule for ----boundary-LibPST-iamun and call it a day! Thanks for all the help!

  3. datapharmer reporter

    On second thought, while it seems obvious this is from readpst (LibPST, doh!), I'm not sure how to create an archive rule based on the message body to filter these out...

  4. Janos SUTO repo owner

    --boundary-LibPST-* comes from libpst for all extracted EML files, so I don't think it's a good idea to discard emails having such boundaries.

    However, regarding the previous topic (DN or journal recipients), please send an internal email that will have a similar To: field, and let's check it again.

    I'm not sure whether pst extracted files have the journal info any more.

  5. datapharmer reporter

    Yes, I checked earlier and the journal message that is extracted by readpst should contain the full email address in the headers, but for some reason a chunk of these messages are coming through with little or no content. I will look into the source of these mystery messages a little further next week and update you once I know more.

  6. datapharmer reporter

    OK, I've reindexed things and looked around some more to try and figure out what is happening. Here are my findings so far:

    1. Users who are not auditors and use advanced search on to: (the user's email address, name, exactly what is in the header, etc.) get either no results or results that don't belong to them.
    2. Users who get results that don't belong to them see a "no permissions to xxxxxxxxx" message, where xxxxxxxxx is the piler id of the message.
    3. Viewing the headers and message as an auditor confirms the message is unrelated to the user and cannot be found by searching that user's messages. The search results' "to" and the header information all reflect the correct user.
    4. In all cases that I have found, the odd messages contain "--boundary-LibPST-xxxxxxx" as part of the message (xxxxxxx is a GUID-looking value).
    5. Searching as an auditor for LibPST (with inlen set up to find this) or for a complete LibPST string yields no results.

    Example headers of problem message:

    From: webmaster@localhost.domain.invalid webmaster@localhost.domain.invalid
    Subject: forgot your password for HP ePrintCenter?
    To: [REDACTED]
    Date: Mon, 25 Jul 2011 17:13:57 +0000
    Message-Id: b42bc0fe-2fb0-4894-83cf-8102837c1c51@journal.report.generator
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="--boundary-LibPST-iamunique-1907283411_-_-"

    So, to summarize my questions about this:
    - If this is a problem with libpst creating garbage, how can I get rid of it?
    - Why does a non-auditor user see this garbage but not the expected results when searching using the "to" field?

    Other note: system is currently configured to use SSO.

  7. Janos SUTO repo owner

    Strange, indeed. First, let's check whether the given user has proper email addresses assigned by looking at the settings menu.

    Then take such a message that is present in the search result, but the user has no permissions on it. Run pilertest on it to see what piler thinks about the message. Check for To and From fields.

    Finally let this user enter the same search query again, and let's verify that the sphinx search query is built properly.

  8. datapharmer reporter

    1. Yes, the proper addresses are assigned to the user.

    2. Pilertest gave the following:

    message 1:
    message-id: b42bc0fe-2fb0-4894-83cf-8102837c1c51@journal.report.generator
    from: webmaster@localhost.domain.invalid webmaster localhost domain invalid webmaster@localhost.domain.invalid (localhost.domain.invalid)
    to: redactedlastname katy ()
    reference:
    subject: forgot your password for HP ePrintCenter?
    body: *Sender Subject forgot your password for HP ePrintCenter Message-Id Recipient katyrXredacteddomain.com *
    sent: 1311628437, delivered-date: 0
    hdr len: 365
    body digest: 0190fbe4ca7830c8ac1e416c9acab2ca8a2992bf150645f12753d625e5b32b59
    rules check: (null)
    retention period: 1605205416
    attachments:
    direction: 0
    spam: 0
    NOT IN mydomains

    3. Confirmed that the search query can be replicated: from: localuser's name to: accountholder's name

  9. datapharmer reporter

    Nov 20 16:12:39 abcmp1 piler-webui[18392]: sphinx query: 'SELECT id FROM dailydelta1,main1,main2,main3,main4 WHERE MATCH('(@from [username-redacted]mX[domain-redacted]Xcom| [username-redacted]2X[domain-redacted]Xcom | @to [username-redacted]mX[domain-redacted]Xcom| [username-redacted]2X[domain-redacted]Xcom)') ORDER BY sent DESC LIMIT 0,1000 OPTION max_matches=1000' in 0.05 s, 1000 hits
    Nov 20 16:12:41 abcmp1 piler-webui[18392]: sphinx query: 'SELECT id FROM dailydelta1,main1,main2,main3,main4 WHERE MATCH('@from INVALID') ORDER BY sent DESC LIMIT 0,1000 OPTION max_matches=1000' in 0.00 s, 2 hits
    Nov 20 16:12:44 abcmp1 piler-webui[18392]: sphinx query: 'SELECT id FROM dailydelta1,main1,main2,main3,main4 WHERE MATCH('@from INVALID') ORDER BY sent DESC LIMIT 0,1000 OPTION max_matches=1000' in 0.00 s, 2 hits
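For readers following the log, the first query restricts results to the user's own addresses in either the from or the to field. The shape of that expression can be sketched as below. This is inferred only from the logged query, not taken from the webui source: `to_token` and `build_match` are hypothetical names, and the rule of replacing "@" and "." with "X" is an assumption based on the tokens visible in the log.

```python
def to_token(address):
    # Assumption: "@" and "." become "X", as the logged tokens suggest.
    return address.replace("@", "X").replace(".", "X")


def build_match(addresses):
    """Build a MATCH expression limiting hits to the given addresses."""
    tokens = " | ".join(to_token(a) for a in addresses)
    return f"(@from {tokens} | @to {tokens})"


print(build_match(["johnsmith@domain.com", "jsmith2@domain.com"]))
```

The second and third log lines carry no such restriction, only `@from INVALID`, which is the part worth comparing against this shape.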

  10. Janos SUTO repo owner

    Can you download the latest master branch and set up a new virtualhost to try it? Don't upgrade the binaries; just copy the webui directory and make it available as a new test virtualhost. Then let's see if this issue persists.

    Note that you should copy the $config['SPHINX_MAIN_INDEX'] from your current config*.php, since it has changed.

  11. Janos SUTO repo owner

    No news is good news. Let me know if you have the chance to try this with a recent build. Until then, case is closed.
