GDPR contradiction? ENABLE_DELETE vs. GDPR right to erasure

Issue #996 resolved
Waldemar Hamm created an issue

Your GDPR related notes ( http://www.mailpiler.org/wiki/current:gdpr-related-notes ) state correctly, that there may come a time when you need to “remove a message if it contains sensitive personal data” and for this you have to turn on the “delete feature”.

By “delete feature” I assume you mean ENABLE_DELETE. If you turn this on you get this warning, though: https://bitbucket.org/jsuto/piler/src/505e5cb5686dafe83fd66b7d9fc0e5f8026e1de0/webui/language/en/messages.php?at=master#lines-485

Now this is really confusing. If I leave a mail archived that shouldn’t be archived because someone who sent me a mail made a mistake and then rightfully asks to delete the mail completely (or change retention rules of that single mail) I will have a GDPR non-compliant mail archive. But by the logic of piler I will also have a non-compliant mail archive as soon as I want to use the “delete feature” to actually achieve compliance with GDPR.

This is a contradiction that should be solved, or am I missing something here?

I hit this problem when thinking about what to do if mails that should stay archived for less than the default retention rule dictates are sent to a wrong mail address that has longer lasting retention than allowed for the kind of data that got transmitted. This hits an extreme case when you get mails that aren’t even allowed to be archived at all, because someone messed up their To: address.

Maybe there’s a workaround that I’ve missed?

Comments (22)

  1. Waldemar Hamm reporter

    @eXtremeSHOK Maybe you also have a say in this, since you helped shape the ENABLE_DELETE feature and are also located in Germany (if I’m not mistaken from our encounter on Mailcow).

  2. Janos SUTO repo owner

    Well, you are right, there are conflicting interests. On one hand a company would like to archive all emails touching the mail server (probably assuming that there are only company emails). On the other hand an employee wants to get rid of his personal email that ‘accidentally’ landed in the company mailbox.

    Such situation could be handled by an auditor. An employee asks him to delete this specific email, and he does so. However, I believe that such practice (even though it may be necessary) opens the door for removing just any email from the archive. The message you referred to meant to warn about it. If you think it’s inaccurate in your situation, you may replace it with an empty string.

    Anyway, it’s a difficult to handle issue. If you have a better text in mind, let me know, and I may update it.

  3. Waldemar Hamm reporter

    It’s not so much about employees getting rid of their personal mail but more about customers for example sending applications not to applications@domain.tld but to invoices@domain.tld which have a totally different retention time (no archive/few months vs 10 years). If a customer makes a mistake you’ll have to change the retention time or delete the mail without the customer even asking for it. But a customer can ask for deletion of any personal data and you will have to comply.

    Actually ENABLE_DELETE or a way of changing the retention data for single mails or in bulk should be on by default for GDPR-related countries. The correct way IMHO to handle this is providing detailed information (excluding identifying data like mail addresses of the From: address) about the deletion in the revision log, so that an external auditor can verify the deletion happened because of GDPR regulations. Reasons for deletion should have to be provided.

    Even better would be a way of changing the retention data (which would imply deletion if you set it to 1 day or something), so you can change retention of mails that were wrongly filed (because customers messed up the address they sent the mail to for example) - it should be possible to do this in bulk. This, together with required provision of a retention data change reason which is logged in the revision log, would be an ideal solution IMHO.

    I don’t think that any warning wording would be necessary. Deletion of single or even chosen bulk mails due to GDPR compliance is a necessary tool to have a GDPR-compliant mail archive IMHO. In my eyes running piler without ENABLE_DELETE in a GDPR-enabled country is actually the non-compliant version currently.

  4. Janos SUTO repo owner

    OK. I’ll fix it, so there won’t be “the archive is no compliant” message, and add a mandatory field before accepting the removal.

  5. Waldemar Hamm reporter

    By mandatory field before accepting the removal you mean something like reason=? which will also be displayed in the revision log? That would be great!

    • Bonus points if it would be possible to make the deletion a process that would optionally need approval of multiple users (admin/auditor + data security officer would be just perfect - by the way: a data security officer role that is allowed to enforce GDPR-driven deletion/retention time change by default would be ideal - data security officers enjoy a special standing within a company and get all the necessary law-considerations to enforce data security even if the CEO wouldn’t like it, without the fear of getting fired instantly).
    • Also huge bonus points if instead of simply deleting a mail you could instead change the retention data manually instead of plain deletion - there would be no need to set an appointment for manual deletion or forwarding to another mail address for correct archiving then if a mail was wrongly addressed to an address that is bound to a smaller retention time than the address it was meant for. Only with the discussed revision logging, of course. (Old retention time + reason of change)
    • Even more bonus points if you can do this in bulk for multiple mails - consider this rather common use-case for GDPR-aware people: john.doe@customer.com doesn’t want to be your client anymore and demands that you delete all data you have of him (including mails). Now you need a way to enter your mail archive and delete all mails that are from and to john.doe@customer.com but if and only if other laws (like tax laws) don’t require you to keep the mails. So basically you will need a way of filtering the mails and checking off the mails that need to be deleted and then perform a bulk operation. If you need to look through hundred mails and delete them mail by mail you’ll go insane.

    What about the addresses involved for the mail? I’m not sure yet if even a partial masking of the mails in the revision log would suffice. Theoretically someone could run a mail server with a specific domain as a single person/user - only the domain of the address could then be used to determine which natural person sent the mail, thus you wouldn’t delete all personal data when you delete the mail. It would live on in the revision log.

    The only safe option in my head seems to completely mask away mail-addresses that belong to external domains and mask away at least the user-portion of internal/known domains for deletions/retention time changes in the revision log. Sure, an auditor can almost not tell anything about the mail that was deleted from the revision log then, but that’s the contradiction we are facing with the GDPR. 🤷‍♂️
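The masking rule described above could be sketched roughly as follows. This is only an illustration, not piler code: the function name `mask_address`, the `internal_domains` parameter, and the `***` placeholder are all assumptions made for the example.

```python
from __future__ import annotations


def mask_address(address: str, internal_domains: set[str]) -> str:
    """Mask an e-mail address before it is written to a revision-log entry.

    Hypothetical helper, not part of piler.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        # Not a well-formed address: hide it entirely.
        return "***"
    domain = domain.lower()
    if domain in internal_domains:
        # Internal/known domain: hide only the user portion.
        return "***@" + domain
    # External domain: hide everything, because the domain alone
    # may identify a natural person.
    return "***@***"
```

With `internal_domains={"company.com"}`, an external sender such as john.doe@customer.com would be logged without any identifying part, while an internal address would keep only its domain.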

    EDIT: IMHO these features would significantly raise the GDPR-compliance of Piler which should in turn raise the awareness of this project in the EU.

  6. Waldemar Hamm reporter

    Thank you very much Janos! Will definitely drop some € your way if you can make at least a significant portion of my suggestions happen. Do you have a PayPal donation set up?

  7. Janos SUTO repo owner

    I had some progress in the meantime. There’s a new role: data officer. There’s a list of pending messages to be deleted (eg. request date, user, description of why remove this message, and the message number).

    Now (at least for now) the data officer doesn’t have auditor privileges, so he can’t see the message. Do you think it’s ok to show him the message number and the request description why remove such message, and then allow him to delete the given message?

  8. Waldemar Hamm reporter

    Hi @Janos SUTO , the data officer should be able to verify if the deletion request is legitimate. I think we can safely assume that for the sake of proper deletion the data officer should be able to view the whole E-Mail in question that’s to be deleted. That way the data officer can verify that it’s not a mail that might serve as proof for, let’s say illegal activities, that someone else in the company or even an outsider (possibly via hacking of a random user account in the company) tries to get deleted.

    In accordance to this, the data officer can also check if there might be legal reasons that work against a deletion. It’s the data officer’s job to know if the content of a mail creates a legal obligation to archive the mail for X months/years regardless of a users/customers wishes. This is only possible if the data officer has full access to the mail, so you can legitimate full view of the mail by Article 6, 1c and 1d of GDPR.

    Other than that

    • request date,
    • user that requests deletion and
    • reason for removal

    are good data fields. If possible I’d recommend not only a text field for the reason but also an optional data field where you can upload a file, like PDF, as proof of reason.

    After deletion the auditor, on the other hand, should be able to see

    • request date,
    • user that requests deletion and
    • reason for removal and/or
    • file upload and
    • the username of the data officer that approved of deletion and
    • the date of this approval and the deletion. (Big companies have multiple data officers; an auditor should be able to see who signed off the deletion.)

    All of this is IMHO of course.

  9. Janos SUTO repo owner

    Thanks for the feedback. Your point regarding the data officer makes sense. Currently only auditors are allowed to promote a message for deletion; regular users aren’t. Anyway, I’ll try to apply your ideas to the feature. However, for starters I’d rather not complicate it with a file upload feature for attaching documents. We’ll look at that later.

  10. Waldemar Hamm reporter

    To me an auditor normally is an external person who wants to audit your stuff for legal reasons or due to regular checks. An “internal” auditor to me is essentially a data officer actually.

    It would make sense to use the auditor role internally for “elevated” users perhaps; this way you would have another pair of eyes to check if a deletion is a good idea. Perhaps this is a fitting role for a leading community manager or something.

    Attachment of documents might become a necessary thing, though. Without it you will need to keep the proof of legitimate deletion due to request elsewhere for an external audit. It would help to incorporate an encrypted, unchangeable file within the deletion request to keep the deletion request and the document that requested it in the same place.

    What if someone requests deletion of all his/her data via post mail? To follow this request you would also have to delete a lot of mails where personal data of that person was included and each and every time the written/printed document would be the proof of legitimate deletion request. You could write “See document XYZ.pdf in our deletion request folder on drive E:\” or something but it would be a lot more comfortable if you could just attach the file there.

    It's not necessary to overcomplicate this; why not be able to create a rule like "if mail receiving address is delete-request@piler.mydomain.tld and mail sender address contains known domain then save the message indefinitely together with its attachments (in a special folder perhaps?)” - this way you could bundle the delete reason and proof of legitimation in a process that’s already there → mail archiving.

    Workflow then would be like this:

    1. Someone asks for deletion of his/her data/mail or a mail went to a wrong address and thus receives a wrong retention time.
    2. User within known mail domains of your piler instance sends a mail to delete-request@piler.mydomain.tld (bonus points if you can customize this functional mail address) with details on what to delete and why (with possible attachment)
    3. Data officer gets a notification (if possible. Maybe via automatic forwarding?) and can approve or disprove the request

      1. If disproved, basically nothing happens; you could leave a textfield or something depicting that this request has been disproved.
      2. If approved, you get transported to a form where this mail is already chosen (or remembered) as delete reason and proof of legitimation and you need to choose the corresponding mail(s) that need to be deleted for it; if possible within a search form and with bulk options, because requests for deletion of several mails will happen.
    4. Auditor can see disproved and approved deletion requests together with which data officer decided on it and on what date.

    EDIT: Watch out for circular dependencies, though. It could happen, that mails to delete-request@ need to be deleted themselves at a later date or even instantly, because someone put sensitive data into the delete-request itself. Also mails to delete-request@ should be only visible to the data officer in full content view IMHO.

    EDIT2: Oh and if at all possible I’d really wish to not be able to only delete mails (aka set retention to 1 day from now) but set retention time manually (set retention to X days from now or specific date) for all the cases where a mail was archived with a wrong retention time due to human error.
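The workflow proposed above (request, decision by the data officer, visibility for the auditor) could be modelled roughly like this. It is a minimal sketch under assumptions: the class and field names (`DeletionRequest`, `decided_by`, and so on) are made up for illustration and do not describe piler's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DeletionRequest:
    """Hypothetical record for one deletion request."""
    requested_by: str
    reason: str
    message_ids: List[int] = field(default_factory=list)
    status: Status = Status.PENDING
    decided_by: Optional[str] = None
    decided_at: Optional[datetime] = None

    def decide(self, officer: str, approve: bool, message_ids=()) -> None:
        """Record the data officer's decision exactly once."""
        if self.status is not Status.PENDING:
            raise ValueError("request has already been decided")
        if approve and not message_ids:
            raise ValueError("an approval must name the mails to delete")
        self.status = Status.APPROVED if approve else Status.REJECTED
        self.message_ids = list(message_ids) if approve else []
        self.decided_by = officer
        self.decided_at = datetime.now(timezone.utc)


def audit_view(requests):
    """What an auditor would see: decided requests, who decided, and when."""
    return [
        (r.status.value, r.decided_by, r.decided_at)
        for r in requests
        if r.status is not Status.PENDING
    ]
```

An approval carries the list of mails to delete, which is where a bulk selection would plug in; a rejection leaves the archive untouched but still shows up for the auditor.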

  11. eXtremeSHOK

    @Waldemar Hamm

    I have a completely different view of the GDPR and I am more concerned with how it facilitates in allowing for the cover-up of fraud. eg. a user requests for their data to be purged from the archive, systems etc, with the current state of auditing (external) it would take a minimum of 2 years to uncover the fraud.

    All the information is wiped or essentially removed. So all the GDPR has done is allow someone to cover their tracks. It allows for an employee to resign and then request for all their data to be purged. It allows for a supplier or vendor to request all their data be purged.

    Some systems will remove or sanitize all data transactions, so with the emails also being purged there is no trace and no evidence. Think along the lines of state capture, it would take a minimum of 7 years for the data to be requested. By then its long gone, so GDPR is directly allowing companies and governments to cover-up fraud.

    My thoughts on how GDPR should be handled in email archiving systems.

    A special GDPR mode and role is required. This will allow for a users' emails to be hidden entirely from the system and stored separately with another layer of encryption, so they can only be unlocked and accessed by the data protection officer.

    So for all intents and purposes the emails are removed from the system, they cannot be viewed, searched, etc. But they are still available for future use, as long as they are unlocked. So for all intents these GDPR emails will become "binary blobs" and one can enforce DRM over these binary blobs which would override the GDPR.

    The data protection officer with a valid reason would be able to restore these "binary blobs" back into the emails on the archiving system. No one else would be able to. The data protection officer would not be able to view these protected emails unless they are unlocked, doing so would then place a marker in the system (who unlocked, reason and timestamp). So the only person violating the GDPR would be the data protection officer, and if there is a valid reason (fraud investigation, legal, etc) the GDPR violation would be invalidated.

    I am of the firm opinion, archive data should NEVER be deleted, irrespective.

    Generally its the only and last line of defense against corruption and fraud.

  12. Waldemar Hamm reporter

    @eXtremeSHOK I’m sorry but there are already pretty clear decisions on what “deletion” in context of GDPR means and this is at the very least complete anonymisation of data so that it’s not possible (also not with more tools and more data from elsewhere) to know who is linked to the content presented. Here’s a German read-up on this: https://www.e-recht24.de/news/datenschutz/11269-dsgvo-daten-loeschen-heisst-nicht-unbedingt-daten-vernichten.html

    Anonymization instead of “real” deletion could already alleviate some of your concerns but it’s not possible to do this the easy way in piler. Sure, you can abstract away mail addresses and everything but you can’t scan the mail content for personal data with a simple algorithm.

    Not deleting data even though GDPR forces you to is violation of the GDPR which can cost you a lot of money as we’ve seen with recent lawsuits in Great Britain. You can’t keep the data just because you might THINK it would be useful as proof against fraud at some point. If the data the data officer is looking at looks fraud-ridden, then that is the moment where action needs to be taken. That’s why the data officer is not bound by instructions of his company by law and can’t be fired on a whim.

    Declaring legal necessity of data officers starting from a certain company size that

    • work directly under the CEO
    • don’t have and shall not listen to his orders regarding data privacy and
    • can’t be fired on a whim

    is exactly the reason why GDPR is not that easy to misuse for fraud in companies with a size where fraud would hurt the community as you are making it look like.

    This is drifting into legal consultation which obviously none of us can provide; but from existing GDPR comments it’s pretty clear that “deletion” which doesn’t at least anonymize the data is definitely not a “deletion” and makes you susceptible to lawsuits. No data officer should want to work under such conditions where he/she can’t fulfill his/her role properly, since he/she would be the one to risk his/her existence.

    We’re not the ones to decide that GDPR is working against the government in some points; we’re bound to follow the GDPR, though. Let the guys at the EU rethink this when they realize the conflict between GDPR and restorable proof of illegality if this is even such a huge problem at all.

  13. Janos SUTO repo owner

    Both of you have raised very valid points. For starters, I believe that (in general) an archive should be read-only or more precisely append-only, ie. you can’t alter or remove anything. There can be well established retention rules set based on your industry, legal obligations, etc. allowing you to remove an aged email from the archive to spare some disk space.

    I also believe that employers should make it clear to employees that they are allowed to use their company emails for business purposes, and all employees should abide (not a single personal email to a corporate address!), not to mention that all emails are archived. I’m aware that even in such situations there are always some dumb users receiving personal data to corporate emails.

    I’m not a lawyer (so it’s not a legal statement), but GDPR doesn’t say you cannot handle personal data. In fact, you can as long as you can explain why (on what ground) and how long. GDPR demands that you don’t store a personal data which is not necessary, and don’t store a personal data any longer than it’s necessary.

    So I’m not a big fan of removing all of a user’s (business) emails because he says so. @eXtremeSHOK is right when he says it makes corruption and covering tracks easier. In an ideal world the data officer can’t be bought or made corrupted by any chance. On the other hand I acknowledge the fact that it’s your archive, your emails. If you think it’s fitting to remove lots of emails on whatever ground, I can live with it. My personal opinion is that the moment you enable the ENABLE_DELETE flag, your archive is not compliant any more.

    Anyway, a few comments to your feature lists:

    • I won’t complicate it with an upload field. At least not for now
    • The delete-request@ address shouldn’t be a piler specific email address. Such request should go to the mail server
    • I don’t intend to mess with any circular dependencies. Piler should treat it as an ordinary email. In fact it shouldn’t be ever removed, just as you said
    • I don’t plan to add a retention fixing feature to the gui. The retention policy should be a well-thought-out, compliant rule set to consider both the company’s interest as well as the legal landscape, and not fixing it for fun. Anyway, it’s not impossible to fix the retention value for the given message by some sql statements on the piler database

    Hiding a message could be done. Another encryption layer is possible, though not sure if that should be the way. Anonymization sounds good, but I think it’s pointless: to do so you have to either modify the message or process/transform it, and filter or mask any sensitive stuff leaving the original email intact. The first defeats the purpose of an archive, the latter won’t make EU lawyers happy.

    Anyway, I’ll keep working on it, and try to come up with something still usable. You’ll have plenty of opportunities to shape this feature.

  14. Waldemar Hamm reporter

    In an ideal world the data officer can't be bought or made corrupted by any chance. - If it’s possible to prove that the data officer was the one to delete the mail (and not the server admin in control of the piler instance), then the data officer would be (heavily) responsible for the lawsuits coming to the company due to corruption or whatever; there are many ways to make 100% sure that the data officer knows and approves of the deletion, like requiring a confirmation by mail to the mail address of the data officer.
    If there is a data officer in this world who will risk his/her private life for the sake of his/her company, ok fine - why shouldn’t it be possible to influence the Piler hoster in the same way?

    Problem is: GDPR requires you to follow right of deletion. It’s not your employees fault if a private person decides to write personal stuff to a wrong mail and it’s not your IT admin's fault if that mail gets archived, because retention rules for this receiving mail address say so.

    It’s the fault of piler, though, if

    1. you can’t delete such mails or change their retention times because the mail was wrongly sent and should be retentioned for 6 and not 10 years
    2. piler says, that a piler instance which can delete mails is “not compliant”; it’s the other way round: no ability to delete mails that you were asked to is not compliant with the GDPR; this would be a direct contradiction to article 17 GDPR - there’s no clearer evidence of “non-compliance” than this.

    • Upload field would have eased the burden of article 17 related erasure requests that came in by paper and not by mail; this makes it harder to use Piler for such cases but not impossible - there’s just a point where a mail archiving solution may become useless, when low-hanging-fruit quality-of-life improvements keep being ignored
    • Sure, the delete-request@ address can be organized within the actual mail server
    • There can never be mails that you receive from outside of your organization that you can not delete. This is a direct violation of article 17. ALL incoming mails may be mails that you will have to delete one day, because you have almost zero control over what the content of mails entail that are being sent to you or your company from the outside. The only exception are mails that you have to archive by laws like tax laws. Article 17 stands over parts of Article 6 which you are quoting as In fact, you can as long as you can explain why (on what ground) and how long. The only exceptions are listed in Article 17 3
    • How do you set a retention rule for invoices@company.com (if invoices have to be archived for 10 years in a way), that when a dumb customer writes a mail to invoices@company.com about a general quote question (which should be archived for 6 years instead), will be treated properly? If I can only choose between letting it be as-is and delete it now, then I can’t do the proper thing in one go: Change the retention to 6 years instead of 10.
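The "change the retention to 6 years instead of 10" case from the last point could look roughly like this. It is a sketch only: the function names and the shape of the log entry are assumptions for illustration, and nothing here reflects how piler stores retention values.

```python
from datetime import datetime, timezone


def retention_deadline(arrived, years):
    """Return the date until which a mail that arrived at `arrived` is kept."""
    try:
        return arrived.replace(year=arrived.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return arrived.replace(year=arrived.year + years, day=28)


def change_retention(arrived, old_years, new_years, reason):
    """Build a log entry for a retention change; a reason is mandatory."""
    if not reason.strip():
        raise ValueError("a reason for the retention change is required")
    return {
        "old_deadline": retention_deadline(arrived, old_years),
        "new_deadline": retention_deadline(arrived, new_years),
        "reason": reason,
    }


# Example: a quote question that was sent to the invoices address by mistake.
entry = change_retention(
    datetime(2020, 3, 1, tzinfo=timezone.utc), 10, 6,
    "general quote question sent to invoices@ by mistake",
)
```

Keeping the old deadline and the reason in the entry matches the "old retention time + reason of change" idea mentioned earlier in the thread.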

    Hiding a message sadly isn’t enough in terms of GDPR; the definition of erasure is ruled as the minimum of complete anonymization of the data with no possibility of reconstruction. If anonymization is not possible in an automated way (which I think is the case for mail related stuff) then erasure means to really remove all of the data that could contain personal data completely.

    Don’t get me wrong; I’m not a fan of the GDPR at all. Due to having lawyers in my family and having completed multiple seminars on the DSGVO/GDPR I think I have a pretty good grasp on what Piler should accomplish here at the very minimum. The very minimum in my eyes would be (documented with reason, although without personal external data from the mail, which would violate GDPR again) free deletion of mails in Piler exclusively with approval of the data officer role (which doesn’t have to mean that the data officer is also the only one to recommend a specific deletion). Change of retention could be achieved by forwarding to the correct address + deletion; requests for erasure by paper could be documented away somewhere and mentioned in the delete reason; there would be also other, complex workflows to make everything work - it would just be a minimum.

    The bread&butter-problem here is that personal-data-free deletion is not possible yet and that deletion is regarded as “non-compliant”, which is a direct contradiction of Article 17 GDPR. Everything else would make Piler a more easy to use package, but not a requirement.

  15. Janos SUTO repo owner

    I’ve merged the current progress to the master branch. There’s a new role: data officer. After logging in, he’s redirected to the page showing all emails marked for deletion. Then he can either approve the request, and delete the email, or he can reject it with a short description why not. There’s a new table (called deleted) keeping track of the message id, requestor, reason, date of the request itself, and also date, user, and reason2 of the data officer’s decision.

    There’s a new config variable: NEED_TO_APPROVE_DELETE. If set, then a data officer must approve all deletion requests, otherwise an auditor can remove any email on his own.
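To make the interplay of the two settings easier to follow, here is a rough sketch of the decision logic as described in this thread. It is not taken from piler's source: the function name, the role strings, and the returned action labels are assumptions chosen for the example.

```python
def delete_action(role: str, enable_delete: bool, need_to_approve_delete: bool) -> str:
    """Return what a user with `role` may do about a message deletion.

    Hypothetical illustration of ENABLE_DELETE and NEED_TO_APPROVE_DELETE.
    """
    if not enable_delete:
        # Delete feature switched off: nobody can remove anything.
        return "denied"
    if role == "auditor":
        # With approval required, the auditor can only file a request.
        return "request" if need_to_approve_delete else "delete"
    if role == "data officer" and need_to_approve_delete:
        # The data officer approves or rejects pending requests.
        return "decide"
    # Regular users cannot promote a message for deletion.
    return "denied"
```

So with both settings on, an auditor's action only creates a pending request, and the actual removal depends on the data officer's decision.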

    For the moment there’s no attachment field for an external document, whatever, and there’s no bulk approval for the data officer.

    You mentioned a scenario where you have to remove all of a user’s emails. However, you also said: except if other laws prohibit it. I don’t think a data officer will review 200,000-300,000 emails and find the 5-6 exceptions. Anyway, not sure how to solve this yet.

    So I’d like you to test the current result by deploying the master branch, and please provide a feedback how it looks at the moment. Then we’ll discuss how to continue.

  16. TM

    Hi,

    first I’d like to add: @Waldemar Hamm is completely right: an option to delete messages is absolutely necessary. Otherwise the archive is not GDPR compliant.

    Bugs:

    I’ve enabled ENABLE_DELETE and NEED_TO_APPROVE_DELETE in config.php and did a short test. Things I’ve noticed:

    • as auditor, after selecting an e-mail for deletion, there is nothing shown in the UI for this mail. There should be a flag shown, just like the “P” flag for private
    • audit log shows an empty entry for deletion request: nothing in “actions” and “description” column. Just a reference link to the e-mail.
    • data officer’s reject popup window shows a “delete” button instead of a “reject” button
    • a deletion reject does not create an entry in audit log at all
    • after deletion of an e-mail by the data officer, the auditor still can click the reference link to the mail in the audit log and see the message

  17. Janos SUTO repo owner

    Thank you for your feedback. I’ve added the missing action descriptions, as well as a red “R” mark similar to the private flag.

    The reason the auditor can still see the message after the data officer has removed it is that the message is not removed from the system yet. It will become unavailable only when the purging utility runs the next time.
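The behaviour described here (approval only marks the message, and a later purge run removes it) can be illustrated with a toy model. This is a sketch under assumptions, not piler's purge utility: the `Archive` class and its method names are invented for the example.

```python
class Archive:
    """Toy archive: deletion is a mark that only takes effect on purge."""

    def __init__(self):
        self.messages = {}
        self.marked_for_deletion = set()

    def store(self, msg_id, body):
        self.messages[msg_id] = body

    def approve_deletion(self, msg_id):
        # The data officer's approval only marks the message.
        self.marked_for_deletion.add(msg_id)

    def read(self, msg_id):
        # Still readable until the purge has run.
        return self.messages.get(msg_id)

    def purge(self):
        """Physically remove every marked message; return how many were removed."""
        removed = 0
        for msg_id in self.marked_for_deletion:
            if self.messages.pop(msg_id, None) is not None:
                removed += 1
        self.marked_for_deletion.clear()
        return removed
```

Between `approve_deletion()` and the next `purge()`, `read()` still returns the message, which is the window in which the auditor's reference link keeps working.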
