GDPR

Next meeting
Next meeting of the people involved is scheduled on May 4th 2018 at 11:00 UTC in xmpp:xsf@muc.xmpp.org

Introduction
On May 25th, 2018 the new EU General Data Protection Regulation (GDPR) will be enforced. This may pose problems for the XSF and for the public federated XMPP network, as discussed by the XSF board: https://trello.com/c/t79C3Yds/307-gdpr-advice

This is a growing page, collecting the research done on the consequences of the GDPR for XMPP and the XSF. There are three areas where the GDPR will probably have an impact and that will be of concern to the XSF:
 * 1) The public (federating) XMPP network
 * 2) The XSF-run XMPP server
 * 3) The functioning of the XSF, like the membership applications and the voting

Methodology
Collaboration with the IETF was mentioned during the previous board meeting that started this ad-hoc group. Who is working on it there, and (how) should we collaborate with them?

We decided to roughly follow the lines of the Data Protection Impact Assessment (DPIA) as mandated by the GDPR:
 * 1) Check if the GDPR is applicable (jurisdiction)
 * 2) List what data is processed
 * 3) List what processing is done
 * 4) List legal grounds for the processing
 * 5) Analyse possible consequences

Wrap up
A structured wrap-up in a table is available here: https://wiki.xmpp.org/web/GDPR/Table

Q1: What consequences does the GDPR have for the XMPP network and XMPP server operators, and what can/should the XSF do about that?
General note: The legal entity (person, company) that is responsible for an XMPP server is a 'controller' of data. A hosting service (or any other service it uses) is a 'processor'.

When federating, data is transferred from one controller to another.

Q1.1a Check if the GDPR is applicable (jurisdiction)
The GDPR is applicable to anyone offering services from the EU, to anyone offering services (paid or unpaid) to people in the EU, and to anyone explicitly targeting EU inhabitants.

Q1.1b List what data is processed
C2S:

 * Credentials
 * User metadata:
   * IP address
   * presence, timestamp of last available presence
 * User content:
   * roster content (with names)
   * bookmarks
   * offline/MAM history
   * server-side file storage (http-upload, etc.)
   * PEP
 * Server logs

S2S:

 * s2s meta-data (IPs, hostnames, sessions, server logs?) - GDPR probably doesn't apply
 * user meta-data (presence, subscriptions, message routing)
 * user content (messages, pubsub, etc.)
 * MUC history, MUC MAM
 * Remote components (e.g., roster management)

Q1.1c List what processing is done
Note: Storage is considered as processing under art. 4.2.

C2S:
Common use cases:
 * Credentials:
   * minimal: stored as long as the account exists
   * typical: check user JID against well-known spammer patterns
 * User metadata:
   * minimal: stored during connection
   * typical: stored with account, spam detection, exposed to other users (presence, last activity)
 * User content:
   * minimal: roster/bookmarks with account, PEP in RAM only, offline messages until first client connects
   * typical: with account, MAM/files for a given amount of time
 * Server logs:
   * minimal: no logs
   * typical: some days / weeks of logrotate, maybe with IP addresses / message metadata (spam detection)
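
The "typical" cases above keep MAM archives, uploaded files and logs only for a limited time. A minimal sketch of such a purge in Python follows. The category labels and the retention periods are hypothetical values chosen for illustration; they are not taken from the GDPR or from any XMPP server implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class StoredItem:
    category: str      # hypothetical label: "roster", "mam", "upload" or "log"
    created: datetime  # when the item was stored


# Hypothetical retention periods. None means "kept as long as the account exists".
RETENTION = {
    "roster": None,
    "mam": timedelta(days=30),
    "upload": timedelta(days=30),
    "log": timedelta(days=14),
}


def is_expired(item: StoredItem, now: datetime) -> bool:
    """Return True if the item has outlived the retention period of its category."""
    # An unknown category raises KeyError on purpose, so that new kinds of
    # data cannot be stored without a retention decision.
    period = RETENTION[item.category]
    if period is None:
        return False
    return now - item.created > period


def purge(items: list[StoredItem], now: Optional[datetime] = None) -> list[StoredItem]:
    """Return only the items that may still be kept."""
    now = now or datetime.now(timezone.utc)
    return [item for item in items if not is_expired(item, now)]
```

An operator would run something like `purge()` periodically and document the chosen periods in the privacy policy.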

S2S:
Outbound:
 * s2s meta-data:
   * typically just inside of server logs; covered by recital 49
 * user meta-data:
   * minimal: handing over to the receiving user's server
   * typical: stored while the receiving user is online (to avoid having to send out probes for new resources).
 * user content:
   * minimal: handing over to the receiving user's server if online; storage of roster-related things with account.
   * typical: minimal + offline storage if offline, or even MAM for an undefined period of time for messages
 * MUC: is this different from plain s2s?
 * Remote roster management: the XEP already requires user consent

Inbound:
 * s2s meta-data:
   * typically just inside of server logs; covered by recital 49
 * user meta-data:
   * minimal: forwarded to the receiving user's connections
   * typical: stored while the receiving user is online (to avoid having to send out probes for new resources).
 * user content:
   * minimal: forwarded to the receiving user's connections if online; storage of roster-related things with account.
   * typical: minimal + offline storage if offline, or even MAM for an undefined period of time for messages
 * MUC: is this different from plain s2s?
 * Remote roster management: the XEP already requires user consent

Spam detection
Spam detection is not standardized, so it is hard to assess here. There are also many thin lines, like reading messages for manual tweaking of spam rules, using machine learning, and filtering on words that may indicate content covered by art. 9.1. Therefore we cannot give a general answer about spam detection, but we do need to give some guidance on it.

Q1.1d List legal grounds for the processing
C2S:
Art. 6.1b can be used as the legal ground for processing, so the permission is implicitly granted when signing up for the XMPP service. The EULA must then describe the data that is processed.

MAM and MUC MAM are covered by explicit consent (art. 6.1a).

S2S:

 * s2s meta-data:
   * typically just inside of server logs; recital 49 probably applies
 * user meta-data:
   * all transfer requires (implicit) user consent, given by joining a MUC, sending a message to somebody or accepting a subscription (art. 6.1b); outside the EU by art. 49.1b

Treat all S2S as outside the EU

Q1.1e Analyse possible consequences
(Work in progress)

C2S:
Preliminary notes:
 * It could be argued that storing very sensitive personal information, albeit for a short time, unencrypted, visible to anyone with access to the backend server (and perhaps more), does not constitute a proportional data protection measure, knowing how sensitive the information can be in some cases. It could therefore also be argued that the processing “reveals” this information to unauthorized persons, by the way it is implemented. It could therefore be argued that such processing is contrary to what is required by article 9.
 * Even with consent, "proportional means of protection" is required, so encryption (i.e., full-disk) might be necessary to check that box. If user-sent content is subject to art. 9.1, then the "proportional" from "proportional means of protection" becomes harder.
 * Article 35?
 * Logs:
   * See recital 49: "The processing of personal data to the extent strictly necessary and proportionate"
   * Data should not be stored for more time than necessary.

S2S:
Preliminary notes:
 * I think what we *at the very minimum* learn from this given the technical means in the XMPP network is: you absolutely must not do any kind of data mining on message content which might come from federation.
 * What I'd like to know more about is whether we need some explicit legal framework for handing off data, or if this is covered by the user's implicit consent of wanting the message delivered.
 * I wonder if we want a way to give consent to the processing done by an s2s domain. Then there could be something pubsubby where clients can query which s2s domains the user has consented to and show that in the UI, warn the user when sending a message to a non-consented domain with "review the privacy policy", and offer doing the in-band consent thing as per the EULA XEP.
 * I’d like to have a status code for [MAM MUC logging] because that could save us from 9.1 trouble (there’s something about "manifestly made public" in there, and if we can get clients to show "THIS ROOM IS PUBLICLY LOGGED", we’re out of trouble there I think), 170 or similar
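
The per-domain consent idea from the notes above could look roughly like this on the client side. This is only a sketch: no XEP defines such a consent list today, and the function names and the `consented` set (assumed to hold lower-case domain names the user has already agreed to) are hypothetical.

```python
def domain_of(jid: str) -> str:
    """Extract the domain part of a JID of the form localpart@domain/resource."""
    bare = jid.split("/", 1)[0]             # drop the resource, if any
    return bare.split("@", 1)[-1].lower()   # drop the localpart, if any


def needs_consent_warning(recipient: str, own_domain: str, consented: set[str]) -> bool:
    """Return True if the client should show "review the privacy policy" first."""
    domain = domain_of(recipient)
    # Stanzas that stay on the user's own server are covered by its own EULA.
    if domain == own_domain.lower():
        return False
    return domain not in consented
```

For example, `needs_consent_warning("romeo@example.net", "example.com", {"example.org"})` is true, so the client would warn before sending.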

Q1.2: What consequences does the GDPR have for XMPP server operators?
TBD

Q1.3: What can/should the XSF do with it?
TBD

Q2: What consequences does the GDPR have for the XSF-run XMPP server?
TBD

Q3: What consequences does the GDPR have for the work processes of the XSF itself (membership, voting, wiki etc.)?
Personal data the XSF holds:
 * Email of wiki users (for account creation)
 * Voting results, that could be considered as "political opinions".

The rest of the information given when applying for membership (full name, JID/email, employer, etc.), like everything else on the wiki as well as messages in public MUCs, falls under art. 9.2 e): "Processing relates to personal data which are manifestly made public by the data subject".

ToDo's

 * 1) Link with IETF and other projects with similar issues.
 * 2) Read chapter 5 about transfer of personal data to countries outside the EU
 * 3) Create document with guidelines for server operators containing:
   * Limiting of processing because of limits of art. 6.1b and 49.1b
   * The risk of 'triggering' art. 9.1 and the consequences of that
   * Inform operators that the EULA shown during registration should contain something like: "messages sent to other users are subject to policies those users agreed to" (TODO: probably s/messages/stanzas, and find end-user speak for "stanza")
   * Limits in logging
 * 4) Create template for EULA, which:
   * should inform users about possible different privacy policies at different servers
   * should inform users about different (MAM) settings by different users
   * should inform users about publishing MUC logs
   * should include information about S2S and possible jurisdiction changes

Technical ToDo

 * 1) EULA XEP for c2s
   * Linking to EULA, also with IBR
   * versioning
   * to be determined: informing about EULA details
 * 2) MUC status code: 170 when MAM enabled. Also something to say if the logs are public or private.
 * 3) Write about default visibility in data policy
   * JID: contacts, chatrooms and their server operators
   * vcard avatar: always visible
   * PEP avatar and other PEP things: most likely to your contacts
   * PEP items visibility should be made explicit by the client to the user
   * last online timestamp, status message, online status, list of online devices: contacts, chatroom participants?
 * 4) Add to MAM-XEP:
   * Add a note to the MAM XEP about GDPR consent requirements.
   * MAM XEP doesn't provide a way to differentiate between "explicitly set" and "enabled by default"
   * Will MAM auto-purge if you disable? Get a mention of this in the XEP
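
For the MUC status code item above, a client could detect the code and show a "THIS ROOM IS PUBLICLY LOGGED" style warning. As far as we can tell, XEP-0045 already defines status code 170 for "room logging is enabled"; whether a MUC service would also send it when MUC MAM is enabled is exactly the open question, so the following Python sketch just assumes it does. The stanza in `EXAMPLE` is made up for illustration.

```python
import xml.etree.ElementTree as ET

MUC_USER_NS = "http://jabber.org/protocol/muc#user"
LOGGING_ENABLED = "170"  # assumed to be sent for MUC MAM as well

# Made-up self-presence as it might be received when joining a logged room.
EXAMPLE = """
<presence from='room@muc.example.org/nick' to='user@example.com/client'>
  <x xmlns='http://jabber.org/protocol/muc#user'>
    <item affiliation='member' role='participant'/>
    <status code='110'/>
    <status code='170'/>
  </x>
</presence>
"""


def room_is_logged(stanza_xml: str) -> bool:
    """Return True if the stanza carries MUC status code 170."""
    stanza = ET.fromstring(stanza_xml)
    # Only <status/> elements in the muc#user namespace count.
    for status in stanza.iter(f"{{{MUC_USER_NS}}}status"):
        if status.get("code") == LOGGING_ENABLED:
            return True
    return False


if room_is_logged(EXAMPLE):
    print("Warning: this room is logged")
```

A separate code (or an extension of this one) would still be needed to say whether the logs are public or private.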

LQ1 user-sent content and art. 9.1
Does 9.1 automatically apply to all (not e2e encrypted) user-sent content, or only if we are analyzing it for profiling/other purposes? Does using e2e encryption change this?
 * 1) Lawyer 1: Message content is similar to picture uploads. As long as we treat it as an opaque blob and don't analyse it, art. 9 doesn't apply (see recital 51). Not sure how this plays with mod_firewall processing, spam filtering etc.
 * 2) Lawyer 2: 9.1 is not applicable because it is revealed by the user (9.2e).
So user content is NOT subject to art. 9.1

LQ2 transfer to other controller and art. 6.1b / 6.1f
Can (implicit) consent as in art. 6.1b also apply to transfer to other controllers (as in other XMPP server operators)?
 * The transfer to the other controller itself can be covered by 6.1b (and 49.1b when transferring outside the EU), because it is necessary to deliver the service the user requested.
 * The processing on the other server can also be covered by 6.1b, but only as long as no further processing is done.

Contributors:
Ge0rG, jonasw, pep., peter.waher & winfried