Anonymisation: How Anonymous Is Anonymous?


Panel

• Lindsey Greig, CEO, DataGuidance

• William Long, Partner, Sidley Austin

• Uwe W. Fiedler, Global Privacy Officer & VP, PAREXEL

• Mark Elliot, Senior Lecturer, University of Manchester; Director, UK Anonymisation Network


BEIJING BOSTON BRUSSELS CHICAGO DALLAS FRANKFURT GENEVA HONG KONG HOUSTON LONDON LOS ANGELES NEW YORK PALO ALTO SAN FRANCISCO SHANGHAI SINGAPORE SYDNEY TOKYO WASHINGTON, D.C.

Anonymisation: How Anonymous is Anonymous?
The EU Legal Position
1 May 2014
William Long


Anonymisation under the EU Data Protection Directive

• What is anonymous data?

  – The Data Protection Directive does not define anonymous data

  – The Article 29 Working Party (in Opinion 4/2007 on the Concept of Personal Data) defines anonymous data as “any information related to a natural person where the person cannot be identified, whether by the data controller or by any other person, taking account of all the means likely reasonably to be used either by the controller or by any other person to identify that individual”

  – EU Data Protection laws do not apply to anonymous data:
    • Recital 26 of the Data Protection Directive: “...whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable...”


Article 29 WP Opinion on Anonymisation

• Opinion 5/2014 on Anonymisation Techniques adopted on 10 April 2014

• Anonymisation is a technique applied to personal data in order to achieve irreversible de-identification

• Anonymisation is a form of processing, and a legal basis for such processing is legitimate interest if the compatibility test is performed

• Removing directly identifying elements is not enough; additional measures to prevent identification should be taken

• There is no prescriptive standard of anonymisation, but there are two approaches: (i) Randomisation; and (ii) Generalisation

• Pseudonymisation reduces the linkability of a dataset with the original identity of a data subject, but it is NOT a method of anonymisation

• Good anonymisation practices – identify new risks, assess whether controls are sufficient, and monitor the risks of identification
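
As a rough illustration of the two approaches named on this slide, the sketch below generalises an exact age into a band and randomises a numeric value by adding noise. The function names, the band width, the noise scale and the sample record are assumptions made for illustration; they are not taken from Opinion 5/2014, and neither step on its own makes a dataset anonymous.

```python
import random


def generalise_age(age, band_width=10):
    """Generalisation: replace an exact age with a coarser band, e.g. 37 -> '30-39'."""
    lower = (age // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


def randomise_value(value, scale, rng):
    """Randomisation: perturb a numeric value with zero-mean Gaussian noise."""
    return value + rng.gauss(0, scale)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
record = {"age": 37, "weight_kg": 82.0}  # hypothetical record

released = {
    "age_band": generalise_age(record["age"]),
    "weight_kg": round(randomise_value(record["weight_kg"], scale=2.0, rng=rng), 1),
}
```

Wider bands and larger noise lower the identification risk but also reduce the analytical value of the data, which is why the slide pairs these techniques with ongoing risk monitoring.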


How to Anonymise Personal Data?

• National Guidance

  – UK: The “Motivated Intruder Test” introduced by the UK ICO Code of Practice “Anonymisation: managing data protection risk”:
    • The “motivated intruder” is taken to be a person who starts without any prior knowledge but who wishes to identify the individual from whose personal data the anonymised data has been derived. The test assesses whether the motivated intruder would be successful

  – France: CNIL, the French Data Protection Authority, in its 2010 guidance on security of personal data, includes a section on anonymisation which:
    • Outlines basic measures for anonymising data, including generating a “secret” code of the appropriate length and complexity, applying a “one-way” function to the data, and setting up organisational measures in order to guarantee the confidentiality of the “secret” code if it needs to be preserved
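
The CNIL measures described above (a secret code plus a one-way function) could be sketched as a keyed hash. The choice of HMAC-SHA256, the 32-byte secret length and the identifier format are assumptions for illustration only; the slide does not say which function or length CNIL has in mind.

```python
import hashlib
import hmac
import secrets


def generate_secret(num_bytes=32):
    """Generate a random secret code; 32 bytes is an assumed length."""
    return secrets.token_bytes(num_bytes)


def one_way_code(identifier, secret):
    """Apply a keyed one-way function (HMAC-SHA256) to an identifier."""
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


secret = generate_secret()
code_a = one_way_code("patient-0001", secret)  # hypothetical identifiers
code_b = one_way_code("patient-0001", secret)
code_c = one_way_code("patient-0002", secret)
```

The same identifier always maps to the same code under one secret, so records can still be linked. For as long as the secret is preserved, anyone holding it can recompute the mapping, which is why the organisational measures around the secret matter.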


How to Anonymise Personal Data

• International Guidance: mainly for sensitive/health data:

  – ISO Technical Specification for Pseudonymisation (Personal Health Information):
    • Contains principles and requirements for privacy protection using pseudonymisation services and defines a basic methodology for pseudonymisation services, including organizational and technical aspects
    • Specifies a policy framework and minimal requirements for trustworthy practices for the operations of a pseudonymisation service

  – US Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule Guidance:
    • Informs about the methods to achieve de-identification in accordance with the Privacy Rule. The guidance refers to two methods that can be used to satisfy the Privacy Rule’s de-identification standard: “Expert Determination” and “Safe Harbour”, the latter of which outlines 18 identifiers that should be removed


Is Key-coded Data Personal Data?

Example countries – is key-coded clinical trial data considered personal data?

• Italy: In the context of clinical trial guidelines, the Italian Data Protection Authority considers that encoding mechanisms deployed by sponsor companies are not sufficient to anonymise the data to be processed

• Spain: In the context of clinical trial guidelines, the Spanish Data Protection Authority considers that the Data Protection Act applies to clinical trial key-coded data if there is a possibility of re-identification

• UK: ICO guidance considers that where an organisation holds records which it cannot link, nor is ever likely to be able to link, to particular individuals, the records it holds will not be personal data. This will only be the case where it is unlikely that anyone else to whom the records may be released will be able to make such links


Anonymisation and Pseudonymisation – proposed EU Data Protection Regulation

• Recital 23 – data protection principles do not apply to anonymous data. “Reasonably likely” test is retained but account should be taken of all objective factors, such as costs of and the amount of time required for identification taking into account available technology and technological developments

• Pseudonymisation – proposed Regulation introduces concept of pseudonymisation but fails to introduce a clear framework

• Recital 58a – profiling based on pseudonymous data should be presumed not to significantly affect data subjects

• Recital 122a – professionals who process personal data concerning health should receive, if possible, anonymised or pseudonymised data

• Article 4(2a) – ‘pseudonymous data’ means personal data that cannot be attributed to a specific data subject without the use of additional information, as long as such information is kept separately and subject to technical and organisational measures to ensure non-attribution


Pseudonymisation – proposed EU Data Protection Regulation

• Article 23 – data controller and processor required to implement privacy by design which can include pseudonymisation

• Article 32 – no security breach notification required where technological protection measures render the data unintelligible to any person not authorised to access it

• Article 82a – Member States may provide exceptions from consent for statistical or scientific research if research serves a high public interest and data is anonymised or if not possible pseudonymised under the highest technical standards and measures are taken to prevent re-identification

• Article 83 – personal data can only be processed for scientific research where data enabling attribution to an identifiable data subject is kept separate from other information under the highest technical standards and measures are taken to prevent re-identification

• Article 79 – fines of up to 5% of annual worldwide turnover shall take into account factors including technical and organisational measures


Comments / Questions

Sidley Austin LLP, a Delaware limited liability partnership which operates at the firm’s offices other than Chicago, London, Hong Kong, Singapore and Sydney, is affiliated with other partnerships, including Sidley Austin LLP, an Illinois limited liability partnership (Chicago); Sidley Austin LLP, a separate Delaware limited liability partnership (London); Sidley Austin LLP, a separate Delaware limited liability partnership (Singapore); Sidley Austin, a New York general partnership (Hong Kong); Sidley Austin, a Delaware general partnership of registered foreign lawyers restricted to practicing foreign law (Sydney); and Sidley Austin Nishikawa Foreign Law Joint Enterprise (Tokyo). The affiliated partnerships are referred to herein collectively as Sidley Austin, Sidley, or the firm.

For purposes of compliance with New York State Bar rules, Sidley Austin LLP’s headquarters are 787 Seventh Avenue, New York, NY 10019, 212.839.5300 and One South Dearborn, Chicago, IL 60603, 312.853.7000.


©2014 PAREXEL INTERNATIONAL CORP. ALL RIGHTS RESERVED.

ANONYMISATION: HOW ANONYMOUS IS ANONYMOUS?
EFFECT ON PHARMACEUTICAL INDUSTRY
01 May 2014
Uwe W Fiedler, Global Privacy Officer & VP DPP

DISCLAIMER: The views expressed in the workshop slides are purely those of the presenter and may not in any circumstances be regarded as providing legal advice or stating an official position of PAREXEL International Corporation.


AGENDA

• Pseudonymized data in Medical Research – Ethical Background

• Pseudonymized data in Medical Research – Data Flow

• Pseudonymized data in Medical Research – What are pseudonymized data

• 1st Summary

• Pseudonymized data in Medical Research – What are anonymized data

• 2nd Summary


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – ETHICAL BACKGROUND

• Data Protection & Privacy: protection of personal data as a human right

• Pharmacovigilance: safety of the patients

• Clinical Development: new and better drugs for patients

• Declaration of Helsinki as a statement of ethical principles for medical research involving human subjects


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – ETHICAL BACKGROUND

• 1964 - World Medical Association (WMA) developed the Declaration of Helsinki - Ethical Principles for Medical Research Involving Human Subjects (1)

9. It is the duty of physicians who are involved in medical research to protect the life, health, dignity, integrity, right to self-determination, privacy, and confidentiality of personal information of research subjects. The responsibility for the protection of research subjects must always rest with the physician or other health care professionals and never with the research subjects, even though they have given consent.

24. Every precaution must be taken to protect the privacy of research subjects and the confidentiality of their personal information.

• Medical research carries the risk of adverse reactions. Therefore the research subject must remain re-identifiable at each processing step, while the confidentiality of their personal data is protected

• This is only possible via the use of pseudonymized data

(1) 64th WMA General Assembly, Fortaleza, Brazil, October 2013 - http://www.wma.net/en/30publications/10policies/b3/


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – DATA FLOW IN MEDICAL RESEARCH


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• The German Federal Drug Law (AMG) requires the use of pseudonymized data:

§40 (2a) …The person concerned shall be informed especially of the fact that:

1. where necessary, the recorded data:

a) will be kept available for inspection by the supervisory authority or the sponsor's representative in order to verify the proper conduct of the clinical trial (means access to directly identifiable sensitive personal data),

b) will be passed on in a pseudonymized version to the sponsor or to an agency commissioned by the latter for the purpose of scientific evaluation,

• The German Federal Data Protection Act defines pseudonymized data as follows:

§3 (6a) "Aliasing (pseudonymization / key-coding)" means replacing a person's name and other identifying characteristics with a label, in order to preclude identification of the data subject or to render such identification substantially difficult.


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• As an example, the Berlin Ethics Commission recognizes research subject data as pseudonymized only if the data contain:
  • no initials (neither of the first name nor of the surname), and
  • where necessary to permit age verification, month and year of birth (01.2001) but no full date of birth (like 01.01.2001)

• The current Data Protection Directive does not define pseudonymized data, but in 2007 the Article 29 WP stated in its WP 136:

“…The pharmaceutical company has construed the means for the processing, included the organizational measures and its relations with the researcher who holds the key in such a way that the identification of individuals is not only something that may happen, but rather as something that must happen under certain circumstances. The identification of patients is thus embedded in the purposes and the means of the processing.

… In this case, one can conclude that such key-coded data constitutes information relating to identifiable natural persons for all parties that might be involved in the possible identification and should be subject to the rules of data protection legislation. …”
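
The Berlin criteria listed earlier on this slide (no initials, month and year instead of a full date of birth) can be sketched as a simple record transformation. The field names and the sample record are hypothetical and only serve to illustrate the rule.

```python
from datetime import date


def apply_berlin_rule(record):
    """Drop initials and reduce the date of birth to month and year (MM.YYYY)."""
    dob = record["date_of_birth"]
    reduced = {
        key: value
        for key, value in record.items()
        if key not in ("initials", "date_of_birth")
    }
    reduced["birth_month_year"] = f"{dob.month:02d}.{dob.year}"
    return reduced


# Hypothetical research subject record
subject = {
    "subject_id": "S-017",
    "initials": "JD",
    "date_of_birth": date(2001, 1, 1),
    "sex": "F",
}
reduced = apply_berlin_rule(subject)
```

As the WP 136 passage quoted above indicates, data reduced in this way would still be treated as personal data wherever re-identification through the key is built into the processing.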


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• ISO/TS 25237:2008 - Health informatics — Pseudonymization

• Introduces the concept that, with regard to personal data, it is possible to differentiate between the identifying part and the content part of a data set:

  – payload data is the part of the Personal Data that contains characteristics that do not allow unique identification of the data subject. Therefore the payload data themselves would only contain anonymous data

  – identifying data is the part of the Personal Data that contains a set of characteristics that allow unique identification of the data subject

• Therefore defines pseudonymization as a particular type of anonymization that both removes the association with a data subject and adds an association between a particular set of characteristics relating to the data subject and one or more pseudonyms

• Note that the conceptual distinction between “identifying data” and “payload data” can lead to contradictions. This is the case when, for example, directly identifying data are considered “payload data”
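
The identifying/payload split described on this slide can be sketched as follows. The set of fields treated as identifying, the pseudonym format and the in-memory key table are assumptions for illustration; they are not drawn from ISO/TS 25237.

```python
import secrets

# Assumed classification of fields; a real study would define this per protocol.
IDENTIFYING_FIELDS = {"name", "address", "date_of_birth"}


def pseudonymise(record, key_table):
    """Split a record into identifying and payload parts linked by a random pseudonym."""
    pseudonym = secrets.token_hex(8)
    identifying = {k: v for k, v in record.items() if k in IDENTIFYING_FIELDS}
    payload = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    key_table[pseudonym] = identifying  # held separately by the key holder
    return {"pseudonym": pseudonym, **payload}


key_table = {}
record = {  # hypothetical record
    "name": "Jane Doe",
    "address": "1 Example Street",
    "date_of_birth": "1980-05-17",
    "diagnosis": "asthma",
    "dose_mg": 50,
}
released = pseudonymise(record, key_table)
```

The contradiction noted in the last bullet shows up here as a misclassification: if a directly identifying field is left out of `IDENTIFYING_FIELDS`, it travels with the payload and the released record is no longer pseudonymous.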


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• In its proposal for an EU Data Protection Regulation, the European Commission introduced pseudonymized data only indirectly, in Article 83:

(b) data enabling the attribution of information to an identified or identifiable data subject is kept separately from the other information as long as these purposes can be fulfilled in this manner.

• The European Parliament legislative resolution of 12 March 2014 defines pseudonymized data as follows:

Article 4 (2a) 'pseudonymous data' means personal data that cannot be attributed to a specific data subject without the use of additional information, as long as such additional information is kept separately and subject to technical and organisational measures to ensure non-attribution;

• Unfortunately the proposal of the European Parliament did not introduce a clear framework for the use of pseudonymized data


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• In its proposal for an EU Data Protection Regulation, the European Council has so far introduced at least a small framework for the use of pseudonymized data:
  • Article 23 recognizes pseudonymization as data protection by design and by default
  • Article 30 recognizes pseudonymization as an appropriate technical and organizational measure
  • Article 32 exempts pseudonymized data from security breach notification to the data subject
  • Article 38 emphasizes the implementation of codes of conduct for the use of pseudonymized data

• During the Greek presidency, the Council updated Article 26 to also allow Processors to demonstrate sufficient guarantees for their implementation of appropriate technical and organizational measures through adherence to codes of conduct

• These ideas are underlined by their recently published “Handbook on European Data Protection Law”


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE PSEUDONYMIZED DATA?

• Handbook on European Data Protection Law, March 2014:

2.1. Personal data

… In contrast to anonymised data, pseudonymised data are personal data. …

2.1.3. Anonymised and pseudonymised data

… As pseudonymisation of data is one of the most important means of achieving data protection on a large scale, where it is not possible to entirely refrain from using personal data, the logic and the effect of such action must be explained in more detail. … Personal data with encrypted identifiers are used in many contexts as a means to keep secret the identity of persons. This is particularly useful where data controllers need to ensure that they are dealing with the same data subjects but do not require, or ought not to have, the data subjects’ real identities. This is the case, for example, where a researcher studies the course of a disease with patients, whose identity is known only to the hospital where they are treated and from which the researcher obtains the pseudonymised case histories. Pseudonymisation is therefore a strong link in the armoury of privacy-enhancing technology. It can function as an important element when implementing privacy by design. This means having data protection built into the fabric of advanced data-processing systems.


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – 1ST SUMMARY

• In the absence of harmonized local Data Protection Acts, it is still unclear to what extent pseudonymized research subject data are personal data

• If they are personal data, it is also unclear to what extent they may contain more identifying data elements such as initials and/or full DOB, e.g. for safety reasons, to solve the “pregnant men” problem

• The European Council recently published its Handbook on European Data Protection Law, which strongly suggests increasing the use of pseudonymized data

• Besides the hope for an improved Data Protection Regulation, another potential solution to this problem could therefore be the implementation of a Pharmaceutical Code of Conduct that would harmonize the privacy requirements for medical research within the EU


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE ANONYMIZED DATA?

• Unfortunately, the current data protection framework within the EU is even less clear on the definition of anonymized data

• Does the ICO Anonymisation Code of Practice help?
  • Case study 1: limited access to pharmaceutical data

In a clinical study, only key-coded data is reported by clinical investigators (healthcare professionals) to the pharmaceutical companies sponsoring the research. No personal data is disclosed. The decryption keys are held at study sites by the clinical investigators, who are prohibited under obligations of good clinical practice and professional confidentiality from revealing research subject identities. The sponsors of the research may share the key-coded data with affiliates overseas, scientific collaborators, and health regulatory authorities around the world. In all cases, however, recipients of the data are bound by obligations of confidentiality and restrictions on re-use and re-identification, whether imposed by contract or required by law. Given these safeguards, the risk of re-identification of the key-coded data disclosed by a pharmaceutical sponsor to a third party under such obligations is extremely low.

• You may remember that WP 136 and the Council handbook defined pseudonymized data of research subjects as personal data!


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE ANONYMIZED DATA?

• Let's look at the other side of the Atlantic

• Since 1996, the US Federal Health Insurance Portability and Accountability Act (HIPAA) has differentiated between:
  • Protected Health Information (patient name + medical records = sensitive personal data)
  • Limited Data Set (a form of pseudonymized data still covered by HIPAA)
  • De-identified Data (a form of anonymized data no longer covered by HIPAA)

• Even though the majority of medical research activities of pharmaceutical companies are not covered by HIPAA, this classification framework is also used there as the de facto standard for de-identification of data

• The reason for this is the clear definition of the data elements that (1) must be stripped from Protected Health Information to obtain a Limited Data Set and (2) must be stripped from the Limited Data Set to obtain De-identified Data


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE ANONYMIZED DATA?

1. A Limited Data Set (LDS) lacks 16 of the 18 identifiers itemized by the HIPAA Privacy Rule. Specifically, an LDS does NOT include the following identifiers:

An LDS may contain, for example:


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE ANONYMIZED DATA?

2. Two methods to achieve de-identification in accordance with the HIPAA Privacy Rule (§ 164.514)

http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/guidance.html


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – WHAT ARE ANONYMIZED DATA?

• (b) Implementation specifications: requirements for de-identification of protected health information. A covered entity may determine that health information is not individually identifiable health information only if:

• (1) A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:

• (i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and

• (ii) Documents the methods and results of the analysis that justify such determination;

OR


PSEUDONYMIZED DATA IN MEDICAL RESEARCH – 2ND SUMMARY

• The definition of anonymized data is also unclear with regard to medical research activities within the EU

• The pragmatic approach taken by HIPAA may be a solution

• But even research subject data de-identified in accordance with HIPAA could hardly be defined as anonymized data under the existing and future EU Data Protection Framework

• But why not implement a pharmaceutical Code of Conduct that would also define the content and purpose for the use of de-identified research subject data?

• If such Code of Conduct could regulate the use of pseudonymized research subject data, then it should be possible to also regulate the use of even less identifiable data – the de-identified research subject data


THANK YOU


ANONYMISATION: A VIEW FROM UKAN

Mark Elliot, University of Manchester


Outline

• What is anonymisation?

• Anonymisation as Statistical Disclosure Control

• How might disclosure happen?

• How to avoid disclosure


Anonymisation: Types

Three types/senses:

• Formal Anonymisation
  – Includes Pseudonymisation
  – Necessary but not sufficient

• Functional Anonymisation
  – Risk Management
  – Disclosure control

• Absolute Anonymisation
  – Not possible for any useful data


Anonymisation: Environment

Is Environment Dependent:

• Governance

• Users

• Security

• Other Data

Deciding whether data is anonymised (and therefore non-personal) without considering context is like measuring the sound of one hand clapping


Anonymisation: Attacks!

Scenarios of attack

To work out whether some data is anonymised, first consider how to attack it. Then work out the risk of those attacks succeeding


Which brings us to the SDC framework….


Statistical disclosure is itself an active research area…

Sub fields:

• Disclosure risk assessment

• Disclosure control methodology

• Measurement of analytical validity

• Data Environment Analysis

All data types, typically:

• Microdata and Aggregate data

• Business and Personal data

• Intentional and Consequential data

How might a disclosure happen?

• Imagine you are a “data intruder”

• What would you need to do in order to identify information about individuals within the data?

• What might be your motivations?

Scenario Framework

• Motivation
• Means
• Opportunity
• Auxiliary (Key/Matching Variables)
• Target Information
• Attack Type
• Effect of Data Divergence
• Goals achievable by other means?
• Consequences of attempt
• Likelihood of attempt
• Likelihood of success

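The Disclosure Risk Problem: Type I: Identification

[Diagram: an identification file (Name, Address, Sex, Age, ..) is matched against a target file (Income, .., Sex, Age, ..). Name and Address are the ID variables; Sex and Age, which appear in both files, are the key variables; Income and the other fields found only in the target file are the target variables.]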
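The identification problem on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenter's method: the records, the field names and the helper `link_unique_matches` are all hypothetical, and the example assumes the intruder holds an identification file that shares the key variables (sex, age) with the target file.

```python
from collections import Counter

# Hypothetical identification file: ID variables (name, address) plus key variables.
identification_file = [
    {"name": "Person 1", "address": "Address 1", "sex": "F", "age": 34},
    {"name": "Person 2", "address": "Address 2", "sex": "M", "age": 34},
    {"name": "Person 3", "address": "Address 3", "sex": "M", "age": 51},
    {"name": "Person 4", "address": "Address 4", "sex": "M", "age": 51},
]

# Hypothetical target file: no ID variables, the same key variables, plus a target variable.
target_file = [
    {"sex": "F", "age": 34, "income": "High"},
    {"sex": "M", "age": 34, "income": "Low"},
    {"sex": "M", "age": 51, "income": "Medium"},
    {"sex": "M", "age": 51, "income": "Low"},
]

KEY_VARIABLES = ("sex", "age")


def key_of(record):
    """Return the combination of key-variable values for one record."""
    return tuple(record[k] for k in KEY_VARIABLES)


def link_unique_matches(id_file, tgt_file):
    """Pair a name with a target value wherever a key combination is unique in both files."""
    id_counts = Counter(key_of(r) for r in id_file)
    tgt_counts = Counter(key_of(r) for r in tgt_file)
    id_by_key = {key_of(r): r for r in id_file}
    matches = []
    for record in tgt_file:
        key = key_of(record)
        if id_counts[key] == 1 and tgt_counts[key] == 1:
            matches.append((id_by_key[key]["name"], record["income"]))
    return matches


matches = link_unique_matches(identification_file, target_file)
print(matches)
```

In this toy data only the two records with a unique sex/age combination are linked; the two 51-year-old men cannot be told apart. A unique match in the files is not proof of a correct match in the population, which is one reason the risk of an attack succeeding has to be assessed rather than assumed.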

The Disclosure Risk Problem: Type II: Attribution

Income levels for two occupations

             High  Medium  Low  Total
Professors      0     100   50    150
Pop stars     100      50    5    155
Total         100     150   55    305
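One way to read the attribution table is that a zero cell tells an intruder something about every member of a group without identifying anyone: here, no professor has a high income. The following minimal sketch checks a table for such cells; the function name `attribution_disclosures` is hypothetical, and the counts are those on the slide.

```python
# Counts from the slide: income levels for two occupations.
income_table = {
    "Professors": {"High": 0, "Medium": 100, "Low": 50},
    "Pop stars": {"High": 100, "Medium": 50, "Low": 5},
}


def attribution_disclosures(table):
    """Return (group, category) pairs that a zero cell rules out for a non-empty group."""
    findings = []
    for group, row in table.items():
        if sum(row.values()) == 0:
            continue  # an empty group discloses nothing about anyone
        for category, count in row.items():
            if count == 0:
                findings.append((group, category))
    return findings


ruled_out = attribution_disclosures(income_table)
for group, category in ruled_out:
    print(f"Nobody in '{group}' falls in the '{category}' category")
```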

The Disclosure Risk Problem: Type III: Subtraction

Income levels for two occupations

             High  Medium  Low  Total
Professors      1     100   50    151
Pop stars     100      50    5    155
Total         101     150   55    306

The Disclosure Risk Problem: Type III: After subtraction

Income levels for two occupations

             High  Medium  Low  Total
Professors      0     100   50    150
Pop stars     100      50    5    155
Total         100     150   55    305
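The two subtraction slides can be read as follows (an interpretation, since the slides give only the tables): an intruder who already knows one record in the published table, for example their own, removes it and is left with the attribution table, zero cell included. A minimal sketch under that assumption, with the hypothetical helper `subtract_known`:

```python
# Published counts from the "Type III: Subtraction" slide.
published = {
    "Professors": {"High": 1, "Medium": 100, "Low": 50},
    "Pop stars": {"High": 100, "Medium": 50, "Low": 5},
}


def subtract_known(table, known_records):
    """Return a copy of the table with each known (group, category) record removed."""
    reduced = {group: dict(row) for group, row in table.items()}
    for group, category in known_records:
        if reduced[group][category] == 0:
            raise ValueError(f"no record left to subtract in {group}/{category}")
        reduced[group][category] -= 1
    return reduced


# Assumption: the intruder knows the single high-income professor.
after = subtract_known(published, [("Professors", "High")])

# Zero cells in the reduced table now permit attribution about everyone else.
zero_cells = [(g, c) for g, row in after.items() for c, n in row.items() if n == 0]
print(after["Professors"], zero_cells)
```

The published table itself contains no zero, so a check of the table alone would miss the risk; it only appears once the intruder's own knowledge is taken into account.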

The Disclosure Risk Problem: Type IV: Table linkage

Var2 by Var1

        A    B
C       3    9
D       2    2

Var3 by Var1

        A    B
E       1   10
F       4    1

Var3 by Var2

        C    D
E       8    3
F       4    1

The Disclosure Risk Problem: Type IV: Table linkage

Var3 by Var1 and Var2

      A, C  A, D  B, C  B, D
E        0     1     8     2
F        3     1     1     0

Original cell counts can be recovered from the marginal tables
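The claim that the original cell counts can be recovered can be checked by brute force. The sketch below (an illustration, not necessarily how the presenter derived it) enumerates every non-negative integer three-way table that agrees with the three published two-way tables and shows that only one such table exists. The function name `consistent_tables` is hypothetical; the counts are those on the slides.

```python
from itertools import product

# The three published two-way tables (counts from the slides).
var1_var2 = {("A", "C"): 3, ("B", "C"): 9, ("A", "D"): 2, ("B", "D"): 2}
var1_var3 = {("A", "E"): 1, ("B", "E"): 10, ("A", "F"): 4, ("B", "F"): 1}
var2_var3 = {("C", "E"): 8, ("D", "E"): 3, ("C", "F"): 4, ("D", "F"): 1}

# The eight cells of the unpublished Var1 x Var2 x Var3 table.
cells = list(product("AB", "CD", "EF"))


def consistent_tables():
    """Enumerate all non-negative integer joint tables that match every published margin."""
    # A cell can never exceed the smallest marginal count it contributes to.
    upper = [
        min(var1_var2[(a, c)], var1_var3[(a, e)], var2_var3[(c, e)])
        for a, c, e in cells
    ]
    solutions = []
    for counts in product(*(range(u + 1) for u in upper)):
        table = dict(zip(cells, counts))
        matches_margins = (
            all(sum(table[(a, c, e)] for e in "EF") == n for (a, c), n in var1_var2.items())
            and all(sum(table[(a, c, e)] for c in "CD") == n for (a, e), n in var1_var3.items())
            and all(sum(table[(a, c, e)] for a in "AB") == n for (c, e), n in var2_var3.items())
        )
        if matches_margins:
            solutions.append(table)
    return solutions


solutions = consistent_tables()
print(len(solutions))
for cell, count in solutions[0].items():
    print(cell, count)
```

Because only one joint table fits, publishing the three marginal tables is equivalent to publishing the full three-way table, including its zero and unit cells. Exhaustive search is only practical for tiny tables like this one.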

The Disclosure Risk Problem: Other data types

• Network data
• Qualitative data
• Mixed data – Jigsaw identification
• Statistical models
• Maps

Summary

• Statistical disclosure is a complex topic – Still an active research field

• As researchers using sensitive/personal data you will need to: – Be aware of the issues and considerations of statistical disclosure – Be able to make principled judgements about the disclosiveness of your output
