
Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse


Page 1: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Bart Custers PhD MSc LLMAssociate professor/head of research

eLaw – Center for Law and Digital TechnologiesLeiden University, The Netherlands

Cyber Summit 2016 – Banff, CanadaOctober 27th 2016, 2:15 pm – 2:45 pm

Page 2: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

• Introduction: big data and data reuse
• The Eudeco project
• Generating new data vs. data reuse
• Legal and ethical issues
  ▪ Privacy, security
  ▪ Discrimination, stigmatization, polarization
  ▪ Consent, autonomy, self-determination
  ▪ Transparency, integrity, trust
• Suggestions for solutions
• Conclusions


more data => more opportunities

This calls for data sharing and reuse

Page 3: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

• The Eudeco project (3 years)
  ▪ Five partners
  ▪ Four countries
• Modeling the European Data Economy
• Focus on big data and data reuse
• Legal, societal, economic and technological perspectives


Big Data
• Volume (big)
• Velocity (fast)
• Variety (unstructured)

Page 4: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

• People
  ▪ Social media
  ▪ User-generated content
• Devices (Internet of Things)
  ▪ Sensors (cameras, microphones)
  ▪ Trackers (RFID tags, web surfing behavior)
  ▪ Other (mobile phones, wearables, self-surveillance/quantified self)

Page 5: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Data sharing
• Active role of data subjects (hence: consent)

Data reuse (with/without consent)
• Data recycling: data reuse for the same purpose
• Data repurposing: data reuse for new purposes
• Data recontextualisation: data reuse in a new context
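The three reuse categories above can be expressed as a simple decision rule. This is an illustrative sketch only; the record fields ("purpose", "context") and example values are assumptions, not part of any actual Eudeco data model.

```python
# Illustrative sketch of the reuse taxonomy above. Field names and
# example values are assumptions made for this sketch.

def classify_reuse(original, request):
    """Classify a reuse request relative to the original collection."""
    if request["purpose"] != original["purpose"]:
        return "repurposing"          # data reuse for a new purpose
    if request["context"] != original["context"]:
        return "recontextualisation"  # same purpose, new context
    return "recycling"                # same purpose, same context

orig = {"purpose": "billing", "context": "telecom"}
print(classify_reuse(orig, {"purpose": "billing", "context": "telecom"}))    # recycling
print(classify_reuse(orig, {"purpose": "marketing", "context": "telecom"}))  # repurposing
print(classify_reuse(orig, {"purpose": "billing", "context": "insurance"}))  # recontextualisation
```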


Data reuse may…
• be more efficient
• be more effective (e.g., larger volumes, more completeness)
• include historical data
• not always match purposes and context
• be difficult for several reasons
  ▪ Technological (e.g. interoperability, data portability)
  ▪ Legal (e.g. privacy laws)
  ▪ Economic (e.g. competition)

Related rights:
• Right to data portability
• Right to be forgotten

Page 6: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Facebook likes can predict: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender.

(Kosinski et al. 2013)
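The kind of inference behind this finding can be illustrated with a toy nearest-centroid classifier over binary like-vectors. This is a hypothetical sketch of the general idea only: the groups and data are invented, and Kosinski et al. (2013) actually used dimensionality reduction combined with regression models on real like data.

```python
# Toy illustration: predicting a group label from like-vectors.
# Labels, data, and the nearest-centroid rule are all invented
# for this sketch; they are not the paper's method.

def centroid(vectors):
    """Component-wise mean of equal-length binary like-vectors."""
    return [sum(v[i] for v in vectors) / len(vectors)
            for i in range(len(vectors[0]))]

def predict(likes, centroids):
    """Return the group whose centroid is closest in squared distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda g: sqdist(likes, centroids[g]))

# Each row: one user's likes (1 = liked) over four pages.
train = {
    "group_a": [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]],
    "group_b": [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]],
}
centroids = {g: centroid(rows) for g, rows in train.items()}
print(predict([1, 1, 0, 0], centroids))  # prints "group_a"
```

The point of the example is that nothing in the like-vector itself mentions the sensitive trait; the trait is inferred purely from correlations in the training data.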

Legal perspective
• Violations of privacy depend on your definition of privacy

Ethical perspective
• Violations of privacy depend on your expectations
  ▪ Subjective: personal expectations
  ▪ Objective: reasonable expectations

• Unwanted disclosure of information
• Security (hacking, leaking)
• Predictions
• Unwanted use of information
• Transparency regarding decision-making
• Function creep

informational privacy:

Which data are used? For which purposes?

Page 7: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse


Page 8: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse


Data may be discriminating:
• When police surveillance focuses on black neighborhoods, the people in the database will be black (selective sampling)

Patterns may be discriminating:
• A database may show that top managers are male (self-fulfilling prophecy)
• People causing car accidents are >16 years old (non-novel pattern)

Discrimination may be concealed/indirect:
• Selection on zip code instead of ethnic background (redlining)
• Selection on legitimate attributes correlated with discriminating attributes (masking)

Discrimination

Stigmatisation

Polarisation
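Redlining and masking can be made concrete with a few records: a selection rule that only looks at zip code still produces different approval rates per ethnic group whenever zip code and ethnicity are correlated. All data below are synthetic, invented for this sketch.

```python
# Minimal sketch of redlining/masking as described above: the selection
# rule uses only zip code, never ethnicity, yet outcomes differ by group
# because the two attributes are correlated. Synthetic data.

records = [
    # (zip_code, group, approved_by_zip_rule)
    ("1001", "group_x", False),
    ("1001", "group_x", False),
    ("1001", "group_y", False),
    ("2002", "group_y", True),
    ("2002", "group_y", True),
    ("2002", "group_x", True),
]

def approval_rate(group):
    """Share of a group approved by the zip-code-only rule."""
    rows = [r for r in records if r[1] == group]
    return sum(r[2] for r in rows) / len(rows)

print(approval_rate("group_x"))  # ~0.33 (1/3)
print(approval_rate("group_y"))  # ~0.67 (2/3)
```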

Page 9: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Privacy policies / Terms & Conditions
• People do not read policies
  ▪ Reading everything would take 244 hours annually
  ▪ Users are willing to spend 1-5 minutes on this
  ▪ Facebook: 9,500 words (>1 hour); LinkedIn: 7,500 words (~1 hour)
• People do not understand policies
  ▪ Policies are often highly legalistic, technical, or both
  ▪ The devil is in the details
• People do not grasp the consequences
• The preferred option is not available
  ▪ Take-it-or-leave-it decisions: check the box
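The word counts above can be turned into a quick back-of-the-envelope check. The 150 words-per-minute rate is an assumption for dense legal text (ordinary prose is usually read at 200-250 wpm); it is roughly the rate the slide's own estimates imply.

```python
# Back-of-the-envelope check of the reading-time claims above.
# 150 wpm is an assumed rate for dense legal text.

def reading_minutes(words, wpm=150):
    return words / wpm

for site, words in [("Facebook", 9500), ("LinkedIn", 7500)]:
    print(f"{site}: about {reading_minutes(words):.0f} minutes")
# Facebook comes out above an hour and LinkedIn at about 50 minutes,
# consistent with the ">1 hour" and "~1 hour" estimates above.
```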


Informational self-determination (Westin, 1967):
People control who gets their data and for which purposes.

Page 10: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse


Past → Current → Future?

• Big data is used for a lot of decision-making
  ▪ Based on what data?
  ▪ Based on which analyses?

Do you know how many databases you are in?

Page 11: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Limiting access to sensitive data
• The basic idea: if sensitive data are absent from the database/cloud, the resulting decisions/selections cannot be discriminating
• However, restricting access is very difficult. According to information theory, the dissemination of data follows the laws of entropy:
  ▪ Information can easily be copied and multiplied
  ▪ Information can easily be distributed
  ▪ This process is irreversible


Page 12: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Analyze the problem:
• Privacy Impact Assessments

Customize the solution:
• Privacy by Design
• Privacy-enhancing tools
• Privacy-preserving big data analytics
• Discrimination-aware data mining


Since there is no single problem, there is no single solution.
Combinations of smart solutions are required.
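One concrete instance of privacy-preserving big data analytics is differentially private counting via the Laplace mechanism, sketched below. The epsilon value, query, and data are illustrative assumptions; the talk does not prescribe this particular technique.

```python
# Sketch of a differentially private count (Laplace mechanism).
# Epsilon, the query, and the data are illustrative assumptions.
import math
import random

def dp_count(values, predicate, epsilon=1.0):
    """Count matching values, with Laplace(1/epsilon) noise added.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon yields epsilon-differential privacy.
    """
    true_count = sum(1 for v in values if predicate(v))
    # Draw Laplace noise via the inverse CDF of a uniform sample.
    u = random.random() - 0.5
    scale = 1.0 / epsilon
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

ages = [23, 35, 41, 29, 52, 38, 61, 27]
print(dp_count(ages, lambda a: a > 30, epsilon=0.5))  # true count is 5, plus noise
```

Smaller epsilon means more noise and stronger privacy; the analyst gets a useful aggregate without learning any individual's exact record.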

Page 13: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

New perspectives
• Focus less on:
  ▪ Limiting access to data
  ▪ Restricting use of data
• Focus more on:
  ▪ Transparency
  ▪ Responsibility


Restricting data access and use limits big data opportunities and is difficult to enforce.

Page 14: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

• We need data sharing and data reuse
• There are risks, however, regarding:
  ▪ Privacy, discrimination, consent, transparency
• These risks can be addressed via responsible innovation:
  ▪ Privacy Impact Assessments
  ▪ Privacy by Design (privacy-enhancing tools, privacy-preserving big data analytics, discrimination-aware data mining)
• New approaches:
  ▪ Focus less on limiting access to data and restricting its use
  ▪ Focus more on transparency and responsibility


Page 15: Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse



Thank you for your attention!

Or contact me later: [email protected]