Data Driven Societies: Defining Big Data & Redefining Privacy
Professors Gaze & Gieseking

Bowdoin: Data Driven Societies 2014 - Defining Data & Redefining Privacy 2/10/14

Page 1

Data Driven Societies: Defining Big Data & Redefining Privacy
Professors Gaze & Gieseking

Page 2

Data & Information (Recap)

✦ Information society

✦ Data vs. information

✦ Information-as-freedom vs. information-as-control

Page 3

Big Data & Privacy

✦ Ethical research

✦ Data sample and data access

✦ Defining big data, defining privacy

daily.captaindash.com

Page 4

Social Scientific Approach
0. Identify an issue
1. Research question
2. Theoretical approach
3. Literature review
4. Methods
5. Analysis
6. Discussion
7. Conclusion

Page 7

Social Scientific Approach
0. Identify an issue
1. Research question
2. Theoretical approach
3. Literature review
4. Methods
5. Analysis
6. Discussion
7. Conclusion

Ethics, anyone?

Page 8

The Future of Now
The Chronicle of Higher Ed
The White House

Page 9

Visualize This

How we handle the emergence of Big Data is critical. …it is still necessary to ask critical questions about what all this data means, who gets access to what data, how data analysis is deployed, and to what ends.

—danah boyd & Kate Crawford, “Critical Questions for Big Data” (2012)

Page 10

Research Ethics
✦ Institutional Review Board (IRB)
✦ Informed consent
✦ Risk
✦ Accountability

Page 11

Sample

Page 12

Sampling & Access

http://blog.globalwebindex.net/

Page 13

Data Access: Twitter
✦ API: an application programming interface is the set of tools developers can use to access structured data

✦ “Firehose” of access: all public tweets, via resellers such as Gnip and DataSift
✦ “Gardenhose” of access: about 10% of public tweets
✦ “Spritzer” of access: about 1% of public tweets
✦ White-listed accounts: allowed access to certain subject matter
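To make the access tiers concrete, here is a minimal Python sketch that thins a made-up stream of tweets at the rates listed above. It is an illustration only: the tier table, the `sample_stream` function, and the stream itself are hypothetical, and Twitter's real sampling is done on its side and is not documented here.

```python
import random

# Approximate share of public tweets per access tier, as listed on the slide.
# The firehose is treated as the full stream (rate 1.0).
ACCESS_TIERS = {"firehose": 1.0, "gardenhose": 0.10, "spritzer": 0.01}

def sample_stream(tweets, tier, seed=0):
    """Keep each tweet with the probability given by the access tier."""
    rate = ACCESS_TIERS[tier]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [t for t in tweets if rng.random() < rate]

# Hypothetical stream of 100,000 public tweets, identified by number only.
stream = list(range(100_000))
for tier in ACCESS_TIERS:
    print(tier, len(sample_stream(stream, tier)))
```

A researcher on the spritzer sees roughly one tweet in a hundred and has no way to check from the sample alone whether those tweets are representative of the rest.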

Page 14

Data Rich and the Data Poor
Manovich (2011) writes of three classes of people in the realm of Big Data: “those who create data (both consciously and by leaving digital footprints), those who have the means to collect it, and those who have expertise to analyze it.”

-boyd & Crawford (2012)

✦ Data rich and data poor - research insiders and outsiders, respectively, who have varied degrees of access to data and the means to analyze it

Page 15

Defining Big Data
1. Large data sets that require supercomputers for analysis, i.e., usually over 2 GB (Manovich 2011)

2. A cultural, technological, and scholarly phenomenon that depends on the interplay of the following:
✦ Technology: maximized computation power and algorithmic accuracy
✦ Analysis: examining large data sets to identify patterns to make claims
✦ Mythology: widespread belief that the larger the data set, the more accurate the findings
(boyd & Crawford 2012)
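The "mythology" point can be checked with a small simulation. The Python sketch below is an illustration under invented assumptions (a population of 200,000, a 30% true share, and opinion holders who are three times as likely to show up in the data); none of these numbers come from the readings. It compares a large self-selected data set with a small random sample.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 200,000 people; about 30% hold some opinion.
population = [rng.random() < 0.30 for _ in range(200_000)]
true_share = sum(population) / len(population)

# Large, self-selected data set: opinion holders are assumed to appear
# three times as often (30% vs. 10% chance of being in the data).
big_biased = [p for p in population if rng.random() < (0.30 if p else 0.10)]

# Small simple random sample of 500 people.
small_random = rng.sample(population, 500)

def share(sample):
    """Fraction of the sample that holds the opinion."""
    return sum(sample) / len(sample)

print(f"true share:          {true_share:.3f}")
print(f"big biased sample:   {share(big_biased):.3f} (n={len(big_biased)})")
print(f"small random sample: {share(small_random):.3f} (n={len(small_random)})")
```

Under these assumptions the much larger data set lands further from the true share than the small random sample, because who ends up in the data matters more than how many records there are.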

Page 16

Defining Privacy

To be continued…

Page 17

ScraperWiki Support
A clever and elegant solution to our problem of accessing Twitter data with a limited number of calls:

1. Open ScraperWiki and view your table

2. Download EVERY MONDAY

3. Restart EVERY MONDAY (you will need to do this the first Monday of break too)
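Weekly downloads eventually have to be combined into one data set. The Python sketch below shows one way to do that, assuming each download is a CSV file with a unique tweet identifier column. The column name `id_str`, the function `merge_weekly_exports`, and the sample rows are hypothetical; check the header of your own ScraperWiki export before reusing the idea.

```python
import csv
import io

def merge_weekly_exports(exports, key="id_str"):
    """Combine weekly CSV exports, keeping the first copy of each tweet."""
    seen, merged = set(), []
    for handle in exports:
        for row in csv.DictReader(handle):
            if row[key] not in seen:  # skip tweets already collected
                seen.add(row[key])
                merged.append(row)
    return merged

# Two made-up weekly downloads that overlap on one tweet.
week1 = io.StringIO("id_str,text\n1,first\n2,second\n")
week2 = io.StringIO("id_str,text\n2,second\n3,third\n")

tweets = merge_weekly_exports([week1, week2])
print(len(tweets), "unique tweets")
```

With real downloads, pass opened files instead of the in-memory strings, for example `open("week1.csv", newline="")`.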

Page 18

Next Class: Feb. 12
✦ Today: big data, privacy, research ethics, data rich vs. data poor

✦ Quiz: terms / concepts coming via email

✦ Readings: Pariser, Stray

✦ Next class/lab:
  ✦ filter bubbles
  ✦ correlation/causation
  ✦ work with Twitter datasets
  ✦ continue learning R