Unclassified // Public Release
153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com
Joshua White
whitej@ainfosec.com
Senior Computer Engineer, Assured Information Security
http://ainfosec.com
PhD Student of Engineering Science, Clarkson University
Date: Oct 31, 2012
Release: Unclassified // Public
Social Media Analysis and Privacy
Copyright 2012 Assured Information Security, Inc.
About: Company
AIS (Assured Information Security)
Research and Development of technologies and capabilities to support effective operations within the entirety of the cyber domain.
Leading pioneers in the disciplines of Information Operations including Network Operations, Electronic Warfare, and Computer Network Operations of all types.
Located In:
Rome NY (Corporate Headquarters)
Portland OR
Baltimore MD
Beavercreek OH
San Antonio TX
Colorado Springs CO
About: Speaker
Joshua White
Education:
AAS Computer Network Technology (FLCC)
BS / MS Telecommunications (SUNYIT)
PhD Student of Engineering Science (Clarkson University)
Experience:
7+ years Government Contracting in Information Security and Telecommunications Engineering
Areas of Study:
Intrusion Detection Systems
Optical Network Security
Large Dataset Analysis
Distributed Processing
Overview: The Big Questions
Introduction
The Big Research Questions:
What are social media networks?
What is the privacy problem relating to them?
Who would want this data and why?
What rights of privacy must I protect?
What regulations regarding privacy exist?
What happens if I don't protect the privacy?
Conclusions
References
What are social media networks?
Definition
Social Media Networks
DHS identified multiple categories [1]
Search
Video
Maps
Photos
Blog aggregates
Micro-blogs
Traditional social networks
What's the privacy problem as it relates to these social media networks?
Problem
Two-Part Problem
End Users
Unsure or unaware of ways to properly protect their privacy
Data Collectors
Don't know how to properly maintain the privacy of their datasets
Problem
Social Media-Networking Sites
Provide a communications method thought by many to be at least somewhat private
Many never change the default security settings associated with their accounts
Example: Percentage of Facebook users, by age, who change their account security settings to anything other than the default (no security) setting [2]
18-19 years old = 71%
30-39 years old = 67%
50-64 years old = 55%
80% of all users fall within the 18-64 age range
Estimated 20+ million users have no security but must still have a basic expectation of privacy
Provides the largest “Social Network” datasets available for study
Who would want this data and why?
Problem Focus: Data Collectors
The problem of user expectations and knowledge of privacy settings is for another discussion
Let's focus on the “larger” problem
Data Collection
What can we collect?
What can we do with the data?
How must we protect the privacy of an individual’s PII contained within the datasets?
Social Media Networks Awareness
Benefits:
Government
Track locations of persons of interest with reasonable accuracy
“Bad guys” may have protected posts
Sometimes accessible simply by looking at their friends' posts, or through other sites to which they have granted account access
Track trends
Who said what, who repeated it?
Is it going to cause a riot or worse yet, a war?
News before “official” reports
Natural disasters, shootings, etc.
Social Media Networks Awareness
Benefits:
Businesses
Directed advertising
Track locations of consumers with reasonable accuracy
Track buying habits and interests
Track trends
Who said what, who repeated it, and is something going to affect a brand?
News before “official” reports
Did something happen that will rapidly affect the market?
Natural disasters, news reports, etc.
Social Media Networks Awareness
Benefits:
Academia
Research
Track locations of subjects with reasonable accuracy
Track habits, interests and moods over time
Track trends
Who said what, who repeated it (graph theory)?
Study social networks with the largest datasets ever created
Collaborate with millions
Build prediction models
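The "who said what, who repeated it" question can be treated as a small graph problem. The following is a minimal sketch, assuming posts have already been reduced to an author and the user being repeated; the `author` and `repeat_of` field names and the sample users are hypothetical and do not come from any particular site's API.

```python
from collections import defaultdict

def build_repeat_graph(posts):
    """Map each original author to the set of users who repeated them."""
    graph = defaultdict(set)
    for post in posts:
        # Only posts that repeat someone else add an edge to the graph.
        if post.get("repeat_of"):
            graph[post["repeat_of"]].add(post["author"])
    return graph

# Invented sample data: bob and carol repeat alice, dave repeats bob.
posts = [
    {"author": "alice", "repeat_of": None},
    {"author": "bob", "repeat_of": "alice"},
    {"author": "carol", "repeat_of": "alice"},
    {"author": "dave", "repeat_of": "bob"},
]
graph = build_repeat_graph(posts)
# The user repeated by the most distinct accounts.
most_repeated = max(graph, key=lambda user: len(graph[user]))
```

Even this simple structure shows who amplifies whom, which is part of why such datasets are privacy-sensitive.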
Social Media Networks Awareness
It Concerns Groups Differently
Persons of interest
Don't want to be incriminated in things that they may not have done
Consumers
Don't want others to know things about their buying habits that can be used against them
Subjects
Don't want information released that might cause them to be judged by their peers
Some Concerns Everyone Shares
Discrimination
A feeling of (privacy) violation
Case Study: Twitter
A real-time social media network of microblogs
Various APIs
Search, Live, Historical
Highly accessible
Example: NodeXL offers an MS Excel plugin for quickly grabbing a few thousand samples a day from multiple sites
Large user base
65+ million “tweets” per day
750+ “tweets” per second
International Community
At least 27 languages represented
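As a quick consistency check on the two volume figures above, the daily count can be converted to a per-second rate. This is simple arithmetic on the slide's numbers, not a new measurement.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_day = 65_000_000  # "65+ million" figure from the slide
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

print(round(tweets_per_second))  # → 752
```

The result agrees with the "750+ per second" figure.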
Case Study: Twitter
Twitter is used by:
People
Everyday Individuals
Politicians
Celebrities
Professionals
Bad Guys
Objects
Gadgets that tweet (Sensors, bots, computers, spammers)
Labeled Nefarious Groups
Lulzsec
Anonymous
others
Case Study: Twitter
What's accessible:
Posts contain far more than what's shown in the http://www.twitter.com web interface
Data is accessible as XML or in its native JSON form
Data includes:
Location (Geo fields)
User names / real names
Threading
Track conversations using replies
Track re-tweets
Twitter client software data
Time stamping
Tweet text
And so much more
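To make the field list above concrete, here is a minimal sketch that pulls those fields out of a single JSON record. The sample record is hand-written for illustration; its field names follow the layout commonly seen in Twitter's REST JSON at the time, but they should be treated as assumptions and checked against the current API documentation.

```python
import json

# Hand-written sample record; the values are invented for illustration.
SAMPLE_TWEET = json.loads("""
{
  "created_at": "Wed Oct 31 14:05:00 +0000 2012",
  "text": "Example tweet text",
  "source": "web",
  "in_reply_to_status_id": 123,
  "coordinates": {"type": "Point", "coordinates": [-75.45, 43.21]},
  "user": {"screen_name": "example_user", "name": "Example Person"}
}
""")

def summarize_fields(tweet):
    """Return the privacy-relevant fields from the list above, if present."""
    user = tweet.get("user") or {}
    point = tweet.get("coordinates") or {}
    return {
        "screen_name": user.get("screen_name"),
        "real_name": user.get("name"),
        "location": point.get("coordinates"),  # [longitude, latitude]
        "reply_to": tweet.get("in_reply_to_status_id"),  # threading
        "is_retweet": "retweeted_status" in tweet,
        "client": tweet.get("source"),
        "timestamp": tweet.get("created_at"),
        "text": tweet.get("text"),
    }

summary = summarize_fields(SAMPLE_TWEET)
```

Most of these fields never appear in the web interface, yet every one of them travels with the post.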
Case Study: Twitter
What can be done with all of this:
NYC Company DataMinr
Reported the death of Bin Laden:
25 minutes after he was killed
13 minutes before the president's address
They saw the first message regarding this only 19 minutes after it happened
They were able to trace even earlier messages that, with the right algorithms, would have shown something was going on before the initial military strike
Reports of a US helicopter flying overhead
Case Study: Twitter
Consequences
Data on sites like Twitter can be used to:
Predict Social Security numbers with reasonable accuracy [4]
Deduce the gender of an individual from nothing but the message text [5]
Track a person's physical location and create predictable pattern maps
Deny services based on views and opinions expressed
Use posts, even those that were deleted, as evidence in court [6]
So much more
What rights of privacy must I protect? &
What laws regarding privacy regulation exist?
Privacy Protection / Regulation
First we need a strict definition for what is and isn't PII (Personally Identifiable Information)
PII is any information that can be used to identify a specific individual
This includes data that can be combined with other sources to identify an individual
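The "combined with other sources" part of the PII definition is easy to underestimate. The sketch below illustrates it under stated assumptions: a released dataset with no names is joined against a separate public list on two shared attributes. The column names (`zip`, `birth_year`, `topic`) and all records are invented for illustration.

```python
def link(released, public, keys):
    """Attach names from `public` to rows of `released` that match uniquely on `keys`."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for row in released:
        names = index.get(tuple(row[k] for k in keys), [])
        # A single candidate means the "anonymous" row is re-identified.
        if len(names) == 1:
            matches.append((names[0], row))
    return matches

# The released data has no names, only attributes that look harmless.
released = [{"zip": "13441", "birth_year": 1980, "topic": "health"}]
public = [
    {"name": "Example Person", "zip": "13441", "birth_year": 1980},
    {"name": "Other Person", "zip": "97201", "birth_year": 1975},
]
reidentified = link(released, public, ["zip", "birth_year"])
```

Neither attribute identifies anyone on its own; the combination does, which is why such fields count as PII.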
Privacy Protection / Regulation
You decided to use this data, what's next?
Protecting the PII of individuals within the dataset is key, and to some extent dependent on who you are
We're back to:
Government
Businesses
Academics
Let's concentrate on US law during the rest of this talk
Privacy Protection / Regulation
The US Government
Must protect the privacy of its citizens
Federal: Cannot collect data on citizens without a warrant
States: Cannot collect data on citizens without just cause
Cannot deny citizens the right to use social media networks
Cannot enforce privacy on the individual
Can enforce regulations on the social media companies and those who use the data
Privacy Protection / Regulation
Businesses in the US
Must protect the privacy of consumers
Must abide by regulations imposed by the government of the jurisdiction in which the site is located
While not required by law, it's good practice to let consumers know what is being done with their data
Privacy Protection / Regulation
Academics in the US
Must protect the privacy of subjects
This applies even in instances where data is gathered without consent, such as from social media network sites
Consent is not required for the collection of information from these sites
Depending on the specific site's EULA, datasets may:
Not be shared with other researchers outside of the organization
Not be duplicated within a publication
Summation through statistics and results is OK
What happens if I don't protect the privacy of the individuals within my datasets?
Consequences
There are obvious legal ramifications for not protecting the privacy of individuals within a dataset
Legal (Federal / State)
Legal personal injury
Not so obvious
Loss of consumer trust / support
Loss of position through ethics violation clauses
Consequences: Example
Ethics can be tied closely to privacy
Harvard researchers accessed complete Facebook profiles of 1700 students [7]
Data consisted of public profiles collected within the university
Researchers outside the university had to apply for access to the data
Data manual contained statistics about the dataset that did not require the application to be filled out
These statistics were used to identify individuals
Consequently, the researchers lost funding, and the university found that opinion of the school had fallen
Researchers were put before the ethics board
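The slide does not say exactly how the published statistics were used to identify individuals. As one generic illustration of the risk, the sketch below applies a k-anonymity style check: it flags any combination of attributes shared by fewer than k rows, since those rows are the easiest to single out from summary statistics. The column names and rows are hypothetical and are not taken from the Harvard dataset.

```python
from collections import Counter

def small_groups(rows, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}

# Invented records; only the attribute combinations matter here.
rows = [
    {"major": "Physics", "home_state": "NY", "class_year": 2009},
    {"major": "Physics", "home_state": "NY", "class_year": 2009},
    {"major": "History", "home_state": "NY", "class_year": 2009},
    {"major": "Physics", "home_state": "OR", "class_year": 2009},
]
risky = small_groups(rows, ["major", "home_state"], k=2)
```

Running a check like this before publishing a data manual shows which cells of a summary table describe only one or two people.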
Conclusion
Social media network datasets contain PII
PII is not just profile data, it's also unseen fields such as geo-location and data that can be derived from the messages posted
Datasets cannot be shared outside an organization without prior permission if required by the EULA
If the EULA allows for sharing of the data, it still must be properly anonymized
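As a starting point for the anonymization step, the sketch below replaces the user name with a keyed hash, drops the real name and location, and coarsens the timestamp to a date. This is pseudonymization under stated assumptions, not a complete anonymization method: the record layout, the key handling, and the ISO 8601 timestamp format are all hypothetical, and the message text itself can still reveal information about its author.

```python
import hashlib
import hmac

# Hypothetical key; in practice, generate it randomly and store it
# separately from the dataset.
SECRET_KEY = b"replace-with-a-random-secret"

def pseudonymize(record, key=SECRET_KEY):
    """Return a copy of `record` with direct identifiers replaced or dropped."""
    digest = hmac.new(key, record["screen_name"].encode("utf-8"), hashlib.sha256)
    return {
        "user_token": digest.hexdigest()[:16],  # stable token, not the name
        "day": record["timestamp"][:10],        # date only; assumes ISO 8601
        "text": record["text"],
        # real_name and location are intentionally not copied over
    }

record = {
    "screen_name": "example_user",
    "real_name": "Example Person",
    "location": [-75.45, 43.21],
    "timestamp": "2012-10-31T14:05:00Z",
    "text": "Example tweet text",
}
cleaned = pseudonymize(record)
```

Using a keyed hash rather than a plain hash means someone without the key cannot rebuild the tokens by hashing a list of known user names.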
References
[1] DHS, Office of Operations Coordination and Planning, “Publicly Available Social Media Monitoring and Situational Awareness Initiative,” June 22, 2010.
[2] Vaidhyanathan, S., “Welcome to the surveillance society,” IEEE Spectrum, vol. 48, no. 6, pp. 48-51, June 2011. doi: 10.1109/MSPEC.2011.5779791
[3] Brodkin, Jon, “Bin Laden death-detecting analytics services signs partnership with Twitter,” Ars Technica, Apr. 9, 2012.
[4] Acquisti, Alessandro, and Gross, Ralph, “Predicting Social Security Numbers from public data,” Proceedings of the National Academy of Sciences, vol. 106, no. 27, July 7, 2009.
[5] Burger, John, et al., “Discriminating Gender on Twitter,” MITRE Corp., Nov. 2011.
[6] Smith, “No warrant needed, no privacy: Judge rules even deleted tweets can be used in court,” Network World, Apr. 24, 2012.
[7] Parry, Marc, “Harvard Researchers Accused of Breaching Students' Privacy,” The Chronicle of Higher Education, July 10, 2011.
Questions