Geo-Privacy in Data-Rich Social Media Environments
Marc P. Armstrong
Professor and Collegiate Fellow, Geographical and Sustainability Sciences
Interim Chair, Department of Communication Studies
Interim Chair, Department of Cinema and Comparative Literature
The University of Iowa (aka #1 party school in USA)
Iowa City, [email protected]
Privacy Preamble I
All humans need, and expect, some measure of privacy
Privacy is fluid and may change substantially depending on age, culture, technological innovations and other factors (e.g., telephone wire taps were once unregulated)
Privacy Preamble II
“Right to privacy” in US Constitution was limited to search, but has been expanded by Supreme Court and is covered in other laws (e.g., FERPA, HIPAA)
Privacy is a foundational tenet of the Universal Declaration of Human Rights (Article 12)
Most people are unaware of “locational power” of smart phones and social media (USA Today 080813: 98% of cell users receive push alerts by location)
Focus here is on geo-privacy
Geo-privacy
Being secure from unwanted locational observation or tracking
Amassed location data allows for the creation of a detailed profile of individual behavior, including habits, preferences, and routines—private information that could be exploited and cause harm
Where you go implies what you do and who you are
Effects of Exposure
Social media users expose themselves to geo-privacy risks
Disclosure of information to third parties
Consumer tracking & surveillance
Identity theft (and burglary)
Threats to physical safety
Is this a real concern? Yes.
Every day it’s something new…
http://spectrum.ieee.org/riskfactor/computing/it/it-hiccups-of-the-week-programming-error-exposes-up-to-187-000-indiana-family-aid-recipients/?utm_source=computerwise&utm_medium=email&utm_campaign=071
Facebook’s Graph Search
http://spectrum.ieee.org/telecom/internet/the-making-of-facebooks-graph-search/?utm_source=techalert&utm_medium=email&utm_campaign=080813
Public vs. “Private” Sharing
People using social media control who can observe what information, kind of
Some is fully public, other items only accessible to “friends” or friend-like people
But privacy policies change without full user comprehension (EULA!!!) and expose data to public inspection (Facebook!)
And providers and their consorts (e.g., app developers) constantly scrape data in an attempt to monetize
DropBox Weasel Word Privacy
Geo-Location Information. Some Devices allow applications to access real-time location-based information (for example, GPS). Our mobile apps do not collect such information from your mobile device at any time while you download or use our mobile apps as of the date this policy went into effect, but may do so in the future with your consent to improve our Services. Some photos and videos you place in Dropbox may contain recorded location information. We may use this information to optimize your experience. If you do not wish to share files embedded with your geo-location information with us, please do not upload them. If you don’t want to store location data in your photos or videos, please consult the documentation for your camera to turn off that feature. Also, some of the information we collect from a Device, for example IP address, can sometimes be used to approximate a Device’s location.
https://www.dropbox.com/privacy
Location Exposure: Cost-Benefit Tradeoff
Privacy involves trade-offs
Having your location known can be beneficial
Location-based services (discounts can be “pushed”)
Location-based social interactions enabled (Foursquare, etc.)
Opt-in to derive benefit
http://www.gpsworld.com/facebook-to-roll-out-location-app/
Location Services Required by Law
Cell phones are ubiquitous, but economic divide exists
Carriers must implement E911 requirements of the Wireless Communications and Public Safety Act of 1999
Provide address to emergency responders when mobile phone users dial 911
Android Location Services
Periodically checks location using, for example, GPS, cell-towers, and Wi-Fi locations
Android phone sends publicly broadcast Wi-Fi access points' service set identifier (SSID) and Media Access Control (MAC) data
My phone and my laptop both “know” where I am (and so do Google and Verizon and…)
What kinds of locational information are available from social media?
GPS coordinates
Place names
Postal addresses (partial and complete)
Cell phone towers
Wi-Fi hot spot locations
Some of these are more accessible than others, are better than others for establishing location, and have different error profiles, but all are useful, especially when combined
How do you use this data to compromise privacy?
Raw coordinates and raw addresses typically insufficient
Must be linked with other information to intrude into “private” matters
Geocoding and inverse geocoding
Geography is the key link
Source: Tobler’s Web Site
http://www.geog.ucsb.edu/~tobler/presentations/shows/Explore_files/v3_document.htm
The Power of Linking
In 2000, Latanya Sweeney showed that 87 percent of all Americans could be uniquely identified using only three items of information: ZIP™ code, birthdate, and sex
ZIP code is but one geographical identifier that can be used to establish linkages or sieve out information to establish individual identities
Must get the geographical identifier, however
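The linking step can be sketched as a join on the three quasi-identifiers. This is a minimal illustration, not Sweeney's procedure: both record lists, the field names, and the helper names `quasi_id` and `link` are hypothetical, and the data are made up.

```python
from collections import defaultdict

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
health_records = [
    {"zip": "52240", "birthdate": "1970-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "52240", "birthdate": "1982-11-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "52245", "birthdate": "1970-03-14", "sex": "F", "diagnosis": "migraine"},
]

# Hypothetical public list that carries names alongside the same three fields.
public_list = [
    {"name": "Person A", "zip": "52240", "birthdate": "1970-03-14", "sex": "F"},
    {"name": "Person B", "zip": "52240", "birthdate": "1982-11-02", "sex": "M"},
    {"name": "Person C", "zip": "52245", "birthdate": "1990-07-21", "sex": "F"},
]


def quasi_id(record):
    """Return the (ZIP code, birthdate, sex) triple used as the linking key."""
    return (record["zip"], record["birthdate"], record["sex"])


def link(anonymous, named):
    """Attach a name to each anonymous record whose key matches exactly one person."""
    names_by_key = defaultdict(list)
    for person in named:
        names_by_key[quasi_id(person)].append(person["name"])

    matches = []
    for record in anonymous:
        names = names_by_key.get(quasi_id(record), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], record["diagnosis"]))
    return matches


reidentified = link(health_records, public_list)
```

Only records whose key is unique in the named list are linked, which is why a fine-grained geographical identifier such as a ZIP code matters so much.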
Geospatial Transforms Link
Geocode: input address –> output address with linked coordinate
Inverse geocode: input coordinate –> output address with linked coordinate
Then link address to multitude of info and use coordinates to create maps
Inverse Geocoding
Given a list of coordinates, or a map that contains only “anonymous dots”, individual-level information can be recovered
Assumption is unmasked data…
Also assume geographic base file (TIGER available for entire US from Census Bureau) that has digital street network and address ranges for street segments
Geocode 100 Addresses from White Pages
Inverse Geocoding of 100 Addresses— 94% Successful
Maps from Text
Wang and Stewart combine a geospatial ontology with a gazetteer to create maps from text (web news stories)
Ontology in this case refers to natural hazards (e.g., tornados)
Gazetteer matches named place to coordinates
Same process applies to social media
Wang, W., and Stewart, K. In press. Creating spatiotemporal semantic maps from web text documents. In M-P. Kwan, D. Richardson, D. Wang, and C. Zhou, Space-Time Integration in Geography and GIScience: Research Frontiers in the US and China. Dordrecht: Springer.
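The gazetteer idea can be sketched in a few lines. This is a toy illustration, not the Wang and Stewart system: the gazetteer entries (with approximate coordinates), the hazard term list, and the function name `map_points_from_text` are all assumptions made for the example.

```python
import re

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "iowa city": (41.66, -91.53),
    "cedar rapids": (41.98, -91.67),
    "des moines": (41.59, -93.62),
}

# Hypothetical stand-in for a natural-hazards ontology: just a list of terms.
HAZARD_TERMS = ["tornado", "flood", "hail"]


def map_points_from_text(text):
    """Return one mappable point per gazetteer place named in the text."""
    lowered = text.lower()
    hazards = sorted(
        term for term in HAZARD_TERMS
        if re.search(r"\b" + term + r"(?:e?s)?\b", lowered)
    )
    points = []
    for name, coord in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            points.append({"place": name, "coord": coord, "hazards": hazards})
    return points


story = "A tornado touched down near Iowa City before storms moved toward Cedar Rapids."
points = map_points_from_text(story)
```

Plain string matching like this is exactly where the ambiguities discussed on the next slides come from: a name with no state, or a colloquial place name, either fails to match or matches the wrong place.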
Sources of Error and Uncertainty
Many different kinds
When place names (or establishment names) define location, significant ambiguities occur (e.g., cities without states) and the potential for location and semantic error is large
Natural language processing remains a difficult problem, as we have seen
Name uncertainty
Colloquial expressions of place
“I went to Johns and got a case.” (sic, sans possessive)
John has several connotations, and a case may refer to an infectious disease
Or it may refer to a person named John who provided a case of something unspecified
Or it may refer to…
John’s Grocery in Iowa City
Ten percent discount on all cases of wine!
Source: http://www.panoramio.com/photo/64624209
What conceptual model might we use to examine geo-privacy?
Personal activity is a space-time process
Birth (Start) – Movement – Death (End)
Time Geography? Activity Space?
Constructing and inferring activity spaces
Collection of locations (and the paths between them) that an individual has direct contact with on a daily, weekly, or other cycle
Can be used to construct behavioral profiles
Journey to work
Day care location
Place of worship
Social clubs
Et cetera
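Inferring a routine from repeated observations can be sketched as follows. This is a minimal example under stated assumptions: the check-in list is made up, places are already labeled, and the function name `activity_profile` and its `min_visits` threshold are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical time-stamped check-ins: (timestamp, labeled place).
checkins = [
    ("2013-08-05 07:40", "day care"),
    ("2013-08-05 08:10", "work"),
    ("2013-08-05 17:45", "groceries"),
    ("2013-08-06 07:42", "day care"),
    ("2013-08-06 08:05", "work"),
    ("2013-08-07 07:38", "day care"),
    ("2013-08-07 08:12", "work"),
    ("2013-08-11 10:00", "place of worship"),
]


def activity_profile(records, min_visits=2):
    """Keep places seen on at least min_visits distinct days, with their usual hour."""
    days = defaultdict(set)
    hours = defaultdict(Counter)
    for stamp, place in records:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        days[place].add(when.date())
        hours[place][when.hour] += 1

    profile = {}
    for place, visited in days.items():
        if len(visited) >= min_visits:
            usual_hour = hours[place].most_common(1)[0][0]
            profile[place] = {"days_seen": len(visited), "usual_hour": usual_hour}
    return profile


profile = activity_profile(checkins)
```

Even this crude count separates routine stops (day care, work) from one-off visits, and a longer observation window would pick up weekly cycles as well.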
Persistence of Paths
People tend to develop well-worn paths for routine travel (journey to work) based on accumulated experience and a common goal of minimizing time or cost
Alterations occur based on different time of day, multi-purpose trips and either temporary (construction) or permanent (new road) route alterations
Activity Spaces and Paths
Figure: space-time paths (axes: Space, Time) linking Home, Day Care, Work, and Groceries
Space-Time Aquarium “Surveillance in a Box”
After: Torsten Hägerstrand
Regular meeting at IHOP could reveal a problem with maple syrup dependency
Defense? Go off the grid.
Big data & personal privacy: antithetical
Scott McNealy: “Get over it” quote
http://www.technologyreview.com/news/514351/has-big-data-made-anonymity-impossible/
Is Geo-Privacy Moot?
Most people (particularly those “born digital”) will have location known continuously as we become increasingly cyborg-like
Augmented reality (Google Glass is the tip of the iceberg) will require location to be effective
Intelligent transport (vehicle-to-vehicle and vehicle-to-infrastructure) requires continuous, high-accuracy locations
The only people who opt out will be criminals?
The End
Offset and Squeeze
Offset: parameter that moves geocoded location off the centerline to a “plausible” (approx) location on the correct side of the street (e.g., 10 - 15 meters)
Squeeze: % compression factor that moves locations inward on block face to ensure they are on correct street
Centerline: “where people do not live”
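The offset and squeeze adjustments can be sketched with address-range interpolation along one street segment. This is an illustration under assumptions, not a specific geocoder's method: coordinates are taken to be projected (meters), the squeeze is split evenly between the two ends of the block face, and the function name `geocode` and its parameters are hypothetical.

```python
import math


def geocode(house_number, seg_start, seg_end, addr_low, addr_high,
            side, offset=12.0, squeeze=0.10):
    """Interpolate an address along a segment, then apply squeeze and offset.

    side is "L" or "R" relative to the direction seg_start -> seg_end.
    offset is in meters; squeeze is the fraction of the segment removed in total.
    """
    # Proportion of the way through the address range (0 at low, 1 at high).
    frac = (house_number - addr_low) / (addr_high - addr_low)
    # Squeeze: compress the proportion inward, away from both intersections.
    frac = squeeze / 2 + frac * (1 - squeeze)

    (x0, y0), (x1, y1) = seg_start, seg_end
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    cx, cy = x0 + frac * dx, y0 + frac * dy  # point on the centerline

    # Offset: move perpendicular to the centerline, toward the requested side.
    nx, ny = -dy / length, dx / length  # unit normal pointing to the left
    sign = 1.0 if side == "L" else -1.0
    return (cx + sign * offset * nx, cy + sign * offset * ny)


example = geocode(150, (0.0, 0.0), (100.0, 0.0), 100, 198, side="L")
```

With the defaults, the lowest address on a 100 m block lands 5 m in from the corner and 12 m off the centerline rather than in the middle of the intersection.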
Locational Cloaking Possible
Newer work uses a “donut” mask, which enforces a minimum displacement so that a masked point cannot remain at or very near its original location
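A donut mask can be sketched as a random displacement constrained to a ring around the true location. This is a minimal version assuming projected coordinates; the function name `donut_mask` and the radii used in the example are hypothetical.

```python
import math
import random


def donut_mask(x, y, r_min, r_max, rng=random):
    """Displace (x, y) by a random distance between r_min and r_max."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling the squared radius uniformly spreads points evenly over the ring's area.
    radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return (x + radius * math.cos(angle), y + radius * math.sin(angle))


# Example: displace a point by at least 50 m and at most 200 m.
masked = donut_mask(1000.0, 2000.0, r_min=50.0, r_max=200.0, rng=random.Random(42))
```

The inner radius is what distinguishes this from a plain circular mask, where a small random draw could leave the point essentially where it started.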
Contextually Adaptive Mask
Replace a coordinate with a zero-, one-, or two-dimensional object (e.g., a point could be assigned to a transport link)
Mask size is a function of local population density or other factors (context) important to preservation of geo-privacy
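One way to make mask size depend on context is to pick the radius whose circle would contain about k people at the local density, so that r = sqrt(k / (π · density)). This rule, the value of k, the clamping limits, and the function name `adaptive_radius` are all assumptions made for illustration, not the method on the slide.

```python
import math


def adaptive_radius(density_per_km2, k=20, r_min_km=0.05, r_max_km=5.0):
    """Radius (km) of a circle expected to contain about k people.

    The result is clamped to [r_min_km, r_max_km]; zero density gets the maximum.
    """
    if density_per_km2 <= 0:
        return r_max_km
    radius = math.sqrt(k / (math.pi * density_per_km2))
    return min(max(radius, r_min_km), r_max_km)


urban = adaptive_radius(2000.0)  # dense area: small mask is enough
rural = adaptive_radius(5.0)     # sparse area: mask must be much larger
```

The same displacement that hides a household in a city block may leave it as the only candidate on a rural road, which is the reason for tying the mask to density.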
Attribute Masking
Knowing an attribute value can reveal location in some cases
This information can then be linked to access other types of personal-level information
Attributes may require masking as a consequence
Figure: contour map of z (contours at 10, 20, 30) with a residence location
A measured value of z = 30 gives a guess about location if you know the model and parameters
Value   <10   10-20   20-30   >30
n       18    3       2       1
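The class counts above show why the highest values are the risky ones: the top class holds a single residence. One common remedy, sketched here as an assumption rather than as the method intended on the slide, is top-coding: merge the highest classes until the top class holds at least k cases. The function name `top_code` and the threshold k = 3 are hypothetical.

```python
def top_code(classes, k=3):
    """Merge the highest value classes until the top class has at least k cases.

    classes is a list of (label, count) pairs ordered from lowest to highest value.
    """
    merged = list(classes)
    while len(merged) > 1 and merged[-1][1] < k:
        (low_label, low_n), (high_label, high_n) = merged[-2], merged[-1]
        merged[-2:] = [(low_label + " or " + high_label, low_n + high_n)]
    return merged


# Class counts taken from the table on the slide.
table = [("<10", 18), ("10-20", 3), ("20-30", 2), (">30", 1)]
masked_table = top_code(table, k=3)
```

After merging, a reported value in the top class is consistent with three residences instead of one, at the cost of some attribute detail.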
http://www.informationweek.com/social-business/social_networking_consumer/how-to-declare-independence-from-bad-soc/240157775?cid=NL_IWK_Daily_240157775&elq=f46e26961f754fd6b0a6dae1c8289575
"This is where we are at," she wrote in her initial post. "Where you have no expectation of privacy. Where trying to learn how to cook some lentils could possibly land you on a watch list. Where you have to watch every little thing you do because someone else is watching every little thing you do." . http://www.informationweek.com/security/privacy/pressure-cooker-flap-traces-to-employer/240159335?cid=NL_IWK_Daily_240159335&elq=60d25d67a58d4560b1fb8e07828f7bdd
Laws in the Works - Pushback
GPS Act
Government must show probable cause to acquire location information
Applies to real-time tracking of person's current and past movements
Prohibits commercial service providers from sharing location data
Location Privacy Protection Act
Similar in intent to GPS Act
Al Franken (D-MN)
Passed in Senate Judiciary Committee, December 2012
User consent required before location data acquired
Privacy vs. Safety
Privacy goes out the window when there is an overriding concern about public safety
Megan’s Laws “out” sex offenders and locate their residences
Public good outweighs individual privacy considerations?
Search centered on my house (r=2 mi) yields n=5
http://www.iowasexoffender.com/
Inverse Geocoding Steps (TIGER)
Determine projection & coordinates of source data; transform if necessary
Find “closest” street segment and snap to it (not always easy due to ambiguity, e.g. a point close to an intersection)
A matter of proportion: Calculate proportionate distance of point along segment and use that proportion to determine address range proportion for the street segment from TIGER address range (with appropriate parity [L-R] check)
Find “closest” address and return it (errors occur)
Use address to link with other personal information (telephone, et cetera)
In places with cadastral (parcel) information systems, can do point-in-polygon to get address from a coordinate
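The snap, proportion, and parity steps above can be sketched as follows. This is a simplified illustration, not a production inverse geocoder: the two street segments and their address ranges are made up in the spirit of a TIGER-like base file, coordinates are assumed to be already projected (meters), and the names `SEGMENTS`, `project`, and `inverse_geocode` are hypothetical.

```python
import math

# Hypothetical street segments with left/right address ranges (low, high).
SEGMENTS = [
    {"name": "Main St", "start": (0.0, 0.0), "end": (100.0, 0.0),
     "left": (101, 199), "right": (100, 198)},
    {"name": "Oak Ave", "start": (100.0, 0.0), "end": (100.0, 100.0),
     "left": (1, 99), "right": (2, 98)},
]


def project(point, seg):
    """Return (distance to segment, proportion along it, side of the street)."""
    (x0, y0), (x1, y1) = seg["start"], seg["end"]
    dx, dy = x1 - x0, y1 - y0
    t = ((point[0] - x0) * dx + (point[1] - y0) * dy) / (dx * dx + dy * dy)
    t = min(1.0, max(0.0, t))  # clamp to the segment's endpoints
    px, py = x0 + t * dx, y0 + t * dy
    dist = math.hypot(point[0] - px, point[1] - py)
    # The sign of the cross product tells which side of the centerline the point is on.
    cross = dx * (point[1] - y0) - dy * (point[0] - x0)
    return dist, t, ("left" if cross > 0 else "right")


def inverse_geocode(point):
    """Snap to the closest segment and interpolate a house number with correct parity."""
    best = min(SEGMENTS, key=lambda seg: project(point, seg)[0])
    _, t, side = project(point, best)
    low, high = best[side]
    steps = (high - low) // 2  # same-parity addresses are 2 apart
    number = low + 2 * round(t * steps)
    return f"{number} {best['name']}"


address = inverse_geocode((25.0, -10.0))
```

A point near the shared intersection at (100, 0) is nearly equidistant from both segments, which illustrates the snapping ambiguity noted in the steps.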
Error in Inverse Geocoding
Offset and squeeze during geocoding
Other error sources
GPS error (e.g., side of street switch)
Geographic base file error (e.g., address range)