Geo-Privacy in Data-Rich Social Media Environments Marc P. Armstrong Professor and Collegiate Fellow, Geographical and Sustainability Sciences Interim Chair, Department of Communication Studies Interim Chair, Department of Cinema and Comparative Literature The University of Iowa (aka #1 party school in USA) Iowa City, IA [email protected]

Page 2

Privacy Preamble I

All humans need, and expect, some measure of privacy

Privacy is fluid and may change substantially depending on age, culture, technological innovations and other factors (e.g., telephone wire taps were once unregulated)

Page 3

Privacy Preamble II

“Right to privacy” in the US Constitution was limited to search, but has been expanded by the Supreme Court and is covered in other laws (e.g., FERPA, HIPAA)

Privacy is a foundational tenet of the Universal Declaration of Human Rights (Article 12)

Most people are unaware of “locational power” of smart phones and social media (USA Today 080813: 98% of cell users receive push alerts by location)

Focus here is on geo-privacy

Page 4

Geo-privacy

Being secure from unwanted locational observation or tracking

Amassed location data allows for the creation of a detailed profile of individual behavior, including habits, preferences, and routines—private information that could be exploited and cause harm

Where you go implies what you do and who you are

Page 5

Effects of Exposure

Social media users expose themselves to geo-privacy risks

Disclosure of information to third parties

Consumer tracking & surveillance

Identity theft (and burglary)

Threats to physical safety

Page 6

Is this a real concern? Yes.

Page 7

Every day it’s something new…

http://spectrum.ieee.org/riskfactor/computing/it/it-hiccups-of-the-week-programming-error-exposes-up-to-187-000-indiana-family-aid-recipients/?utm_source=computerwise&utm_medium=email&utm_campaign=071

Page 8

Facebook’s Graph Search

http://spectrum.ieee.org/telecom/internet/the-making-of-facebooks-graph-search/?utm_source=techalert&utm_medium=email&utm_campaign=080813

Page 9

Public vs. “Private” Sharing

People using social media control who can observe what information, kind of

Some is fully public, other items only accessible to “friends” or friend-like people

But privacy policies change without full user comprehension (EULA!!!) and expose data to public inspection (Facebook!)

And providers and their consorts (e.g., app developers) constantly scrape data in an attempt to monetize

Page 10

Dropbox Weasel Word Privacy

Geo-Location Information. Some Devices allow applications to access real-time location-based information (for example, GPS). Our mobile apps do not collect such information from your mobile device at any time while you download or use our mobile apps as of the date this policy went into effect, but may do so in the future with your consent to improve our Services. Some photos and videos you place in Dropbox may contain recorded location information. We may use this information to optimize your experience. If you do not wish to share files embedded with your geo-location information with us, please do not upload them. If you don’t want to store location data in your photos or videos, please consult the documentation for your camera to turn off that feature. Also, some of the information we collect from a Device, for example IP address, can sometimes be used to approximate a Device’s location.

https://www.dropbox.com/privacy

Page 11

Location Exposure: Cost-Benefit Tradeoff

Privacy involves trade-offs

Having your location known can be beneficial

Location-based services (discounts can be “pushed”)

Location-based social interactions enabled (Foursquare, etc.)

Opt-in to derive benefit

http://www.gpsworld.com/facebook-to-roll-out-location-app/

Page 12

Location Services Required by Law

Cell phones are ubiquitous, but economic divide exists

Carriers must implement E911 requirements of the Wireless Communications and Public Safety Act of 1999

Provide address to emergency responders when mobile phone users dial 911

Page 13

Android Location Services

Periodically checks location using, for example, GPS, cell towers, and Wi-Fi locations

Android phone sends publicly broadcast Wi-Fi access points' service set identifier (SSID) and Media Access Control (MAC) data

My phone and my laptop both “know” where I am (and so do Google and Verizon and…)

Page 14

What kinds of locational information are available from social media?

GPS coordinates

Place names

Postal addresses (partial and complete)

Cell phone towers

Wi-Fi hot spot locations

Some of these are more accessible than others, are better than others for establishing location, and have different error profiles, but all are useful, especially when combined

Page 15

How do you use this data to compromise privacy?

Raw coordinates and raw addresses typically insufficient

Must be linked with other information to intrude into “private” matters

Geocoding and inverse Geography is key link

Source: Tobler’s Web Site, http://www.geog.ucsb.edu/~tobler/presentations/shows/Explore_files/v3_document.htm

Page 16

The Power of Linking

In 2000, Latanya Sweeney showed that 87 percent of all Americans could be uniquely identified using only three items of information: ZIP™ code, birthdate, and sex

ZIP code is but one geographical identifier that can be used to establish linkages or sieve out information to establish individual identities

Must get the geographical identifier, however
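
The linking idea can be illustrated with a small sketch. The records below are invented for illustration and are not from Sweeney's study; the hypothetical function `unique_fraction` counts how many records are the only ones with their combination of ZIP code, birthdate, and sex.

```python
from collections import Counter

def unique_fraction(records, keys=("zip", "birthdate", "sex")):
    """Fraction of records whose combination of quasi-identifiers is unique."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Hypothetical records, invented for this example.
people = [
    {"zip": "52240", "birthdate": "1980-03-14", "sex": "F"},
    {"zip": "52240", "birthdate": "1980-03-14", "sex": "F"},
    {"zip": "52240", "birthdate": "1975-11-02", "sex": "M"},
    {"zip": "52245", "birthdate": "1980-03-14", "sex": "F"},
]

print(unique_fraction(people))  # 0.5: two of the four records are unique
```

A record that is unique on these three fields can be joined to any other file that carries the same fields, which is why the geographical identifier matters.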

Page 17

Geospatial Transforms Link

Geocode: input address -> output address with linked coordinate

Inverse geocode: input coordinate -> output address with linked coordinate

Then link address to multitude of info and use coordinates to create maps
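
A minimal sketch of the geocode direction, assuming the common address-range interpolation approach along a straight street segment. The segment dictionary and its field names are hypothetical, not the schema of any particular base file.

```python
def geocode(house_number, segment):
    """Interpolate a coordinate for a house number along a straight segment."""
    lo, hi = segment["from_addr"], segment["to_addr"]
    if not (min(lo, hi) <= house_number <= max(lo, hi)):
        raise ValueError("house number outside the segment's address range")
    # Proportion of the way along the address range (0 at start, 1 at end).
    t = 0.0 if hi == lo else (house_number - lo) / (hi - lo)
    (x0, y0), (x1, y1) = segment["start"], segment["end"]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))

# Hypothetical segment: even side of a block, coordinates in meters.
segment = {"from_addr": 100, "to_addr": 200,
           "start": (0.0, 0.0), "end": (100.0, 0.0)}

print(geocode(150, segment))  # (50.0, 0.0)
```

The returned coordinate sits on the street centerline; real geocoders usually adjust it afterwards.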

Page 18

Inverse Geocoding

Given a list of coordinates, or a map that contains only “anonymous dots”, individual-level information can be recovered

Assumption is unmasked data…

Also assume geographic base file (TIGER available for entire US from Census Bureau) that has digital street network and address ranges for street segments

Page 19

Geocode 100 Addresses from White Pages

Inverse Geocoding of 100 Addresses— 94% Successful

Page 20

Maps from Text

Wang and Stewart combine a geospatial ontology with a gazetteer to create maps from text (web news stories)

Ontology in this case refers to natural hazards (e.g., tornados)

Gazetteer matches named place to coordinates

Same process applies to social media

Wang, W., and Stewart, K. In press. Creating spatiotemporal semantic maps from web text documents. In M-P. Kwan, D. Richardson, D. Wang, and C. Zhou, Space-Time Integration in Geography and GIScience: Research Frontiers in the US and China. Dordrecht: Springer.
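
A toy sketch of the general idea of pairing hazard terms with gazetteer lookups. It is not Wang and Stewart's method: the two-entry gazetteer, the approximate coordinates, the `HAZARD_TERMS` set standing in for an ontology, and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "iowa city": (41.66, -91.53),
    "des moines": (41.59, -93.62),
}

# Stand-in for a natural hazards ontology.
HAZARD_TERMS = {"tornado", "flood"}

def events_from_text(text):
    """Pair every hazard term found in the text with every named place found."""
    lowered = text.lower()
    hazards = sorted(t for t in HAZARD_TERMS if re.search(r"\b" + t, lowered))
    places = [(name, coord) for name, coord in GAZETTEER.items()
              if re.search(r"\b" + re.escape(name) + r"\b", lowered)]
    return [{"hazard": h, "place": p, "coord": c}
            for h in hazards for p, c in places]

print(events_from_text("A tornado touched down near Iowa City on Tuesday."))
```

Each returned record has a coordinate and can be placed on a map; a social media post can be processed the same way as a news story.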

Page 21

Sources of Error and Uncertainty

Many different kinds

When place names (or establishment names) define location, significant ambiguities occur (e.g., cities without states) and the potential for location and semantic error is large

Natural language processing remains a difficult problem, as we have seen

Page 22

Name uncertainty

Colloquial expressions of place

“I went to Johns and got a case.” (sic, sans possessive)

John has several connotations, and a case may refer to an infectious disease

Or it may refer to a person named John who provided a case of something unspecified

Or it may refer to…

Page 23

John’s Grocery in Iowa City

Ten percent discount on all cases of wine!

Source: http://www.panoramio.com/photo/64624209

Page 24

What conceptual model might we use to examine geo-privacy?

Personal activity is a space-time process

Birth (Start) – Movement – Death (End)

Time Geography? Activity Space?

Page 25

Constructing and inferring activity spaces

Collection of locations (and the paths between them) that an individual has direct contact with on a daily, weekly, or other cycle

Can be used to construct behavioral profiles

Journey to work

Day care location

Place of worship

Social clubs

Et cetera
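
A rough sketch of how parts of an activity space might be inferred from timestamped location observations. The time windows (night for home, weekday office hours for work), the place labels, and the function name are assumptions chosen for illustration, not an established method.

```python
from collections import Counter
from datetime import datetime

def infer_anchors(observations):
    """Guess 'home' and 'work' from (ISO timestamp, place label) pairs."""
    night, workday = Counter(), Counter()
    for stamp, place in observations:
        t = datetime.fromisoformat(stamp)
        if t.hour >= 22 or t.hour < 6:
            night[place] += 1          # overnight observations suggest home
        elif t.weekday() < 5 and 9 <= t.hour < 17:
            workday[place] += 1        # weekday daytime suggests work
    home = night.most_common(1)[0][0] if night else None
    work = workday.most_common(1)[0][0] if workday else None
    return {"home": home, "work": work}

# Hypothetical observations with anonymous place labels.
obs = [
    ("2013-08-05T23:10", "A"),
    ("2013-08-06T02:00", "A"),
    ("2013-08-06T10:30", "B"),
    ("2013-08-06T14:00", "B"),
    ("2013-08-06T17:45", "C"),
    ("2013-08-07T11:00", "B"),
]

print(infer_anchors(obs))  # {'home': 'A', 'work': 'B'}
```

Other windows could be added in the same way, for example a regular weekly visit at a fixed time.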

Page 26

Persistence of Paths

People tend to develop well-worn paths for routine travel (journey to work) based on accumulated experience and a common goal of minimizing time or cost

Alterations occur based on different time of day, multi-purpose trips and either temporary (construction) or permanent (new road) route alterations

Page 27

Activity Spaces and Paths

[Figure: activity space diagram with labeled locations Home, Day Care, Work, and Groceries]

Page 28

Space-Time Aquarium “Surveillance in a Box”

[Figure: space-time aquarium with axes Space and Time and a labeled location, IHOP. After: Torsten Hägerstrand]

Regular meeting at IHOP could reveal a problem with maple syrup dependency

Page 29

Defense? Go off the grid.

Big data & personal privacy: antithetical

Scott McNealy: “Get over it” quote

http://www.technologyreview.com/news/514351/has-big-data-made-anonymity-impossible/

Page 30

Is Geo-Privacy Moot?

Most people (particularly those “born digital”) will have location known continuously as we become increasingly cyborg-like

Augmented reality (Google Glass is tip of iceberg) will require location to be effective

Intelligent transport (vehicle-to-vehicle and vehicle-to-infrastructure) requires continuous, high-accuracy locations

The only people who opt-out will be criminals?

Page 31

The End

Page 32

Page 33

Offset and Squeeze

Offset: parameter moves geocoded location off the centerline to a “plausible” (approx) location on the correct side of the street (e.g., 10 - 15 meters)

Squeeze: % compression factor that moves locations inward on block face to ensure they are on correct street

Centerline: “where people do not live”
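
The offset and squeeze adjustments can be sketched as follows. The exact formulas vary between geocoders; here squeeze is assumed to shrink the proportion along the block toward its midpoint, and offset is assumed to move the point perpendicular to the centerline. The function name, parameter names, and defaults are hypothetical.

```python
import math

def offset_and_squeeze(start, end, t, side, offset=10.0, squeeze=0.1):
    """Move a centerline geocode to a plausible location beside the street.

    t is the proportion along the segment (0..1); side is "L" or "R"
    relative to the direction start -> end. Segment must have nonzero length.
    """
    # Squeeze: compress the proportion toward the middle of the block face.
    t = 0.5 + (t - 0.5) * (1.0 - squeeze)
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    # Unit normal pointing to the left of the direction of travel.
    nx, ny = -dy / length, dx / length
    sign = 1.0 if side == "L" else -1.0
    return (x0 + t * dx + sign * offset * nx,
            y0 + t * dy + sign * offset * ny)

# A point at the very start of a 100 m block, on the left side.
print(offset_and_squeeze((0.0, 0.0), (100.0, 0.0), 0.0, "L"))
```

With the defaults, the block-end point moves about 5 m inward along the block and 10 m off the centerline.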

Page 34

Locational Cloaking Possible

Newer work uses a “donut” mask that enforces a minimum displacement, so a masked point cannot fall at or near the original point
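
A minimal sketch of a donut-style mask, assuming planar coordinates in meters and a displacement drawn uniformly over the area of the ring between an inner and outer radius. The function name and parameters are hypothetical.

```python
import math
import random

def donut_mask(x, y, r_min, r_max, rng=random):
    """Displace (x, y) by a random distance in [r_min, r_max], random direction."""
    if not 0 <= r_min <= r_max:
        raise ValueError("require 0 <= r_min <= r_max")
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # Sample the squared radius uniformly so points are uniform over the ring.
    r = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return (x + r * math.cos(angle), y + r * math.sin(angle))

# Masked point lies between 50 m and 200 m from the original location.
print(donut_mask(0.0, 0.0, 50.0, 200.0, random.Random(42)))
```

The inner radius is what distinguishes the donut from a plain circular mask, where the displaced point can land almost on top of the original.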

Page 35

Contextually Adaptive Mask

Replace a coordinate with a zero-, one-, or two-dimensional object (e.g., a point could be assigned to a transport link)

Mask size is a function of local population density or other factors (context) important to preservation of geo-privacy
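
One way mask size could depend on population density, sketched under the assumption that the mask should cover roughly k people: a circle of radius r holds about density × πr² people, so r = sqrt(k / (π × density)). The function name, the value of k, and the clamping limits are hypothetical.

```python
import math

def mask_radius(density_per_km2, k=50, r_min_km=0.05, r_max_km=10.0):
    """Radius (km) of a circle expected to contain about k people."""
    if density_per_km2 <= 0:
        return r_max_km  # no population nearby: use the largest mask
    r = math.sqrt(k / (math.pi * density_per_km2))
    return min(max(r, r_min_km), r_max_km)

print(mask_radius(5000))  # dense urban area: small mask
print(mask_radius(5))     # sparse rural area: much larger mask
```

A fixed-size mask that protects an urban resident may leave a rural resident as the only household inside it, which is the motivation for adapting to context.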

Page 36

Attribute Masking

Knowing an attribute value can reveal location in some cases

This information can then be linked to access other types of personal-level information

Attributes may require masking as a consequence

Page 37

[Figure: map with value labels 10, 20, and 30; legend: Residence]

A measured value of z = 30 gives a guess about location if you know the model and parameters

Value   <10   10-20   20-30   >30
n       18    3       2       1
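
Using the counts from this slide, a short sketch shows which value classes narrow down location the most. The threshold k and the function name are assumptions for illustration.

```python
# Number of residences in each value class, from the slide.
counts = {"<10": 18, "10-20": 3, "20-30": 2, ">30": 1}

def risky_classes(counts, k=3):
    """Classes with fewer than k residences, where a value narrows location most."""
    return [label for label, n in counts.items() if n < k]

print(risky_classes(counts))        # ['20-30', '>30']
print(risky_classes(counts, k=2))   # ['>30']
```

The class above 30 contains a single residence, so publishing that attribute value points to one location.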

Page 38

http://www.informationweek.com/social-business/social_networking_consumer/how-to-declare-independence-from-bad-soc/240157775?cid=NL_IWK_Daily_240157775&elq=f46e26961f754fd6b0a6dae1c8289575

Page 39

"This is where we are at," she wrote in her initial post. "Where you have no expectation of privacy. Where trying to learn how to cook some lentils could possibly land you on a watch list. Where you have to watch every little thing you do because someone else is watching every little thing you do." . http://www.informationweek.com/security/privacy/pressure-cooker-flap-traces-to-employer/240159335?cid=NL_IWK_Daily_240159335&elq=60d25d67a58d4560b1fb8e07828f7bdd

Page 40

Laws in the Works - Pushback

GPS Act

Government must show probable cause to acquire location information

Applies to real-time tracking of person's current and past movements

Prohibits commercial service providers from sharing location data

Location Privacy Protection Act

Similar in intent to GPS Act

Al Franken (D-MN)

Passed in Senate Judiciary Committee, December 2012

User consent required before location data acquired

Page 41

Privacy vs. Safety

Privacy goes out the window when there is an overriding concern about public safety

Megan’s Laws “out” sex offenders and locate their residences

Public good outweighs individual privacy considerations?

Search centered on my house (r=2 mi) yields n=5

http://www.iowasexoffender.com/
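
A radius search of this kind can be sketched with the haversine great-circle distance. The coordinates below are invented, and the function names are hypothetical; this is not the registry site's implementation.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))

def within_radius(center, points, radius_mi=2.0):
    """Points no farther than radius_mi from center; all are (lat, lon) pairs."""
    return [p for p in points
            if haversine_miles(center[0], center[1], p[0], p[1]) <= radius_mi]

# Invented coordinates: the third point is roughly 7 miles north of the center.
center = (41.66, -91.53)
points = [(41.66, -91.53), (41.67, -91.53), (41.76, -91.53)]
print(len(within_radius(center, points)))  # 2
```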

Page 42

Inverse Geocoding Steps (TIGER)

Determine projection & coordinates of source data; transform if necessary

Find “closest” street segment and snap to it (not always easy due to ambiguity, e.g. a point close to an intersection)

A matter of proportion: Calculate proportionate distance of point along segment and use that proportion to determine address range proportion for the street segment from TIGER address range (with appropriate parity [L-R] check)

Find “closest” address and return it (errors occur)

Use address to link with other personal information (telephone, et cetera)

In places with cadastral (parcel) information systems, can do point-in-polygon to get address from a coordinate
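
The snapping and proportion steps above can be sketched as follows, assuming planar coordinates, straight segments of nonzero length, and a separate address range for each side of the street. The segment dictionaries and field names are hypothetical and much simpler than real TIGER records.

```python
import math

def project(point, start, end):
    """Return (proportion along segment, distance to segment, side 'L' or 'R')."""
    (px, py), (x0, y0), (x1, y1) = point, start, end
    dx, dy = x1 - x0, y1 - y0
    t = ((px - x0) * dx + (py - y0) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp so the snapped point stays on the segment
    cx, cy = x0 + t * dx, y0 + t * dy
    # Cross product sign: positive means the point is left of start -> end.
    cross = dx * (py - y0) - dy * (px - x0)
    return t, math.hypot(px - cx, py - cy), "L" if cross > 0 else "R"

def inverse_geocode(point, segments):
    """Estimate a street address for a coordinate."""
    # Snap to the closest street segment.
    best = min(segments, key=lambda s: project(point, s["start"], s["end"])[1])
    t, _, side = project(point, best["start"], best["end"])
    # Use the address range for the side of the street the point falls on.
    lo, hi = best["left"] if side == "L" else best["right"]
    estimate = lo + t * (hi - lo)
    # Round to a house number with the same parity as the range.
    number = lo + 2 * round((estimate - lo) / 2)
    number = max(min(lo, hi), min(max(lo, hi), number))
    return f"{number} {best['name']}"

# Hypothetical street segments with odd numbers on the left, even on the right.
segments = [
    {"name": "Main St", "start": (0.0, 0.0), "end": (100.0, 0.0),
     "left": (101, 199), "right": (100, 198)},
    {"name": "Oak Ave", "start": (0.0, 50.0), "end": (100.0, 50.0),
     "left": (201, 299), "right": (200, 298)},
]

print(inverse_geocode((25.0, 8.0), segments))   # 125 Main St
print(inverse_geocode((25.0, -8.0), segments))  # 124 Main St
```

A point near an intersection can snap to the wrong segment, and a point recorded on the wrong side of the street returns an address with the wrong parity.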

Page 43

Error in Inverse Geocoding

Offset and squeeze during geocoding

Other error sources

GPS error (e.g., side of street switch)

Geographic base file error (e.g., address range)