7/28/2019 SSO Project Paper
1/28
Centralizing the Decentralized: The Value Implications of Single Sign-on Services
Abstract
The internet's decentralized nature is the greatest
strength behind its continued growth and value; however,
a developing trend is the
use of points of authority that house your identity and
interface with other web services to authenticate your
identity. This industry falls under the title of Single
Sign-On (SSO) services that allow you to log in on
many different sites. We take a look at major SSO
integrators and see how they utilize SSO to provide
value to users and see how they benefit from having
that system in place. We also take a look at the data use
policies of SSO providers to understand how the
industry in general treats users and their data. After that
we follow up with a study on the usage of SSOs
through the lens of actual users, and by combining all
of this data we develop a set of best practices to help
users be more informed about how their data is used and
how they can serve their own personal values and
interests.
Keywords
VSD, Single Sign-on Services, SSO, Values, SSO
Integrators, SSO Providers, data, services, privacy,
best practices
Copyright is held by the author/owner(s).
INFO 444, Autumn 2012
School of Information
University of Washington
Kartik Rishi
School of Information
Informatics - HCI
Scott Kuehnert
School of Information
Informatics - HCI
Teresa Lam
School of Information
Informatics - HCI
Augustus Yuan
School of Information
Informatics - HCI
Introduction
The internet as we know it is growing at an incredible
pace, and with it, new services are popping up
everywhere with a new solution to any and all of our
old problems. Need to shop for clothes online? There is
a website for that. Want to listen to a variety of music?
There is a web application for that too! Are you
interested in having a discussion with your friends and
family? You bet there is a way to do it online! With the
expanding role of the internet in our day-to-day lives,
we develop manifestations of ourselves throughout the
internet via user accounts tagged to emails that you
may not even remember the passwords for! Wouldn't it
be nice if all you had to remember was one account,
one email, one password?
The premise behind a Single Sign-On (SSO) service is
that a user only has to establish their account in one
place and is able to utilize it on many other sites! The
user no longer has to provide different credentials for
different sites, to ultimately establish a connection to
their identity on that site and to access the service that
it provides. In this day and age, where users provide so
much information about themselves, a single site can
develop a significant idea of who the user is and, in
the process, become an SSO Provider whose new
service is establishing the user's unique
identity anywhere. On the other side of the
relationship are SSO Integrators: sites that offer a
service the user wants and that
communicate with SSO Providers to conveniently
authenticate who the user is and let
them continue on with what they intended to do.
What does this present to the user in terms of benefits?
The user is now able to consolidate their various user
accounts into one convenient account that allows
access to various services. On top of that, because their
information is shared, their preferences and trends
carry over, making the services that SSO Integrators
provide very personal to the user. To personalize their
services further, Integrators can also utilize
geographic and friend data to provide content that is
dynamic and far more relevant to your immediate
location and your friends. SSO also presents the
opportunity for users to have various SSO Integrators
work with each other to improve the level of service
provided, simply because the user has a global
identity shared among all of them.
Our research began with a story about a man named
Bogomil Shopov, an online IT marketing and
community management professional from Bulgaria.
This individual was able to purchase 1,500,000 entries
of first and last name, email, and private Facebook
profile IDs for $5 USD
(http://talkweb.eu/openweb/1819). That's five bucks,
straight and simple. This brings us to the negative side
of SSO services: while a user's data may
be integrated with various other sites, what data is
truly transferred, and what actually happens to it?
Our team intends to explore the various aspects behind
Single Sign-On services; an understanding of those
services can give us insight into the users that utilize
them. We will begin our study by determining some of
the direct and indirect stakeholders involved with SSO
services, to determine the key players and motivations
behind how these systems are set up. Following that, we
will take a glimpse into some well-established web
services that are SSO Integrators and how they utilize
data provided by SSO Providers to serve users. From
there we will expand to establish who the top three
SSO Providers are and then discuss each one in detail
to understand how their systems work and to synthesize
their data use policies. By establishing a profile and
understanding of the top three Providers, we intend to
compare and contrast their approaches to SSO and come
to an understanding of what user values are implicated
by how those systems were developed. After some
insight into common Integrators and Providers, we
intend to develop a common understanding of how
users approach SSO services in their day-to-day life to
establish a better idea of the relevance of the
technology and its prevalence in daily life. Once we
understand a broad user base, we
intend to analyze how common users utilize SSO
services and what that means in terms of values
implicated.
Now you may be asking, what's our true purpose
behind all of this work? We hope to analyze both our
technical study of SSO services and an empirical
measurement of the penetration of SSO technology in
our peer groups, and to develop a strong understanding of
what values are truly at stake for users in this Provider-
Integrator-User relationship. Once we understand what
those values are, we intend to develop a best practice
guideline that users can quickly read to
understand key aspects of SSO services and how they
can better protect themselves. With those guidelines,
users can exercise more control over their information by
knowing more about how it spreads, and can improve
their leverage in the Provider-Integrator-User
relationship.
Methodologies & Stakeholders
The basis of our work will be rooted in the principles of
Value Sensitive Design (VSD), a tripartite
methodology, consisting of iteratively applied
conceptual, empirical, and technical investigations; an
emphasis on considering indirect as well as direct
stakeholders (that is, people who are affected by a
technical system but don't use it directly, as well as
those who do); and an interactional theory of the
relationship between values and technology. (Borning)
To begin, the direct stakeholders include the users and
providers of SSOs. The indirect stakeholders include
SSO integrators such as deal sites like Groupon and
LivingSocial, data-aggregation services, and marketing
agencies.
The benefits for users are that they get to use one
service to sign into various different websites. This
saves them time from having to create a new account
each time they visit a new website. In addition, users
only have to memorize one username and password
rather than multiple ones which can get confusing at
times. They also benefit from personalized ads, which
can be helpful. The harms for users
include the possibility of third-party websites obtaining
information that the user did not wish to
provide. Another harm is that SSO integrators may have
permission to access all the information that users
provide in the social network, which could be more than
what users want to provide to these sites.
As for the SSO integrators, they benefit by creating
more personalized ads targeted to users, which in turn
increases the likelihood of a user buying a product on
the site. They also perform analytics and conduct
customer research. SSO providers benefit from users
continuing to use their social media websites, which
increases their traffic and, in turn, their revenue.
Users benefit from simplicity, time efficiency,
and personalization. Conflicting value tensions include
lack of privacy and consent. SSO integrators benefit
from gaining valuable information, while SSO providers
benefit from popularity.
SSO Integrators
In this section, we will be investigating how certain
websites integrate Single Sign-On from social media
sites and use it to their advantage. Single Sign-On
services such as Facebook Connect can carry a lot of
data from a user's Facebook account into the service.
Data such as interests, gender, likes, and friends in a
user's network become a lot more transparent to the
Integrator, and while they may use this information for
the user's gain, they may also use it for their own gain as
well. For this reason, this section will look more deeply
into the privacy/data use policies stated by SSO
Integrators regarding the data they collect from users
and how they provide benefits in exchange.
One example of an information technology that makes
use of SSO specifically is StackExchange, a large
network comprising 90 Q&A sites which are all linked
together. We were interested in StackExchange
because, despite having its own StackExchange account
that gives you access to all 90 sites, it also
integrates a variety of other social media sites to allow
you to connect with the different sites, including:
Yahoo
MyOpenID
LiveJournal
WordPress
Blogger
Verisign
ClaimID
ClickPass
Google-Profile
AOL
This raises a lot of privacy questions for us about how
much data StackExchange is collecting from all these
sites, and what they are using it for. In
StackExchange's privacy policy, they state that they
will tell you how they are using the data, that they
will make the notice in clear and conspicuous language
when you are asked to first provide [StackExchange]
with personal information, and that they will notify
[the user] before [StackExchange] uses the information
for something other than the purpose for which it was
originally collected. StackExchange does, however, use this
information to their own benefit in ways
listed in their privacy policy, such as allowing the
user to register to [StackExchange] websites, online
communities, and other services; communicating with
users effectively; and evaluating the quality of their services.
StackExchange also uses this information to help
employers find or contact users who post profiles on
the Careers site, and to transfer information to others "as
described in this policy to satisfy our legal, regulatory,
compliance, or auditing requirements." In exchange,
the user gets access to many of the services
StackExchange offers, including its huge network, all of
it extremely accessible through one simple sign-on.
Other examples include deal sites such as
Jackthreads, PLNDR, and Zappos, which focus primarily
on marketing clothes at very cheap prices.
They, too, have Single Sign-On services that allow
users to connect via Facebook or other social media
websites. The major benefit they gain from this is that they
get access to your social media profile and anything
you allow through their application. Here is an example
picture that is used by Groupon, a website focused on
delivering coupon deals to its users:
Figure 1: A prompt that informs the user of all the data pieces that Groupon requests from Facebook during the Facebook Connect session.
Groupon's business model revolves around providing
deals for a diverse range of local activities (including
restaurants, events, fitness, health, education, etc.) to
their users. Groupon profits largely by
making deals with different businesses to
advertise them so that those businesses can get more
customers. With so many businesses doing different
things and so many users, information about the users
is extremely helpful for Groupon.
Specifically, Groupon can use a variety of information
you make available publicly to target which coupons
they want to send to you. They use the information for
things such as maintaining the website, providing
personalized ads, evaluating you for certain offers, and
performing analytics for customer research. In their
privacy policy they state that if you want to limit the
information they obtain, you may "manage the sharing
of certain Personal Information with [Groupon] when
you connect through social networking platforms or
applications," and that adjusting the permissions for that
personal information depends on the privacy policy
of the social networking platform. In this situation,
the main advantage of Single Sign-On is personalizing
the Groupon experience, with less of an emphasis on
convenience.
One final example of how Single Sign-On is utilized is
Wolfram|Alpha, a computational knowledge engine that
uses your Facebook account to deliver very precise
analytics including habits, charts, graphs, and statistics.
Wolfram|Alpha mentions that the main purpose for which they
use the information is to help "enhance and refine
[Wolfram's] content" and that information collected
about you through your experience and queries is used
"to better understand the entire population that is
utilizing our website and how we might improve our
services to improve the collective experience." They
also make it explicitly clear that the personally identifiable
information Wolfram|Alpha is allowed to access is
affected by the privacy settings you have established at
the TPS and that "the linkage between any TPS and
Wolfram|Alpha is completely voluntary, and our ability
to access your information at the TPS requires that
linkage, you have a choice whether or not to disclose
such information." It goes to show just how much
information Wolfram|Alpha has at its disposal, and
many third party sites can potentially benefit from it.
The user also benefits from this because he/she can
gain knowledge of the different habits he/she exhibits
and can focus on fixing them if necessary. Some
snippets of Wolfram|Alpha analytics have been
included:
Figure 2: An example of a piece of analytics that Wolfram|Alpha develops from Facebook data.
For example, in the above image you can see the user's
activity during the week. We can see in the second
graph that there is a lot of time spent on Facebook
around 2-3 AM on Friday morning. We can also see a
large variety of application usage in the first graph.
Wolfram|Alpha also makes the information very
accessible for the user by providing different ways to
download the data. They also monetize it
by allowing users to obtain raw data from
Wolfram|Alpha if they purchase the Pro plan.
In the terms of use, Wolfram|Alpha explicitly states
that they will not attempt to associate individual
Wolfram|Alpha inputs with individual human users, and
will not release individual or aggregated lists of inputs,
or any personally identifiable information, to any third
party, except in response to lawful court orders. "We will
not attempt to assert intellectual property rights over
anything given as input to Wolfram|Alpha simply on the
basis of its having been given to us as input." However, by
generating content through Wolfram|Alpha, the user is
agreeing that Wolfram|Alpha can store [user's data] in
log files, and use [user data] to generate the results.
Figure 3: The prompt that shows that you can download analytic information if you subscribe to Wolfram|Alpha Pro.
Overall, we can see that different SSO integrators
use data collected through Single Sign-On in
various ways. In StackExchange, the use is mostly
for the user's convenience: with so many different
Q&A sites, having one account that allows you to
access all of them is extremely convenient and makes
all the Q&A sites easily accessible. In Groupon, the data
collected through Single Sign-On is used to create a
very personalized experience for the user and to target
specific coupons based on the user's data. Finally,
Wolfram|Alpha makes collected data very accessible to
the user, and also uses the data to better their own
website or search engine. All of the SSO integrators we
examined state somewhere in their privacy policies that
they will not openly reveal user data to third parties
unless required to by court order, and that they primarily use
the data for the convenience of the user. Next, we take
a look at how users approach the use of SSOs through a
detailed empirical investigation.
Empirical Investigation on SSO Users
We decided to do a survey for our empirical
investigation to gain insight on what users felt when it
came to their privacy online. We wanted to determine
whether users actually cared about the privacy of their
information online or not. In addition, we wanted to
see to what extent participants are willing to give
up their privacy for other values.
Procedure
For our study, we gave our participants a survey that
consisted of 23 questions. These questions asked them
for information about their demographics, Single Sign-
On services, how much time they spent on the
computer and Internet, as well as privacy and security
related questions. We put our survey up online at
various websites including Amazon Mechanical Turk
(Amazon Mechanical Turk, 2005) and Reddit. The
majority of our responses came from Amazon
Mechanical Turk, which is essentially a paid
crowdsourcing service that connects companies to a large
body of people willing to do small tasks for a small sum
of money. These tasks are typically those that are
difficult for computers to accomplish but easy for
humans due to the difference in comprehension. This
platform is a good way to reach a large variety of
individuals.
Participants
We had a total of 170 participants for our survey. Of
the 170 participants, only 142 had valid responses to
the survey questions. We analyzed and based our study
on the 142 responses. Of the 142 people that took our
survey, 48 were female and 94 were male, so
about a third of our respondents were female
and two thirds were male. We
had a wide variety of age groups take our survey.
Around 60% of the survey responders were
aged 22-30. As for location, about 82% of our responses
came from India.
Demographics

Age       n     %
18-21     17    12%
22-25     43    30.3%
26-30     41    28.9%
31-40     28    19.7%
41-50     9     6.3%
51-60     3     2.1%
61-70     1     0.7%

Gender    n     %
Male      94    66.2%
Female    48    33.8%

Country   n     %
India     116   81.7%
USA       17    12%
Pakistan  2     1.4%
Other     7     4.9%

Figure 4: A series of tables displaying the demographic information for the empirical investigation survey we conducted.
Results
One of the questions that we asked in our survey was
"Why do you use Single Sign-On services?" and we had
a lot of consistent answers from our participants. A
male around the age of 26-30 from India responded to
the question by saying "It's easy and convenient."
Another response, from a male also around the
age of 26-30, states "It provides security as one time
login and logout. Also [there's] no need to remember
all the passwords every time." We used a website
called Many Eyes (Many Eyes, 2007), which is a
graphical tool that uses techniques to create a
graphical network representation of patterns of
reference in collaborative discourse (Wikipedia, 2011).
One of the options on this site is a graphical
representation called a tag cloud, which counts the
frequency of the words within our data. Below is a tag
cloud for the responses to the question "Why do you
use Single Sign-On services?" The word "easy" is the
biggest word, which means that it is the most frequent
word in the responses.
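
The word counting behind a tag cloud can be sketched in a few lines of Python. This is only a minimal sketch and not the Many Eyes implementation: the stop-word list and the tokenizing rule are our own assumptions, and the two responses are the ones quoted above rather than the full survey data.

```python
import re
from collections import Counter

# The two responses quoted in the text; the full survey data is not reproduced here.
responses = [
    "It's easy and convenient.",
    "It provides security as one time login and logout. "
    "Also there's no need to remember all the passwords every time.",
]

# Assumed stop-word list (hypothetical, for illustration only).
STOP_WORDS = {"it", "it's", "and", "as", "also", "there's", "no", "to", "the", "all"}


def word_frequencies(texts):
    """Count how often each non-stop-word appears across all responses."""
    counts = Counter()
    for text in texts:
        # Lowercase, then treat runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


frequencies = word_frequencies(responses)
for word, count in frequencies.most_common(5):
    print(word, count)
```

A tag cloud then scales each word's font size by its count, so a word that recurs across many responses is drawn largest.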
Figure 5: This is a word cloud comprised of user responses to the question, "Why do you use Single Sign-On services?" The larger the word, the more frequently that response occurred in survey responses.
One of the relationships that we explored was the
usage of SSOs vs. privacy violated in the future. We
asked the participants to answer the question:
"Do you ever worry that your privacy might be
violated in the future?"¹ Please mark the scale
from 1-5:
1 - Not Worried At All
2 - Somewhat Worried
3 - Neutral
4 - Worried
5 - Extremely Worried
Of the 142 participants, 47 responded "1 - Not Worried
At All." Of those 47 respondents, 32 use SSO services.
This means that 68.1% of those who are not worried
about their privacy being violated in the future use SSOs. As for
those who answered "5 - Extremely Worried," 10 out
of the 19 use SSOs, which is 52.6%. There is a difference of 15.5
percentage points in SSO use between those who answered "1" and
those who answered "5". It is worth noting that among the participants who
use Single Sign-On services, more
are not worried about their privacy
being violated in the future (32) than are extremely worried
about it (10).
1: While this question may seem quite broad, the context around it is a series of questions related to internet usage and SSOs, so there exists some implicit framing to the question.
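
The percentages above follow directly from the reported counts. As a quick check, the arithmetic can be reproduced as follows; the function and variable names are ours, introduced only for illustration.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts reported in the text.
not_worried_total, not_worried_sso = 47, 32              # answered 1
extremely_worried_total, extremely_worried_sso = 19, 10  # answered 5

not_worried_pct = share(not_worried_sso, not_worried_total)
extremely_worried_pct = share(extremely_worried_sso, extremely_worried_total)

# The gap is a difference in percentage points, not a relative change.
gap = round(not_worried_pct - extremely_worried_pct, 1)

print(not_worried_pct, extremely_worried_pct, gap)
```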
Another relationship that we explored was the usage of
SSOs vs. privacy violated in the past. Of the 142
participants in our sample, 11 said "yes" to having had
their privacy violated in the past. Of those 11
participants, 5 said they used SSO, which is 45.5%.
Unfortunately, we were not able to determine whether
SSOs played a part in violating the participants' privacy
in the past, since about half of the users that had
their privacy violated in the past used SSOs and the
other half did not.
We also looked at the relationship between the users
who had their privacy violated in the past and whether
that affected whether they worried about having their
privacy violated in the future. Only 11 out of the 142
participants actually had their privacy violated in the
past. 72.7% of those 11 participants answered either "5 -
Extremely Worried" or "4 - Worried" for their
privacy being violated in the future. This suggests that
people who had their privacy violated in the past are
more concerned about their future privacy, although
11 participants is a small group. This makes
sense because normally people who had a bad
experience in the past would end up being more
worried and cautious in the future.
Limitations
We had some limitations because some of our
questions could have been too broad or ambiguous for
the user. For example, we did not restrict the question
"Do you ever worry that your privacy might be violated
in the future?" to online privacy. We believed that users
could determine that it was about online privacy because of the
wording and flow of our previous questions, but it's
possible that not everyone understood it that way.
Also, our data came largely from participants in
India, who may provide different answers than users
from the US because of cultural differences.
Future Work
In the future, we would try to have more females
take the survey to get a 50/50 male-to-female ratio.
In addition, the majority of the sample for our survey
was from India, but in the future we would like the
majority to be from the USA for consistency. We would
also ask more detailed questions to get richer data
from our participants. Some of the wording
in our survey could also be improved.
Now that we have set up a foundation of knowledge
about both SSO Integrators and SSO Users, we
have an idea of the essential front-end of this
industry. Next we take an in-depth look into the back-
end, or in other words, a look into how SSOs work from
the provider perspective.
SSO Providers
A central goal of this research is to be useful to users of
Single Sign-On (SSO) services for making decisions
about what data they share and with whom. In order to
get an overview of the abilities of SSO providers with
respect to data usage, identify problem areas for users,
and draft best practices for users to follow when
deciding whether or not to use an SSO service, we
analyzed the data use policies of each of the three
largest SSO providers. This analysis forms the core of our
technical investigation for this project.
We observed that data use policies tend to be hard to
read because of a variety of factors including their size,
the vocabulary used in them, and their overall
complexity. So, part of the motivation for this research
was to expose details of those policies in a way that's
easy to understand for users of those services.
Another reason we performed this analysis was to
inform the creation of best practices for users to follow
when deciding whether or not to use a Single Sign-On
service.
Methods
We began this portion of our research by brainstorming
some ideas. Before beginning our formal research, we
sketched out a few questions we had pertaining to the
data use policies of SSO providers. These included
questions such as "How and when do SSO providers
collect user data?", "How and when do they share user
data?", and "What sorts of control do users have over the
sharing and collection of their data?"
We then gathered the data use policies of the top three
Single Sign-On providers across the web: Facebook,
Google, and Twitter. (Gigya)
The next task was to come up with a list of categories,
each potentially of interest to users, into which we could
classify sections of the data use policies. Our
research and early brainstorms guided the creation of
high-level categories such as "allows for collection of
user data" and "allows for sharing of user data." We
used those high-level categories in a first-pass reading
of the data use policies for each major SSO provider, in
which we identified general regions of text that relate
to the high-level categories. Then, we used the insights
gained from the first readings to produce a more
detailed list of allowances that may be of interest to
users. The word "allowance" is used to refer to
practices that are allowed by a company's data use
policy.
The list includes abilities that we consider to be
concerning, reassuring, or neutral (bad, good, or
neither for the user). "Concerning" in this case means
potentially causing harm to users. "Reassuring" in this
case means potentially protecting users from harm.
Harm is defined to be any occurrence that is
detrimental to a valued quantity (such as physical
health, income, reputation, mobility, etc.). Groups like
http://knowprivacy.org and
http://www.privacychoice.org/ served as inspiration for
our policy analysis, and some of the practices on the
list (such as "Allows users to delete data" and "Notifies
users when government requests access to their data")
came from those websites. (Know Privacy,
PrivacyChoice) The final list of practices of interest can
be seen in the appendix under item Appendix A. The
list is broken into data collection, data sharing, ad
targeting, user control, and SSO. However, the
majority of the allowances on the list are related to the
first two categories, data collection and sharing,
because we are primarily concerned with the values of
privacy and security. User control and SSO refer to the
value of informed consent.
During a second read-through, we tagged specific
clauses in the data use policies that relate to
allowances on the list with numbers such as "[1]",
"[2]", and "[3]", and placed the tags within a table
next to the allowance they relate to. The final result is a
table that shows each instance of a given allowance
within each data use policy, and the clauses that relate
to that allowance. For example, the cell for the
intersection of the allowance "allows collection of IP
address" and the SSO provider "Google" may contain
"[3], [4], [7]", indicating that clauses marked [3],
[4], and [7] in Google's data use policy relate to the
given allowance.
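
The tagging table described above can be pictured as a mapping from (allowance, provider) pairs to lists of clause tags. The following is a minimal sketch under that assumption and not a description of the tool we actually used. Only the Google entry mirrors the example in the text; the other entries and all function names are hypothetical.

```python
from collections import defaultdict

# (allowance, provider) -> clause tags found in that provider's policy.
tag_table = defaultdict(list)


def tag_clause(allowance, provider, clause_number):
    """Record that a numbered clause relates to an allowance."""
    tag_table[(allowance, provider)].append(clause_number)


# Example from the text: clauses [3], [4], and [7] of Google's policy.
for clause in (3, 4, 7):
    tag_clause("Allows collection of IP address", "Google", clause)

# Hypothetical placeholder entries, for illustration only.
tag_clause("Allows collection of IP address", "Facebook", 2)
tag_clause("Allows users to delete data", "Twitter", 5)


def format_cell(allowance, provider):
    """Render one table cell in the bracketed style used in the text."""
    clauses = tag_table.get((allowance, provider), [])
    return ", ".join(f"[{n}]" for n in clauses)


print(format_cell("Allows collection of IP address", "Google"))
```

The length of a cell's list is the number of clauses that permit a practice, which is the quantity plotted later in the histogram.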
It is important to note that the number of times an
allowance occurs in a data use policy does not
necessarily reflect the degree to which a company
performs a given action. It's tempting to see the
quantity of references as an indicator of a company's
actions in that area. Instead, it's more useful to think
about the number of references as a measure of the
number of ways a company may possibly allow for a
given practice. Just because an allowance exists doesn't
mean they use it; for example, a company may
reserve the ability to share personal data with
governments who request it, but never exercise that
ability on a user's account.
In the case that a data use policy mentioned the
collection or sharing of "Basic," "Personal," or
"Sensitive" information, the meaning of the words in
the context of the particular policy was parsed as
necessary for entry into the table. For example,
"Basic" information in the Facebook data
use policy is described as: "basic info includes your
User ID, as well your friends' User IDs (or your friend
list) and your public information." (Facebook) All
clauses referencing the collection of basic info were
broken into the categories that relate to user ID, friend
information, and public information.
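
Breaking an umbrella term into table categories amounts to a lookup, sketched below. The Facebook "basic info" entry follows the definition quoted above; the category labels are our shorthand rather than the policy's wording, and the function name is hypothetical.

```python
# Provider-specific umbrella terms -> the table categories they cover.
UMBRELLA_TERMS = {
    ("Facebook", "basic info"): [
        "user ID",
        "friend information",
        "public information",
    ],
}


def expand_term(provider, term):
    """Return the categories a clause should be filed under for a term."""
    # Terms without a known definition are returned unchanged so that
    # nothing is silently dropped from the table.
    return UMBRELLA_TERMS.get((provider, term.lower()), [term])


print(expand_term("Facebook", "Basic info"))
```

A clause that mentions basic info is then entered once under each of the three categories.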
Results
The results show some interesting trends. The first is
that Google and Facebook have somewhat inverse
priorities in their data use policies. Facebook's policy is more
oriented toward the sharing of data than the collection of data,
whereas Google's data use policy references the
collection of data more than the sharing of data.
Google's data use policy has many references to the
types of data the company may collect from users and
when, but the policy only mentions the sharing of that
information with third parties in a few limited
circumstances. On the other hand, Facebook's data use
policy allows for the sharing of data in multiple places,
and only discusses data collection a handful of times in
the beginning of the data use policy. This relationship
can be seen in the following stacked histogram:
This stacked histogram shows the number of occurrences
of clauses within each privacy policy that allow for a
practice on our list of allowances (Appendix A). The
blue bars represent data from Facebook's data use
policy, the red bars represent data from Google's data
use policy, and the green bars represent data from
Twitter's data use policy. The x-axis is hidden, but
each bin is an allowance from the list, in the same
order as presented in Appendix A. The full histogram
can be viewed in Appendix C. Since the list was broken
into Data collection, Data sharing, Ad targeting, User
control, and SSO, the histogram was drawn in clusters
representing data collection, data sharing, and user
control and consent, indicated by magenta, cyan, and
yellow regions respectively.
Figure 6: A visual example of the
codifying of an existing data use
policy and how it fits in the
categories we established.
The complementary focuses of Facebook's and Google's
data use policies are evident in the distribution of values
near the beginning and the end of the histogram.
Facebook is blue, and Google is red. Notice how Google
has more values near the beginning (where the bins
represent data-collection allowances), and Facebook
has more values near the middle and end (where the
bins represent data-sharing allowances). The circled
blue bar on the far left of the graph represents the
category "Allows collection of data generated
on/with the website (such as game characters, scores,
application usage etc)". Since Facebook is a service
that largely revolves around content generation and
use of third-party applications, it's unsurprising that
there are many places in Facebook's data use policy
that refer to the ability to collect data
generated with use of the website.
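As a rough illustration of this comparison, the clause references listed in Appendix B can be tallied per provider. The sketch below is only an assumption-laden example, not the procedure used to draw the histogram: the counts are transcribed from the ten parallel collection and sharing rows of Appendix B (a bracketed number that wrapped onto its own line is assumed to belong to the preceding cell), and the name collection_share is hypothetical.

```python
# Clause-reference counts per allowance, transcribed from Appendix B.
# Each list follows the row order of the ten parallel allowances:
# PII, contacts, shared-by-others, profile, IP address, location,
# site-generated data, sensitive data, uploaded media, third-party data.
counts = {
    "Facebook": {
        "collection": [3, 4, 1, 1, 1, 2, 8, 0, 2, 2],
        "sharing": [6, 7, 3, 9, 1, 5, 4, 3, 5, 0],
    },
    "Google": {
        "collection": [5, 2, 0, 3, 3, 2, 3, 3, 2, 1],
        "sharing": [2, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    },
}


def collection_share(provider):
    """Fraction of a provider's collection + sharing references that concern collection."""
    collected = sum(counts[provider]["collection"])
    shared = sum(counts[provider]["sharing"])
    return collected / (collected + shared)


for name in counts:
    print(name, round(collection_share(name), 2))
```

Under these tallies, collection references make up a much larger share of Google's policy than of Facebook's, which is the complementary focus described above.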
The second trend is that the policies appear to focus
more on the companies' abilities than on the users'
abilities. This is evident in the large proportion of
concerning practices relative to reassuring practices
that each company's policy allows for. The reassuring
practices largely reflect users' abilities, such as the
ability to delete their data or the ability to opt in or
out of data collection/sharing, whereas the concerning
practices largely reflect abilities of the services, such
as the ability to collect data or share it with third
parties. The following pie charts illustrate the ratio of
concerning, neutral, and reassuring allowances
contained within the data use policies of the top three
SSO providers:
Figure 8: A display of those
allowances compared over the
three SSO Providers and grouped
according to the buckets they fall
under. An expanded version is
available in the appendix.
Another finding is that none of the top three SSO
providers' data use policies mention two allowances
that we deemed to be of interest to users. Those
allowances were "Allows sharing of data that third
parties share with the provider about you with third
parties" and "Notifies users when government requests
access to their data". Those categories were inspired by
the privacy policy analysis tools on privacychoice.org.
Figure 9: A series of pie charts
showing the breakdown of
concerning, neutral, and
reassuring allowances in their
respective policies.
Figure 10: A zoom on the
allowance chart bringing focus to
the lack of any policy addressing
those categories.
One final observation stood out: the only SSO provider
in the top three to mention single sign-on services
explicitly in its data use policy was Facebook. Google
and Twitter may have clauses that apply generally
enough to cover Single Sign-On usage, but they never
directly address SSO in their data use policies.
Conclusion
This form of analysis has its advantages and
disadvantages.
One of the most important drawbacks of our approach is
that it gives us no insight into how data is actually used
by these companies, only how data may be used.
Viewers of the results may be misled into thinking that
companies with higher counts for a certain
allowance engage more in the allowed activity.
The upshot is that very different documents may be
directly compared with a common metric. This is
potentially very helpful for anyone interested in
understanding and comparing data use policies, and
it gives us a framework to aggregate and compare
data pertaining to many companies at once. The data is
quantitative, so many quantitative analysis techniques
can be used to tease results out of the data. For
example, we could look for correlation between the
presence of one type of allowance and the presence of
another type of allowance within the policies if we had
enough data to perform the statistics confidently.
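A minimal sketch of such a correlation check follows, assuming each allowance is reduced to a 0/1 presence value per policy. The function name phi is hypothetical, and the presence vectors are read from Appendix B in the order Facebook, Google, Twitter. With only three policies the result carries no statistical weight; the example only shows the mechanics.

```python
from math import sqrt


def phi(x, y):
    """Pearson correlation of two equal-length 0/1 presence vectors.

    Returns None when either allowance is present in all policies or in
    none of them, because the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Presence (1) or absence (0) of an allowance in the Facebook, Google,
# and Twitter policies, read from the Data Sharing rows of Appendix B.
shares_ip_address = [1, 0, 1]
shares_location = [1, 0, 1]
shares_pii = [1, 1, 1]

print(phi(shares_ip_address, shares_location))  # the two vectors are identical
print(phi(shares_ip_address, shares_pii))       # undefined: PII sharing appears in every policy
```

Allowances that appear in every policy, or in none, cannot be correlated with anything, which is one reason a much larger set of policies would be needed.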
An issue with our study in particular is that we missed
some allowances that users may be interested in. For
example, the length of time companies hold on to user
data before deleting it is of concern to some people,
but we did not cover it. Other researchers may want to
look into gaps in our allowances, find out how
confidently we can equate terms across privacy
policies, and investigate whether or not there is a
correlation between the number of times a privacy
policy mentions an ability and the number of ways that
ability is used in practice.
Figure 11: Another observation of
a lack of a service addressing a
specific category.
Best Practices & Conclusion
At this point we have taken an in-depth look at the three
main aspects of SSO services: Integrators, Users, and
Providers. We have established an understanding of how
SSO Integrators utilize SSO systems in a practical
manner to provide better services for users, and of how
they treat users' data. Through our survey we have
learned more about the prevalence of SSOs in a typical
user's life and users' views on how values such as
privacy are treated. Finally, we took a close look at
the way SSO providers approach their services and how
data from users is treated. While the information
provided can be used to develop critical thoughts on
various aspects of SSOs and even internet usage, our
original goal was to serve users by informing them of
how they can better serve themselves when dealing
with their data online, based on our research into SSO
services.
BE MINDFUL OF THE VALUE OF YOU
We cannot stress enough the value that an individual
has, and in particular the value of their identifying
information. A trend that we have come to see is that
individuals tend not to value the privacy and security
of their data until something harms those values. We
urge users to take their online identity seriously to
avoid leaks of their data to undesired third parties.
STAY UP-TO-DATE
During our investigation we experienced a change to
the privacy and data use policies of Facebook, one of
our SSO Providers. While we adjusted our work, we
realized that it is incredibly important for users to stay
on top of changes to the privacy and data use policies
they agree to. While the changes that we encountered
for Facebook were minimal, it's very easy for services
to change their stance quickly. If anything is apparent
from the data we gathered, and more so from the
purpose of this paper, it is that these services don't
work to inform users of the details of their policies.
MANAGE THE ACCESS TO YOUR INFORMATION
Over extended periods of time, a user is likely to
establish many different connections between their SSO
Providers and various SSO Integrators. While some
may be valuable to the user in their day-to-day life,
others aren't necessary for the user to stay
connected with. We advise that for those services that
are used less often, it's useful to disconnect or shut
down accounts so that those services no longer have
active access to your data and you have one less
service to manage.
EVALUATE THE VALUE OF SERVICES USED
We investigated how three well-known services utilized
SSO systems and what they provided in terms of value
to users. While these services do indeed offer great
value and protection of user data, that is not the case
with others. Therefore we advise that you take time to
evaluate on your own whether or not a service you
intend to sign up with provides the right value for you,
and how it handles your information, by reading its
policies. In addition, you can take some time to
investigate online any violations those
services have had with regard to user data.
MANAGE DIFFERENT KINDS OF DATA
Although this should be fairly straightforward, it's
something that should always be kept in mind. The
purpose of this best practice is for you to keep in mind
what kinds of information you have available and to
whom. While information like your personal email and
your name may not be that important or sensitive,
having your address or social security
information shared around can be quite detrimental to
the security of your identity. Conduct a self-audit of what
potentially harmful information you can find about
yourself and work towards removing that data from
the internet as best you can.
While our investigations can be picked through for
further conclusions, we believe that we have established
a fair foundation for informing users about quite a few
aspects of their online identities. As points of authority
on the internet grow further and start showing up in
other aspects of our lives, having informed users is
paramount to the overall security of
individuals on the internet.
References
"Facebook Data Use Policy." Facebook. N.p., 8 June
2012. Web. 6 Dec. 2012.
"Groupon: Privacy Statement." Privacy Statement.
Groupon, 13 Sept. 2012. Web. 5 Dec. 2012.
Stack Exchange, Inc. Official Privacy Policy. Stack
Exchange, Inc., 28 June 2012. Web. 5 Dec. 2012.
"Which Identities Are We Using to Sign in Around the
Web?" Gigya. N.p., n.d. Web. 6 Dec. 2012.
"Wolfram|Alpha Privacy Policy." Wolfram|Alpha.
Wolfram|Alpha, 5 Mar. 2009. Web. 5 Dec. 2012.
"Your Privacy. Simplified." PrivacyChoice. N.p., n.d.
Web. 6 Dec. 2012. http://www.privacychoice.org/
Appendix
Appendix A
This is an expanded list of the categories on which we compared the SSO Providers with each other. They are listed by the
bucket they fall into and are coded by whether they are Concerning, Reassuring, or Neutral.
Concerning
Reassuring
Neutral
Data Collection
Allows collection of personally identifiable information (name, birthday, address, phone, email,
gender)
Allows collection of information about contacts/friends
Allows collection of information others have shared about you
Allows collection of profile information (such as user ID, personal description, likes, interests, etc)
Allows collection of IP address
Allows collection of location data
Allows collection of data generated on/with the website (such as game characters, scores, application
usage etc)
Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information)
Allows collection of uploaded media (images, video, text, etc)
Allows collection of data that third parties share with the provider about you
Data Sharing
Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender)
Allows sharing of information about contacts/friends
Allows sharing of information others have shared about you
Allows sharing of profile information (such as user ID, personal description, likes, interests, etc)
Allows sharing of IP address
Allows sharing of location data
Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc)
Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information)
Allows sharing of uploaded media (images, video, text, etc)
Allows sharing of data that third parties share with the provider about you
Allows sharing of untagged data (unassociated with users' profiles) with third parties
Requires that receivers of data follow certain guidelines/rules
Notifies users when government requests access to their data
Ad Targeting
Uses data to target users with advertisements (but does not share that data with advertisers)
User Control
Allows for opt-out and opt-in for data collection/sharing
Allows users to delete their data
SSO
Specifically mentions single sign on in the data use/privacy policy
Appendix B
The bracketed numbers in the provider columns represent locations in the corresponding privacy policies (included with the appendix of this document with item numbers: _____) in which the specific clauses related to a data collection practice exist. To view specific clauses, reference the privacy policy and look for a number highlighted in yellow with the value of interest. The text that comes after the number is the clause referenced.
Allowances | Provider 1 | Provider 2 | Provider 3
Data collection | Facebook | Google | Twitter
Allows collection of personally identifiable information (name, birthday, address, phone, email, gender) | [1][13][16] | [1][2][3][4][6] | [1][3][7][12][13][15]
Allows collection of information about contacts/friends | [3][15][24][25] | [2][3] | [11][13][15]
Allows collection of information others have shared about you | [24] | None | None
Allows collection of profile information (such as user ID, personal description, likes, interests, etc) | [12] | [1][2][3] | [2][5][13][16]
Allows collection of IP address | [8] | [3][4][7] | [21]
Allows collection of location data | [8][11] | [2][9] | [6][15][17][19]
Allows collection of data generated on/with the website (such as game characters, scores, application usage etc) | [2][3][6][7][9][11][19][24] | [3][5][10] | [14][15][16]
Allows collection of browsing history/health history/religion/political orientation (Potentially sensitive information) | None | [3][5][10] | [20]
Allows collection of uploaded media (images, video, text, etc) | [5][14] | [3][2] | [7]
Allows collection of data that third parties share with the provider about you | [10][37] | [3] | [13][22]
Data Sharing
Allows sharing of personally identifiable information (name, birthday, address, phone, email, gender) with third parties | [13][19][29][30][36][44] | [2][16] | [9][25][26]
Allows sharing of information about contacts/friends with third parties | [16][19][29][30][36][38][44] | None | [15][25][26]
Allows other users to share information about you with third parties | [24][32][33] | None | [15]
Allows sharing of profile information (such as user ID, personal description, likes, interests, etc) with third parties | [15][19][26][29][30][36][38][41][44] | [2] | [4][9][22][25][26]
Allows sharing of IP address with third parties | [44] | None | [24][25][26]
Allows sharing of location data with third parties | [19][28][29][30][44] | None | [15][17][19][25][26]
Allows sharing of data generated on/with the website (such as game characters, scores, application usage etc) with third parties | [16][29][30][44] | [2] | [9][14][15][25][26]
Allows sharing of browsing history/health history/religion/political orientation (Potentially sensitive information) with third parties | [29][30][44] | [16] | [25][26]
Allows sharing of uploaded media (images, video, text, etc) with third parties | [15][19][29][30][44] | None | [14][15][25][26]
Allows sharing of data that third parties share with the provider about you with third parties | None | None | None
Allows sharing of untagged data (unassociated with users' profiles) with third parties | [12][42][45] | [18] | [29]
Requires that receivers of data follow certain guidelines/rules | [45][46] | None | [29]
Notifies users when government requests access to their data | None | None | None
Advertisements
Targets users with specific advertisements (but does not share that data with advertisers) | [4][43] | [11] | [20]
User Control
Allows for opt-out or opt-in for data collection and sharing | [20][22][27][34][39] | [12][13] | [10][18][23]
Allows users to delete their data | [21][23][31][40] | [14][15] | [30][31]
SSO
Specifically mentions single sign on in the data use policy | [35][36] | None | None
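The bracketed clause references in the table above can be turned into per-allowance counts of the kind plotted in the histogram. The following is a minimal sketch under the assumption that each cell is either a run of bracketed numbers or the word None; the function name count_clauses is hypothetical, and the sample row is the first Data collection row of the table.

```python
import re


def count_clauses(cell):
    """Return how many bracketed clause references a table cell contains."""
    return len(re.findall(r"\[\d+\]", cell))


# First Data collection row: personally identifiable information.
row = {
    "Facebook": "[1][13][16]",
    "Google": "[1][2][3][4][6]",
    "Twitter": "[1][3][7][12][13][15]",
}

tallies = {provider: count_clauses(cell) for provider, cell in row.items()}
print(tallies)
```

A cell containing None has no bracketed numbers, so it counts as zero.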
Appendix C
Appendix D
SSO Survey for Users
1) Are you male or female?
   Male / Female
2) What age group do you fall under?
   17 and under / 18-21 / 22-25 / 26-30 / 31-40 / 41-50 / 51-60 / 61-70 / 71 and over
3) What country do you live in? ______________________
4) How many hours a week do you spend on a computer? ________
5) How many hours a week do you spend on the internet? ________
6) What percent of the time do you use the internet for personal and business uses? (Your responses should sum to 100.)
   ______% Personal
   ______% Business
7) Please estimate the number of hours you spend per week on the following services:
   Services | Number of Hours
   Email | ________
   Facebook | ________
   Twitter | ________
   Google Account | ________
   Other: __________ | ________
   Other: __________ | ________
   Other: __________ | ________
8) What are your primary uses for the internet?
   Shopping / Research / Communication / News / Other (please list) __________________
9) Has your privacy ever been violated on the Internet?
   Yes / No
10) If yes, please briefly describe the most recent time that your privacy was violated.
   ___________________________________________________________________
11) Do you ever worry that your privacy might be violated in the future? Please mark the scale from 1-5.
   1 - Not worried at all / 2 - Somewhat worried / 3 - Neutral / 4 - Worried / 5 - Extremely worried
12) Please briefly describe a situation where your privacy might be violated online.
13) Are you familiar with Single Sign-On Services (SSO)? (For example: Facebook Connect or Google Accounts)
   Yes / No
14) Do you use Single Sign-On services?
   Yes / No
15) If yes, which SSO do you use?
   Facebook Connect / Google Account / Twitter / Other (please list) ___________________
16) If yes, why do you use Single Sign-On service?
   __________________________________________________________________________
17) Please describe how you think Single Sign-On services work.
   __________________________________________________________________________
18) Do you have any privacy or security concerns related to your use of Single Sign-On services?
   Yes / No
19) Do you have multiple identities online?
   Yes / No
20) How many email addresses do you have? ______
21) Do you typically link your payment/credit card information to your personal identity online?
   Yes / No
22) How do you typically pay for things you purchase online?
   Direct credit card / PayPal / Google Wallet / Other (please list) _____________
23) Where did you hear about this survey?
   Facebook / Reddit / Search Engine / Other (please list) _____________
Appendix E
For access to other pertinent data points please reference:
Data | Link
RAW Survey Data | https://docs.google.com/spreadsheet/ccc?key=0AqKTq25pswcgdG1ZTldvRVE3TnNMdEg3M0IyamNNSlE
Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMcW5mclFmR3ZKb2M
Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMMVYxSVZ6MW10SEk
Policy (Codified) | https://docs.google.com/open?id=0B6ANIPyq21eMR00xT2tfcFVid2c