1
Technical Issues Concerning The Use Of Personal Data On The Internet
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
Email: [email protected]
URL: http://www.ukoln.ac.uk/
UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC’s Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.
2
Contents
About UK Web Focus
Personal Information and the Internet
• End User issues
• Information Provider issues
• System Administrator issues
• Management issues
Solutions
• Technical
• Protocol developments
• Organisational
Conclusions
3
About UK Web Focus
UK Web Focus:
• Three year post funded by JISC
• Provides advice and support to the UK HE community on Web matters
• Activities:
– Monitoring web developments
– Talks and presentations (e.g. Technical Threats to Copyright and IPR at Talisman seminar on Legal Risks on the Internet in January 1998)
– Representing JISC on World Wide Web Consortium (W3C)
– Other related activities
4
Personal Data and The Internet
What are the issues peculiar to the Internet?
End Users
• Junk email
• Big brother
• Searching
Information Providers
• Ease of use
• Ability to reuse data
Systems Managers
• Log files
• Preventing misuse
Management
• Liability
• Central policies vs. departmental action
• Student use
• Confidentiality
What Else?
5
End User
What are the privacy implications for end users of Internet services:
• A user of email or Usenet News
• A student who uses a public access PC to access Web resources
• A member of staff who uses a PC in his office
6
Mailing Lists and Usenet
Mailing Lists
Mailbase, for example, provides search facilities for finding:
• Membership of lists
• Details of postings
http://www.mailbase.ac.uk/search.html
Usenet News
Usenet News articles:
• Are archived
• Can be searched
http://www.altavista.digital.com/
7
Institutional Mailing Lists
• Many institutions use the HyperMail software to archive internal mailing lists
• Robot software can index these archives.
Using ACDC to search for "Brian Kelly" reveals contributions to mailing lists.
8
UK Directory Services
Many Universities run an X.500 directory service
• X.500 is a distributed directory protocol
• Originally dedicated clients were used to access X.500
• Now it's much easier using the Web
http://www.brunel.ac.uk/x500/search-form-gb.html
9
Finding People
Various other directory searching services are available:
• Whowhere <URL: http://www.whowhere.com/>
• BigFoot <URL: http://www.bigfoot.com/>
• IAF (Internet Address Find) <URL: http://www.iaf.net/>
Advertising revenue can make these a commercial proposition
10
Ahoy!
Ahoy! is a research project which uses AI techniques to find (a small number of) personal home pages
AI techniques will make it easier to find personal information
http://www.ahoy.cs.washington.edu:6060/
11
Web Browsers and Privacy
Client Caches
Web browsers store viewed resources in a local cache (on a hard disk or a network drive).
These resources can be re-used.
Potentially these files could be accessed by other users of the PC or by a system administrator.
12
Web Browsers and Privacy
Cookies
Cookies enable information to be stored on your local PC which can be reused by the remote server.
Cookies are useful in applications such as "shopping baskets", CBL, etc.
However, there are privacy implications, since cookies can be used to record paths through a website.
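The path-recording concern can be sketched in a few lines of Python using the standard `http.cookies` module. This is an illustrative sketch only: the cookie name `visitor`, the function `handle_request` and the example paths are assumptions made for this example, not part of any real service.

```python
from collections import defaultdict
from http.cookies import SimpleCookie
import itertools

# Hypothetical server-side state: one list of visited paths per visitor id.
_next_id = itertools.count(1)
paths_by_visitor = defaultdict(list)

def handle_request(path, cookie_header=""):
    """Record the requested path against the visitor's cookie.

    Returns the Set-Cookie value to send to a new visitor, or None
    if the browser already presented a 'visitor' cookie.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "visitor" in cookie:
        visitor = cookie["visitor"].value
        set_cookie = None
    else:
        visitor = str(next(_next_id))
        out = SimpleCookie()
        out["visitor"] = visitor
        set_cookie = out["visitor"].OutputString()
    paths_by_visitor[visitor].append(path)
    return set_cookie
```

Once the browser returns the cookie on later requests, the server can link every page view to the same visitor without knowing who that visitor is.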
13
Information Providers
What personal information is provided on the web?
Corporate Information
Individual /Societies
14
Changing Context
Technologies such as Frames can change the context of resources on the web by:
• Pointing to text
• Pointing to graphics
There has reportedly been a "Babes on the Web" page. Document held remotely
15
Web Forms
Web forms are now trivial to set up:
• Save time and effort
• Information may be reused easily
Are information providers aware of the implications of reusing information?
16
System Administrators
System Administrators can:
• Read incoming and outgoing messages and Usenet postings
• Analyse cache log files to find popular websites - and potentially who's been accessing them
• Deny access to specified websites
• Publish statistics on hits to pages
17
Web Statistics
Many web administrators publish their web statistics:
• Access by country
• Access by domain name
• Most popular pages
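Statistics of this kind are typically derived from the web server's access log. A minimal Python sketch follows, assuming log lines in the Common Log Format; the function name `summarise` and the "(unresolved)" label for numeric addresses are choices made for this example.

```python
import re
from collections import Counter

# Common Log Format: host ident user [date] "METHOD path protocol" status bytes
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def summarise(lines):
    """Count requests per page and per top-level domain of the client."""
    pages, domains = Counter(), Counter()
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        pages[match["path"]] += 1
        host = match["host"]
        if host.replace(".", "").isdigit():
            # Numeric IP address: no domain name available
            domains["(unresolved)"] += 1
        else:
            domains[host.rsplit(".", 1)[-1].lower()] += 1
    return pages, domains
```

The same log also holds the full host name of every client, which is why published summaries are usually aggregated rather than listed per machine.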
18
Restricting Access
It is possible to restrict access to sites containing dubious content
It is also possible to record the user's email address and take action if persistent access is attempted
Is this:
• Sensible action?
• Breach of privacy?
19
Solutions
There are a variety of solutions to the issues concerned with Personal Data and the Internet:
• Don't use the Internet
• Information providers' "tricks"
• System administrators' "tricks"
• Protocol Developments
• Auditing
Education is important throughout
20
Solutions - Denying Access
• Information published on the web can be easily processed by robots (an Alta Vista search for "Brian Kelly" gives 2,800 hits)
• Well-behaved robots can be prevented from accessing resources using the Robot Exclusion Protocol (REP) (robots.txt file):
User-agent: *
Disallow: /stats/
But:
• Not widely used: ~30% of UK universities
• Not easily scalable (single file at web root)
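How a well-behaved robot applies such a file can be sketched with Python's standard `urllib.robotparser` module. The robot name `ExampleBot` and the host `www.example.org` are placeholders chosen for this example; the rules are the two lines shown on the slide.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the slide: keep all robots out of /stats/
ROBOTS_TXT = """\
User-agent: *
Disallow: /stats/
"""

def allowed(url, agent="ExampleBot"):
    """Return True if the rules above permit this robot to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)
```

The check is voluntary: it only restrains robots that choose to consult the file, which is why the slide says "well-behaved".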
21
Solutions - For Info Providers
• REP implemented by system administrator
• Possible (but not easy?) to create master robots.txt file by merging departmental ones
• HTML 4.0 <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> element enables individual files to contain robot directives (new and not yet widely supported)
• Since robots tend not to follow links to CGI programs, information could be hidden behind a form button (not elegant)
Conference Details
Participants
Campus map
<FORM ACTION="part.html"><INPUT TYPE="submit" VALUE="..."></FORM>
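One way the merging of departmental files might be approached is sketched below in Python. It rests on several assumptions that are not in the original: each department supplies its rules as text keyed by its URL prefix, paths in a departmental file are relative to that prefix, only `Disallow` lines are carried over, and every rule is treated as applying to all robots. The function name `merge_robots` is hypothetical.

```python
def merge_robots(departmental):
    """Build a master robots.txt from departmental rule files.

    `departmental` maps a URL prefix such as '/physics' to the text of
    that department's rules. Simplification: per-robot sections are
    ignored and every Disallow line is applied to all robots.
    """
    rules = []
    for prefix, text in sorted(departmental.items()):
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            field, _, value = line.partition(":")
            if field.strip().lower() == "disallow" and value.strip():
                path = prefix.rstrip("/") + "/" + value.strip().lstrip("/")
                rules.append("Disallow: " + path)
    return "\n".join(["User-agent: *"] + rules) + "\n"
```

Even in this simplified form the master file has to be regenerated whenever a department changes its rules, which illustrates the scaling problem of a single file at the web root.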
22
Preventing Misuse
There are technical ways of:
• Preventing resources from being used in frames
• Preventing images from being "stolen"
Solutions are being considered mainly for copyright protection
However such solutions aren't widely deployed as:
• They may prevent the resource from being reused in valid ways
• No user / political pressure?
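The slide does not name a specific technique. One commonly used approach, shown here purely as an illustration, is for the server to inspect the HTTP `Referer` header before serving an image. The function name `may_serve` and the host `www.example.ac.uk` are assumptions for this sketch.

```python
from urllib.parse import urlsplit

def may_serve(referer, own_hosts=frozenset({"www.example.ac.uk"})):
    """Decide whether to serve an image based on the Referer header.

    An empty Referer is accepted, since many browsers and proxies omit
    the header; otherwise the referring page must be on one of our hosts.
    """
    if not referer:
        return True
    return urlsplit(referer).hostname in own_hosts
```

The trade-off matches the slide's caveat: a check like this also blocks legitimate reuse, such as a colleague's page that links to the image with permission.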
23
Political Developments
Global Information Networks
• European Conference in Bonn in June 1997
• Raised issues of:
– Data protection
– Technological solutions
• See <URL: http://www2.echo.lu/bonn/conference.html>
24
W3C Response
World Wide Web Consortium (W3C) responded to Bonn paper:
• Summarised technological solutions:
– DSig: a web of trust
– PICS: content selection without censorship
– P3P: privacy project
– IPR: intellectual property rights
• See <URL: http://www.w3.org/TR/NOTE-eu-conf-970711>
25
DSig
DSig:
• W3C's Digital Signature Initiative
• Helps users to decide who to trust
• Based on digitally signed assertions:
"This web page comes from Bath University Courses office and gives a legally binding list of courses"
• See <URL: http://www.w3.org/Security/DSig/Activity.html>
26
PICS
PICS:
• Platform for Internet Content Selection
• Mechanism for rating web pages, e.g. X, A, PG, U
• Decision to accept resource made by end user (or end user organisation)
• Choice devolved - no censorship of originating resource
• See <URL: http://www.w3.org/PICS/>
27
IPR
W3C's IPR activity:
• Intellectual Property Rights and the Web:
– Does use of a cache infringe copyright?
– Can links to resources be made freely?
– …
• Asks the contentious question: Does the nature of the technology require us to change the legal understanding or status of copyright as it stands now?
• See <URL: http://www.w3.org/IPR/Activity.html>
28
P3P
P3P:
• Platform for Privacy Preferences
• Will develop a specification and demonstration of a way for Web sites and users to express privacy practices and preferences
• Architecture and grammar work complete (Oct 1997)
• See <URL: http://www.w3.org/Privacy/Activity.html>
29
P3P Deliverables
General Overview of the P3P Architecture
• Document describes the P3P model
Grammatical Model
• Grammar and vocabulary for machine-readable statements:
Data Categories: e.g. name, email, ...
Practices: Use: e.g. system admin, research, customisation
Transfer: divulge information within organisation
Release: divulge info to other organisation
Access: ability of data subject to view information
See <URL: http://www.w3.org/TR/NOTE-IPWG-Practices.html>
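The idea of matching a site's stated practices against a user's preferences can be illustrated with a small Python sketch. This is not P3P syntax and not taken from the W3C documents: the class names `Practice` and `Limit`, their fields and the matching rule are all hypothetical, loosely modelled on the vocabulary terms listed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Practice:
    """A site's statement about one category of data (illustrative only)."""
    category: str   # data category, e.g. "name", "email"
    use: str        # e.g. "system admin", "research", "customisation"
    release: bool   # True if divulged to other organisations

@dataclass(frozen=True)
class Limit:
    """What a user is prepared to accept for one category of data."""
    uses: frozenset
    allow_release: bool

def acceptable(practices, limits):
    """True if every stated practice falls within the user's limits."""
    for practice in practices:
        limit = limits.get(practice.category)
        if limit is None or practice.use not in limit.uses:
            return False
        if practice.release and not limit.allow_release:
            return False
    return True
```

With machine-readable statements of this kind, a browser could make the comparison automatically instead of asking the user to read each site's policy.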
30
JTAP Calls
Digital Signatures
Studies to identify appropriate protocols and to test deployment. Seeking to fund an overview report and a technology deployment pilot
Certificate Based Infrastructure Services
Technical overview and pilot. Seeking to fund an overview and technology watch project at a cost of £25,000, followed by one or two deployment pilots
Work to start in Dec 1998
See <URL: http://www.jtap.ac.uk/bid/c14_98.html>
31
Privacy Services
TRUSTe:
• An "independent, non-profit, privacy initiative dedicated to building users' trust .. on the Internet"
• TRUSTe sites agree to:
– Maintain an approved Privacy Statement
– Explain information gathering practices:
– What personal information will be used for
– Whether information will be disclosed
– Display the TRUSTe Mark
• TRUSTe will periodically check conformance
• See <URL: http://www.etrust.org/>
32
What's Happening in UK?
A number of universities have provided guidelines governing Internet use:
• Data Protection
• Computer Misuse
• ..
But:
• Is work being duplicated?
• Is it still relevant?
http://www.cam.ac.uk/CS/DPA.html
33
What's Needed? Auditing Software
WebWatch
• Project based at UKOLN
• Monitors web technologies (not content)
• Potential for auditing robots.txt files?
Do we want software for auditing at a national or institutional level?
Can we follow the TRUSTe model?
34
What's Needed? Catalogue of Guidelines
A catalogue of UK HE web resources is being produced:
• Uses ROADS (cf. SOSIG, OMNI, etc.)
• Various categories planned:
– AUP
– Guidelines for authors
– Local search engines
• Feedback welcome
35
What's Needed? Education
Need for education for:
• End users
• Information providers
• System administrators
• Managers
Who provides training materials?
Who delivers the training?
36
Conclusions
• Widespread use of the Internet / ease of publishing has increased privacy concerns
• Need for education and awareness:
– End users
– Information providers
– System administrators (central & departmental)
• Do we want a system like TRUSTe?
• Need for auditing tools locally / nationally?
• Need to share experiences
• Need to be aware of (implement?) technical solutions