1
Accessing Multiple Resources via Z39.50
Paul Miller
Interoperability Focus
UK Office for Library & Information Networking (UKOLN)
[email protected] http://www.ukoln.ac.uk/
UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union.
UKOLN also receives support from the Universities of Bath and Hull where staff are based.
2
Outline
• What is Z39.50?
• Some gory details
– Attribute Sets, Profiles, and all…
• Maintenance and development
• What’s wrong with Z39.50?
• The Bath Profile
• The New Attribute Architecture
• How it’s used
• Tools, registries, etc.
See http://www.ariadne.ac.uk/issue21/z3950/
3
What is Z39.50?
• ANSI/NISO Z39.50–1995, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification
• ISO 23950:1998, Information and Documentation — Information Retrieval (Z39.50) — Application Service Definition and Protocol Specification.
See http://lcweb.loc.gov/z3950/agency/1995doce.html
4
What is Z39.50?
“This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a ‘help’ facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.”
(Z39.50–1995, page 0).
See http://lcweb.loc.gov/z3950/agency/1995doce.html
5
Some gory details…
• Z39.50 follows client/server model
• But calls them Origin and Target
Client/origin
Server/target
6
Client/Server architecture
7
Client/Server architecture
8
Some gory details…
• Z39.50–1995 is divided into eleven ‘Facilities’
  – Initialization
  – Search
  – Retrieval
  – Result–set–delete
  – Browse
  – Sort
  – Access Control
  – Accounting
  – Explain
  – Extended Services
  – Termination.
See http://www.ariadne.ac.uk/issue21/z3950/
9
Facilities and Services
• Each Facility comprises at least one Service
• A Service facilitates a particular interaction between Origin and Target
• The three key services are Init, Search, and Present.
See http://www.ariadne.ac.uk/issue21/z3950/
10
Init
• The only Service of the Initialization Facility
• Origin–initiated
• Used to start a ‘Z–association’
• Origin requests a number of parameters under which the searches will be conducted
• Target responds, either accepting offered parameters or proposing others if necessary.
11
Search
• The only Service of the Search Facility
• Origin–initiated
• Used to actually conduct a search
• Origin specifies databases to be searched, attribute combinations, and query
• Target responds, identifying the number of matching results.
12
Present
• Main Service of the Retrieval Facility (along with Segment)
• Origin–initiated
  • although Target can initiate a Segment request if the result set is very large
• Used to return records to the user.
13
Init for dummies
Origin: Hello. Do you speak English?
Target: Hello. Yes, I do. Let’s talk.
14
Search for dummies
Origin: Cool. Can I have anything you’ve got on a place called “Bristol”?
Target: I’ve got 25 records matching your request, and here are the first five. As you didn’t specify anything else, I’ve sent them to you in MARC, so I hope that’s OK.
15
Present for dummies
Origin: 25, eh? Can I have the first ten, please? Oh, and I really don’t like MARC. If you can send Dublin Core that would be great, and if not I’ll settle for some SUTRS.
Target:
DC:Creator – blah
DC:Title – blah
…
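The three conversations above can be sketched as a toy exchange in Python. This is only an illustration of the Init, Search and Present sequence: the class, method names and sample records are hypothetical, everything runs in memory, and none of the real Z39.50 wire encoding is modelled.

```python
from dataclasses import dataclass, field

# Hypothetical sample data; a real Target would sit in front of a database.
RECORDS = [
    {"title": "Bristol: a history", "creator": "A. Author"},
    {"title": "Bath and Bristol by rail", "creator": "B. Writer"},
    {"title": "Roman Bath", "creator": "C. Scholar"},
]


@dataclass
class ToyTarget:
    """Stands in for a Z39.50 Target. Not a protocol implementation."""
    supported_syntaxes: tuple = ("MARC", "SUTRS")
    result_sets: dict = field(default_factory=dict)

    def init(self, proposed_version):
        # Accept the offer, or propose the highest version this Target speaks.
        return {"accepted": True, "version": min(proposed_version, 3)}

    def search(self, result_set_name, term):
        # Keep the hits on the Target and report only how many matched.
        hits = [r for r in RECORDS if term.lower() in r["title"].lower()]
        self.result_sets[result_set_name] = hits
        return len(hits)

    def present(self, result_set_name, start, count, preferred_syntaxes):
        # Pick the first syntax the Origin prefers that the Target supports.
        syntax = next(
            (s for s in preferred_syntaxes if s in self.supported_syntaxes), None
        )
        if syntax is None:
            raise ValueError("no mutually supported record syntax")
        # Result set positions are counted from 1, as in the dialogue above.
        records = self.result_sets[result_set_name][start - 1:start - 1 + count]
        return syntax, records


target = ToyTarget()
session = target.init(proposed_version=3)
hit_count = target.search("default", "Bristol")
syntax, records = target.present("default", 1, 10, ["DublinCore", "SUTRS"])
print(hit_count, syntax, [r["title"] for r in records])
```

Here the toy Target cannot supply Dublin Core, so the Origin's second choice, SUTRS, is used, mirroring the "if not I'll settle for some SUTRS" request.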
16
Now it gets hairy…
• To communicate successfully, Origin and Target need to use the same Attribute Set.
• An Attribute Set like Bib–1 defines six forms of Attribute:
  – Use
  – Relation
  – Truncation
  – Completeness
  – Position
  – Structure.
17
Use Attributes
• Define the ‘access points’ on which a search takes place
  • Title, author, subject, etc.
See http://lcweb.loc.gov/z3950/agency/defns/bib1.html
18
Relation Attributes
• Define the relationship between the search term and values stored in the database/index
  • Less than, greater than, equal to, phonetically matched, etc.
19
Truncation Attributes
• Define which part of the stored value is to be searched on
  • Beginning of any word, end of any word, etc.
  • With right truncation, ‘Smith’ finds ‘Smithsonian’ and not ‘Wordsmith’; with left truncation, the reverse.
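A minimal sketch of the ‘Smith’ example in Python, assuming that right truncation means the search term matches the beginning of the stored value and left truncation means it matches the end. The function name and the string labels are hypothetical; a real Target applies this inside its index, not by comparing strings one at a time.

```python
def truncation_match(term, stored, truncation):
    """Return True if `term` matches `stored` under the given truncation."""
    term, stored = term.lower(), stored.lower()
    if truncation == "right":
        # Anything may follow the term: 'smith' matches 'smithsonian'.
        return stored.startswith(term)
    if truncation == "left":
        # Anything may precede the term: 'smith' matches 'wordsmith'.
        return stored.endswith(term)
    if truncation == "none":
        # No truncation: the stored value must equal the term.
        return stored == term
    raise ValueError(f"unknown truncation: {truncation}")


for stored in ("Smithsonian", "Wordsmith"):
    print(stored,
          truncation_match("Smith", stored, "right"),
          truncation_match("Smith", stored, "left"))
```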
20
Completeness Attributes
• Define how much of the stored index term must be in the search term
  • ‘Smith’ finds ‘Smith’, but not ‘Smithsonian’ or ‘the Smith’, etc.
21
Position Attributes
• Define where in the index the search term should be located
  • At the start of the field, anywhere, etc.
22
Structure Attributes
• Specify the form to be searched for
  • Word, phrase, date, etc.
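To show how the six Bib–1 attribute types combine in a single query term, the sketch below builds a string in Prefix Query Format (PQF). PQF is not mentioned in these slides; it is the textual notation used by the YAZ toolkit and is assumed here purely as a convenient way to write attribute combinations down. The function name and its defaults are hypothetical, and the dictionaries hold only a small subset of the Bib–1 values.

```python
# A small subset of Bib-1 values; the Bib-1 definition has the full lists.
USE = {"title": 4, "subject": 21, "author": 1003, "any": 1016}
RELATION = {"less_than": 1, "equal": 3, "greater_than": 5, "phonetic": 100}
POSITION = {"first_in_field": 1, "any_position": 3}
STRUCTURE = {"phrase": 1, "word": 2, "date": 5}
TRUNCATION = {"right": 1, "left": 2, "none": 100}
COMPLETENESS = {"incomplete_subfield": 1, "complete_subfield": 2, "complete_field": 3}


def to_pqf(term, use,
           relation=RELATION["equal"],
           position=POSITION["any_position"],
           structure=STRUCTURE["word"],
           truncation=TRUNCATION["none"],
           completeness=COMPLETENESS["incomplete_subfield"]):
    """Spell out all six attributes for one term (defaults are illustrative)."""
    # Bib-1 numbers the attribute types 1..6 in this order:
    # Use, Relation, Position, Structure, Truncation, Completeness.
    values = (use, relation, position, structure, truncation, completeness)
    attrs = " ".join(
        f"@attr {number}={value}" for number, value in enumerate(values, start=1)
    )
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{attrs} "{escaped}"'


title_query = to_pqf("Bristol", use=USE["title"])
print(title_query)
# @attr 1=4 @attr 2=3 @attr 3=3 @attr 4=2 @attr 5=100 @attr 6=1 "Bristol"
```

Writing every attribute explicitly, rather than relying on whatever a Target assumes when one is missing, is the design choice illustrated here.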
23
Record Syntaxes
• Record Syntaxes define the structure in which results are returned to the Origin.
  • This does not mean that Targets need to store data in these formats
• MARC
  • UKMARC, USMARC/MARC21, DANMARC, MARB, UNIMARC…
• SUTRS
  • Simple Unstructured Text Record Syntax
• GRS–1
  • Generic Record Syntax
• XML.
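The point that a Target need not store its data in the record syntax it returns can be sketched as follows. The internal record layout, the function names and the XML wrapper element are all assumptions made for illustration; only the Dublin Core namespace URI is a real identifier, and the plain-text output merely imitates the spirit of SUTRS.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical internal storage format, unrelated to any record syntax.
INTERNAL_RECORD = {"title": "Bristol: a history", "creator": "A. Author"}


def to_sutrs(record):
    """Render a record as simple unstructured text, one field per line."""
    return "\n".join(f"{name.capitalize()}: {value}" for name, value in record.items())


def to_dc_xml(record):
    """Render a record as Dublin Core elements inside an assumed <record> wrapper."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, value in record.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"SUTRS": to_sutrs, "XML": to_dc_xml}


def render(record, syntax):
    """Convert the stored record into the syntax the Origin asked for."""
    if syntax not in RENDERERS:
        raise ValueError(f"unsupported record syntax: {syntax}")
    return RENDERERS[syntax](record)


print(render(INTERNAL_RECORD, "SUTRS"))
print(render(INTERNAL_RECORD, "XML"))
```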
24
Profiles
• Groupings of Attribute Sets, Record Syntaxes, etc. to meet specific needs
• Disciplinary
  – Cultural Heritage (CIMI)
  – Geospatial (GEO)
• Geographic/Cultural/National
  – Texas Profile
  – OPAC Network for Europe (ONE)
  – Conference of European National Librarians (CENL)
• Functional
  – Collections Profile
• Etc.
25
Maintenance and Development
• Z39.50 Maintenance Agency
  • Based at Library of Congress, and officially responsible for upkeep of the standard
• ZIG
  • Z39.50 Implementor’s Group
  • Informal grouping of vendors, users and implementors who work to progress new areas of the standard
  • Next meeting in Texas in January
  • Likely to be at UKOLN in 2001.
See http://www.loc.gov/z3950/agency/
26
What’s wrong with Z39.50?
• Profiles for each discipline
• Defeats interoperability?
• Vendor interpretation of the standard
• Bib–1 bloat
• Largely invisible to the user
• Seen as complicated, expensive and old–fashioned
• Surely no match for XML/RDF/whatever.
27
The Bath Profile
• System vendors implement areas of the Z39.50 standard differently
• Regional, National, and disciplinary Profiles have appeared over previous years, many of which have basic functions in common
• Users wish to search across national/regional boundaries, and between vendors.
See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/
28
Learning from the past
• The Bath Profile is heavily influenced by
  • ATS–1
  • CENL
  • DanZIG
  • MODELS
  • ONE
  • Z Texas
  • vCUC
See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/
29
Learning from the past
See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/
30
Doing the work
• ZIP–PIZ–L mailing list, hosted by National Library of Canada
• Meeting face–to–face
  • The UK’s Joint Information Systems Committee (JISC) supported a face–to–face meeting in Bath over the summer
• A draft, being widely circulated for comment.
See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/
31
What we proposed
• Minimisation of ‘defaults’
  • Where possible, every attribute is defined in the Profile (Use, Relation, Position, Structure, Truncation, Completeness)
• Three Functional Areas
  • Basic Bibliographic Search & Retrieval
  • Bibliographic Holdings Search & Retrieval
  • Cross–Domain Search & Retrieval
• Three or more Levels of Conformance in each Area.
See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/
32
What we proposed
• SUTRS and one of UNIMARC or MARC21 for Bibliographic Search results
  • Or all three at Level 1?
• SUTRS and Dublin Core (in XML) for Cross–Domain results
• Other record syntaxes also permitted, but conformant tools must support at least these.
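The minimum record syntax requirements proposed above can be expressed as a small check. This is a sketch of the proposal as worded on this slide only, not of the published Profile: the area labels, the syntax labels and the function name are hypothetical, and the open question about requiring all three syntaxes at Level 1 is deliberately not modelled.

```python
def meets_minimum_syntaxes(area, supported):
    """Check a tool's record syntaxes against the minimum proposed for an area."""
    supported = set(supported)
    if area == "bibliographic":
        # SUTRS plus at least one of UNIMARC or MARC21.
        return "SUTRS" in supported and bool(supported & {"UNIMARC", "MARC21"})
    if area == "cross-domain":
        # SUTRS plus Dublin Core expressed in XML.
        return {"SUTRS", "DC-XML"} <= supported
    raise ValueError(f"unknown functional area: {area}")


# Other syntaxes are permitted, so extras never cause a failure.
print(meets_minimum_syntaxes("bibliographic", ["SUTRS", "MARC21", "GRS-1"]))
print(meets_minimum_syntaxes("cross-domain", ["SUTRS", "MARC21"]))
```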
33
The new Attribute Architecture
• Recognition of existing problems
• Probably 2–3 years away in mainstream implementations?
• Deals with Bib–1 bloat by identifying key attributes of value to multiple applications, and grouping them together
  – Utility Attribute Set (description of records)
  – Cross–Domain Attribute Set (description of resources, and closely related to Dublin Core element set)
  – Bib–2
  etc.
34
The new Attribute Architecture
New Attribute Type          Relation to Bib–1 Attributes
Access Point                Use
Semantic Qualifier          new
Language                    new
Content Authority           new
Expansion/Interpretation    Truncation and some of Relation
Normalized Weight           new
35
The new Attribute Architecture
New Attribute Type          Relation to Bib–1 Attributes
Hit Count                   new
Comparison                  Most of Relation and part of Completeness
Format/Structure            Structure
Occurrence                  Completeness
Indirection                 new
Functional Qualifier        new
36
Using Z39.50
• Z39.50 widely deployed in the library sector and elsewhere, although often invisibly
  • The Origin can be either a human user or a second Origin computer
    – e.g. Z39.50 portals, summing resources from multiple targets
• Users access Z39.50 Targets using proprietary clients or, increasingly, via web interfaces
  – e.g. WinWillow, ZNavigator, many WOPACs.
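A portal that sums resources from multiple Targets can be sketched as one search fanned out to several Targets, with the answers merged and labelled by source. The Targets below are stand-in Python functions rather than real Z39.50 connections, and every name is hypothetical; the only point illustrated is that one unavailable Target should not sink the whole search.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(titles):
    """Build a stand-in Target: a function that searches a fixed list of titles."""
    def search(term):
        return [t for t in titles if term.lower() in t.lower()]
    return search


def broken_target(term):
    # Simulates a Target that cannot be reached.
    raise ConnectionError("target unavailable")


def search_all(targets, term):
    """Send one query to every Target; return (merged hits, failed Target names)."""
    with ThreadPoolExecutor(max_workers=max(len(targets), 1)) as pool:
        futures = {name: pool.submit(fn, term) for name, fn in targets.items()}
    merged, failed = [], []
    for name, future in futures.items():
        try:
            merged.extend((name, title) for title in future.result())
        except Exception:
            failed.append(name)
    return merged, failed


targets = {
    "library_a": make_target(["Bristol: a history", "Roman Bath"]),
    "library_b": make_target(["Bath and Bristol by rail"]),
    "library_c": broken_target,
}
merged, failed = search_all(targets, "Bristol")
print(merged)
print(failed)
```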
37
Using Z39.50
© Arts & Humanities Data Service
38
Using Z39.50
© Arts & Humanities Data Service
39
Using Z39.50
© University of California
40
Using Z39.50
© University of California
41
Building the DNER
• Distributed National Electronic Resource
  • Policy aspiration of the Joint Information Systems Committee
  • Intended to provide greater access to JISC’s Current Content Collection
    – RDN
    – AHDS
    – MIMAS
    – EDINA
    – The Data Archive
    – EDUSERVE
    – eLib projects
    etc.
42
Building the DNER
• Construction of Bath Profile–conformant Z39.50 Targets at data sources
• Construction of various Portals to facilitate access
  • ‘JISC Portal’?
  • Data Centre Portals
  • Subject Portals
  • Data Type Portals
  • Institutional Portals
  • Personal Portals?
43
Building the DNER
• Remaining challenges
  • Authentication hell
    – Move from endless authentication to single authentication
  • Alignment of different data types
    – Ordnance Survey maps at Edinburgh
    – Satellite imagery in Manchester
    – Electronic journal articles in many formats, etc.
    – Census data at the Data Archive
    – Survey data in Manchester
    – Chemical structures in Manchester
  • Collection Level Description.