The world’s libraries. Connected. Multilingual WorldCat, presented by Janifer Gatenby. IFLA, Singapore, 2013-08-19. Karen Smith-Yoshimura, Eric Childress, Janifer Gatenby, Jean Godby, Richard Greene, Jenny Toves, Diane Vizine-Goetz, Robert Bremer, JD Shipengrover, Gail Thornburg, Jay Weitz

Multilingual presentation ifla 2013 08-19

DESCRIPTION

Data mining OCLC for translations. Creating authority records for VIAF. Remodelling the bibliographic structure to make the best multi-lingual displays from all available data in a work set.

Page 1: Multilingual presentation ifla 2013 08-19

The world’s libraries. Connected.

Multilingual WorldCat

presented by Janifer Gatenby

IFLA, Singapore, 2013-08-19

Karen Smith-Yoshimura

Eric Childress

Janifer Gatenby

Jean Godby

Richard Greene

Jenny Toves

Diane Vizine-Goetz

Robert Bremer

JD Shipengrover

Gail Thornburg

Jay Weitz

Page 2: Multilingual presentation ifla 2013 08-19

WorldCat Today

• Resources in nearly all languages

• Contributed by more than 20,000 libraries worldwide

• More than half the database is for works not in English

Page 3: Multilingual presentation ifla 2013 08-19

WorldCat Today

• Bibliographic Records

• Hybrid records

• Parallel records

• Clustered at Work level (FRBR)

Page 4: Multilingual presentation ifla 2013 08-19

Existing Architecture

Authors

Subjects / Classification

Holdings

Bibliographic record

Work cluster

Content cluster

Manifestation cluster

Page 5: Multilingual presentation ifla 2013 08-19

Complementary Initiatives

Work Level Record

GLIMIR Manifestation & Content Clusters

Multi-lingual Bibliographic Structure

Page 6: Multilingual presentation ifla 2013 08-19

Work Level Record

http://www.oclc.org/research/activities/workrecs.html

Page 7: Multilingual presentation ifla 2013 08-19

Create a landing page summarizing content for a work

Work Level Record: Objective

Page 8: Multilingual presentation ifla 2013 08-19

• The Content Cluster

• Enables better work record displays by reducing the number of lines that display for large works

• Enables a choice of format and presents the formats that could be acceptable substitutes

• Consolidates holdings for identical content 

• The Manifestation Cluster is important

• Consolidates holdings at manifestation level

• In the short term allows the record catalogued in the language of the interface to be chosen for display

• Reduces apparent duplication

• Allows a more accurate count of the number of manifestations in WorldCat (as opposed to the number of records)

 

GLIMIR
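
The manifestation-cluster behaviour listed above can be illustrated with a minimal sketch. It is not OCLC's implementation: the `BibRecord` class, its fields, and the library symbols are hypothetical, and the cataloguing language is assumed to come from something like MARC 21 field 040 $b.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Hypothetical, simplified stand-in for a bibliographic record."""
    record_id: str
    cluster_id: str            # manifestation cluster the record belongs to
    cataloguing_language: str  # assumed to mirror MARC 21 040 $b, e.g. "eng"
    holdings: set = field(default_factory=set)  # symbols of holding libraries


def summarize_clusters(records, interface_language):
    """Return {cluster_id: (display_record_id, consolidated_holdings)}."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec.cluster_id].append(rec)

    summary = {}
    for cluster_id, members in clusters.items():
        # Consolidate holdings across all parallel records in the cluster.
        holdings = set().union(*(m.holdings for m in members))
        # Prefer the record catalogued in the interface language; otherwise
        # fall back to the first record seen for the cluster.
        chosen = next(
            (m for m in members if m.cataloguing_language == interface_language),
            members[0],
        )
        summary[cluster_id] = (chosen.record_id, holdings)
    return summary


records = [
    BibRecord("r1", "m1", "eng", {"LIB-A", "LIB-B"}),
    BibRecord("r2", "m1", "ger", {"LIB-B", "LIB-C"}),
    BibRecord("r3", "m2", "fre", {"LIB-D"}),
]
result = summarize_clusters(records, interface_language="ger")
```

Here three records collapse into two manifestations, which is the distinction between counting records and counting manifestations; the German-language record represents cluster `m1` for a German interface.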

Page 9: Multilingual presentation ifla 2013 08-19

Creates true multi-lingual displays

• At work and manifestation levels

• Using all available data instead of “most appropriate record”

• Generates data

Corrects many of the 28 million records coded “und”

Better control and linking of translations

Input to refinement of work clusters

Smarter data storage

Multilingual Bibliographic Structure Project
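
The slides do not say how records coded "und" are corrected. One plausible approach, sketched here purely as an assumption, is to borrow the language from sibling records in the same manifestation cluster when those siblings agree. The function name and the input shape are hypothetical.

```python
from collections import Counter


def infer_languages(cluster_languages):
    """Fill in "und" codes from sibling records of one manifestation cluster.

    cluster_languages maps record id -> language code for records that are
    assumed to describe the same manifestation. Ambiguous clusters are
    returned unchanged rather than guessed.
    """
    known = Counter(lang for lang in cluster_languages.values() if lang != "und")
    if not known:
        return dict(cluster_languages)  # nothing to learn from

    (best, best_count), *rest = known.most_common(2)
    if rest and rest[0][1] == best_count:
        return dict(cluster_languages)  # tie between languages: leave as-is

    return {
        record_id: (best if lang == "und" else lang)
        for record_id, lang in cluster_languages.items()
    }


fixed = infer_languages({"r1": "und", "r2": "ger", "r3": "ger"})
```

Leaving tied or empty clusters untouched keeps the sketch conservative: a wrong language code is arguably worse than an undetermined one.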

Page 10: Multilingual presentation ifla 2013 08-19

• WorldCat.org selects the most appropriate record to show to a user as representative of the work in the short result list and beyond

• The end result will not be very satisfactory from a multi-lingual viewpoint… here’s why

“Most appropriate” questioned

Page 11: Multilingual presentation ifla 2013 08-19

Which record is better to present to a German speaker?

Page 12: Multilingual presentation ifla 2013 08-19

Incomplete Swedish Record

Page 13: Multilingual presentation ifla 2013 08-19

Hybrid record

Page 14: Multilingual presentation ifla 2013 08-19

Most appropriate display

Build the display from all available data

Page 15: Multilingual presentation ifla 2013 08-19

• Work level data, mined from all associated bibliographic records, will be displayed, supplemented with expression / manifestation level data as the user drills through from the short to the fuller versions of the metadata.

Multilingual Bibliographic Structure Project

The end user interface will show works and manifestations, not bibliographic records; the cataloguing client will also show bibliographic records
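
Building a work-level display from all associated records, rather than from one "most appropriate" record, might look roughly like the sketch below. The record shape (a dict of element name to a language-tagged value), the fallback order, and the sample strings are all illustrative assumptions, not the project's data model.

```python
from collections import Counter


def build_work_display(records, user_language, fallback_language="eng"):
    """Pick, per element, the most frequent value in the user's language.

    Each record is a dict: element name -> (language code, value).
    Falls back to fallback_language, then to any language that has data.
    """
    by_element = {}
    for rec in records:
        for element, (language, value) in rec.items():
            votes = by_element.setdefault(element, {}).setdefault(language, Counter())
            votes[value] += 1

    display = {}
    for element, by_language in by_element.items():
        # Try the user's language, then the fallback, then whatever exists.
        candidates = [user_language, fallback_language, *sorted(by_language)]
        language = next(lang for lang in candidates if lang in by_language)
        value, _count = by_language[language].most_common(1)[0]
        display[element] = (language, value)
    return display


# Illustrative records for one work set: none is complete on its own.
records = [
    {"title": ("ger", "Beispieltitel"), "summary": ("eng", "Example summary.")},
    {"title": ("ger", "Beispieltitel"), "subject": ("ger", "Beispielthema")},
    {"title": ("swe", "Exempeltitel")},
]
display = build_work_display(records, user_language="ger")
```

A German-speaking user gets the German title and subject, and the English summary only because no German summary exists in the work set.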

Page 16: Multilingual presentation ifla 2013 08-19

Proposed new architecture

[Diagram. Recoverable labels: Work; Authors; Subj / Classif; Translations (Language of work); Manif (eng, fre) with Notes and Contents; Holding. The language codes eng, fre, ger and jpn repeat across the language-tagged boxes.]

Page 17: Multilingual presentation ifla 2013 08-19

• Language tagging of elements, particularly

• Summaries (M21 520)

• Subject headings

• Display in script preferred by the user if data is available

• Improve translated interfaces

• Show consolidated holdings as appropriate

Important principles
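
The principle of displaying data in the user's preferred script when it is available can be sketched as follows. Script detection via Unicode character names is a rough heuristic chosen for this illustration, and `pick_form` is a hypothetical helper; a real system would more likely rely on explicit script tagging in the records.

```python
import unicodedata


def dominant_script(text):
    """Guess a script from the first word of each letter's Unicode name."""
    counts = {}
    for ch in text:
        if ch.isalpha():
            script = unicodedata.name(ch, "UNKNOWN").split()[0]
            counts[script] = counts.get(script, 0) + 1
    return max(counts, key=counts.get) if counts else None


def pick_form(forms, preferred_scripts):
    """Return the first form in a preferred script, else the first form."""
    for form in forms:
        if dominant_script(form) in preferred_scripts:
            return form
    return forms[0]


# The same title in romanized and original-script form.
forms = ["Voina i mir", "Война и мир"]
shown = pick_form(forms, preferred_scripts={"CYRILLIC"})
```

When no form matches the preferred script, the sketch falls back to the first available form, which mirrors the "if data is available" condition on the slide.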

Page 18: Multilingual presentation ifla 2013 08-19

Page 19: Multilingual presentation ifla 2013 08-19

Page 20: Multilingual presentation ifla 2013 08-19

Page 21: Multilingual presentation ifla 2013 08-19

Page 22: Multilingual presentation ifla 2013 08-19

Translations

Page 23: Multilingual presentation ifla 2013 08-19

• The cream of the world’s cultural and knowledge heritage is shared by being translated

• WorldCat contains many rich cataloguing records for these translations

Great works are translated

GOAL: Data mine the really good records to improve clustering, presentation, authority records and linked data

Page 24: Multilingual presentation ifla 2013 08-19

• Inconsistencies cause incomplete work clusters and less than optimal search results

• Titles without subtitles

• Different forms of uniform title or missing uniform title

• Inverted title

• Different coding of original and translated information

Translations

Generated uniform title authority records will overcome most of these differences without needing to edit individual records
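
How an authority record with variant title forms can absorb these inconsistencies without touching individual records can be sketched as below. The normalization rules, the article list, and the identifier `ut-1` are assumptions made for this example; the slides do not describe the actual matching logic.

```python
import re
import unicodedata

# Small illustrative list of leading articles; a real list would be longer.
ARTICLES = {"the", "a", "an", "der", "die", "das", "le", "la", "les"}


def title_match_key(title):
    """Reduce a title to a key that ignores subtitle, inversion and accents."""
    # Titles without subtitles: keep only the part before the first colon.
    main = re.split(r"\s*:\s*", title, maxsplit=1)[0].strip()
    # Inverted titles: "Blind assassin, The" -> "The Blind assassin".
    inverted = re.match(r"^(.*),\s*(\w+)$", main)
    if inverted and inverted.group(2).lower() in ARTICLES:
        main = f"{inverted.group(2)} {inverted.group(1)}"
    # Strip diacritics, lowercase, and drop punctuation.
    decomposed = unicodedata.normalize("NFKD", main)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    words = re.findall(r"\w+", plain.lower())
    # Ignore a leading article.
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


def build_variant_index(authorities):
    """Map every variant's match key to its authority identifier."""
    return {
        title_match_key(form): authority_id
        for authority_id, forms in authorities.items()
        for form in forms
    }


authorities = {"ut-1": ["The blind assassin", "Blind assassin, The"]}
index = build_variant_index(authorities)
match = index.get(title_match_key("BLIND ASSASSIN: a novel"))
```

Records keep their original title strings; only the lookup through the authority index brings the variants together.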

Page 25: Multilingual presentation ifla 2013 08-19

• Improve FRBR work groups

• Made by data mining

• Contribute to VIAF

• Diffuse via VIAF as linked data

• Possibility to create web page / web service

Generate uniform title authority records

Page 26: Multilingual presentation ifla 2013 08-19
Page 27: Multilingual presentation ifla 2013 08-19
Page 28: Multilingual presentation ifla 2013 08-19

Page 29: Multilingual presentation ifla 2013 08-19

Translation records in VIAF

• Will enrich VIAF significantly

• New elements - translated title and translator

Author     Title                   Expressions in VIAF   Translation count in WorldCat
Atwood     Blind assassin          8                     31
Guevara    Notas de viaje          0                     11
Hawking    Grand design            0                     18
Lenard     Grosse naturforscher    1                     3
Loti       Pêcheur d’Islande       1                     31

Page 30: Multilingual presentation ifla 2013 08-19

• Records are freely available to the world from VIAF in

• MARC-21

• XML

• RDF (linked data)

• Just links in JSON

• And other formats as introduced

Diffusion of Translation records
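
As a hedged illustration of requesting one record in the formats listed above, the helper below only builds URLs and makes no network call. The base URL and the per-format suffixes follow the pattern VIAF has historically used, but they are assumptions here and should be checked against current VIAF documentation; the identifier in the example is a placeholder, not a real record.

```python
# Assumed URL pattern: <base>/<VIAF id>/<format suffix>. Verify before use.
VIAF_BASE = "https://viaf.org/viaf"
FORMAT_SUFFIXES = {
    "marc21": "marc21.xml",
    "xml": "viaf.xml",
    "rdf": "rdf.xml",
    "justlinks": "justlinks.json",
}


def viaf_record_url(viaf_id, fmt):
    """Build the (assumed) URL of a VIAF record in the requested format."""
    viaf_id = str(viaf_id)
    if not viaf_id.isdigit():
        raise ValueError(f"VIAF identifiers are numeric, got {viaf_id!r}")
    if fmt not in FORMAT_SUFFIXES:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(FORMAT_SUFFIXES)}")
    return f"{VIAF_BASE}/{viaf_id}/{FORMAT_SUFFIXES[fmt]}"


# Placeholder identifier for illustration only.
url = viaf_record_url("123456789", "rdf")
```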

Page 31: Multilingual presentation ifla 2013 08-19

• # of manifestations as opposed to # of records

• # of works that have translations

• Top translated authors and works

• And more

We don’t know now, but soon will
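
Once records carry manifestation and work identifiers, the figures listed above become simple aggregations. The sketch below assumes a hypothetical flat record shape with `manifestation_id`, `work_id`, `author`, `language` and `original_language` keys; the sample data is invented for illustration.

```python
from collections import Counter


def translation_statistics(records):
    """Count records vs. manifestations and summarize translations."""
    manifestations = set()
    translated = set()             # manifestation ids that are translations
    works_with_translations = set()
    per_author = Counter()         # translated manifestations per author

    for rec in records:
        manifestations.add(rec["manifestation_id"])
        is_translation = rec["language"] != rec["original_language"]
        # Count each translated manifestation once, however many parallel
        # records describe it.
        if is_translation and rec["manifestation_id"] not in translated:
            translated.add(rec["manifestation_id"])
            works_with_translations.add(rec["work_id"])
            per_author[rec["author"]] += 1

    return {
        "records": len(records),
        "manifestations": len(manifestations),
        "works_with_translations": len(works_with_translations),
        "top_translated_authors": per_author.most_common(3),
    }


def make(manif, work, author, lang, orig):
    return {"manifestation_id": manif, "work_id": work, "author": author,
            "language": lang, "original_language": orig}


records = [
    make("m1", "w1", "Author A", "eng", "eng"),
    make("m1", "w1", "Author A", "eng", "eng"),  # parallel record, same manifestation
    make("m2", "w1", "Author A", "fre", "eng"),
    make("m3", "w1", "Author A", "ger", "eng"),
    make("m4", "w2", "Author B", "swe", "swe"),
]
stats = translation_statistics(records)
```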