OCoLR 20041025 #53928015 OCLCR Making data work harder Lorcan Dempsey OCLC Members Council 17 May 2005


OCoLR 20041025 #53928015 OCLCR

Making data work harder

Lorcan Dempsey

OCLC Members Council, 17 May 2005


May 2005 Members’ Council

Web hub services                           OWC             Presentation   Examples
A comprehensive discovery experience       Yes
Predictable, often immediate, fulfilment   In progress
Data works hard                            Being improved  Yes            Curioser, FAST
Open to intermediate consumers             In progress
Co-created with users                      Not yet         Yes            WorldCat Wiki


Making data work hard

The user experience: from search to rich browse

Capturing user contribution

Data mining


Context: value

Amazoogle: we can add significant value. We should be looking for organizational frameworks within which we can do this.

ROI: libraries invest in data but do not extract as much value from it as they might. Unless we release more value, the argument for this investment becomes weaker.
• The user experience
• Management intelligence


Top Sets for Fiction (Records)

Records  Key
1,296    defoe, daniel\1661 1731/robinson crusoe
1,267    carroll, lewis\1832 1898/alices adventures in wonderland
971      cervantes saavedra, miguel de\1547 1616/don quixote
828      stevenson, robert louis\1850 1894/treasure island
689      twain, mark\1835 1910/adventures of huckleberry finn
624      twain, mark\1835 1910/adventures of tom sawyer
618      swift, jonathan\1667 1745/gullivers travels
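The keys listed above pair a normalized author heading (with dates) and a normalized title. A minimal sketch of how keys of this shape might be built and counted, assuming a simple lowercase-and-strip-punctuation normalization. The function names and sample records are hypothetical, and this is an illustration rather than OCLC's actual work-set algorithm:

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop apostrophes, and turn other punctuation (except commas) into spaces."""
    text = text.lower().replace("'", "").replace("\u2019", "")
    text = re.sub(r"[^a-z0-9, ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def work_key(author: str, dates: str, title: str) -> str:
    """Build an author\\dates/title key in the style shown on the slide."""
    return f"{normalize(author)}\\{normalize(dates)}/{normalize(title)}"


def top_sets(records):
    """Group (author, dates, title) records by key; return (key, count) pairs, largest first."""
    return Counter(work_key(*rec) for rec in records).most_common()


# Hypothetical sample records: two variant descriptions of one work, plus another work.
records = [
    ("Defoe, Daniel", "1661-1731", "Robinson Crusoe"),
    ("Defoe, Daniel", "1661-1731", "Robinson Crusoe."),
    ("Carroll, Lewis", "1832-1898", "Alice's Adventures in Wonderland"),
]
for key, count in top_sets(records):
    print(count, key)
```

Records whose descriptions differ only in case or punctuation collapse to the same key, which is what lets a count of records per set be reported.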


FRBR & FAST

FRBR
• ‘Interim FRBR’ in OWC
• FRBR in research projects: FictionFinder, Curioser, xISBN, Algorithm, Top 1000
• FRBR in FirstSearch – late this year
• Curioser ….

FAST
• Moving FAST headings into OpenWorldCat
• Experiment: mapping Yahoo! categories to FAST headings
• Recognized value …


WIKI in WorldCat

Capture user input in structured ways


Extending Wiki’s utility

Wiki:
• supported markup: wikitext
• page editing: a single text block
• searches: full text searching
• collections managed: one per wiki

MetaWiki:
• supported markup: wikitext, structured data (e.g., MARC, METS, DC…)
• page editing: a single text block, or field level
• searches: full text searching, fielded searching
• collections managed: one/multiple per MetaWiki

Built on top of standards (OAI, OpenURL, SRU)
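To illustrate the difference between full-text and fielded searching over SRU, here is a minimal sketch that builds SRU 1.1 searchRetrieve URLs carrying CQL queries. The endpoint address is hypothetical (the slides give none), and whether MetaWiki exposes a `dc.title` index is an assumption:

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give a MetaWiki address.
BASE_URL = "http://metawiki.example.org/sru"


def sru_search_url(cql_query: str, max_records: int = 10) -> str:
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Full-text search: a bare term is matched against the server's default index.
full_text = sru_search_url("crusoe")

# Fielded search: restrict the term to one index (assumed here: Dublin Core title).
fielded = sru_search_url('dc.title = "robinson crusoe"')

print(full_text)
print(fielded)
```

A plain wiki can only answer the first kind of query; field-level editing of structured records is what makes the second kind meaningful.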


Management intelligence: data mining

Data
• Bibliographic data
• Transaction logs
• …

Need to mine this data for intelligence that creates value for libraries and users

OCLC Research is undertaking a number of data-mining projects aimed at:
• Knowing more about the characteristics of library collections
• Creating interesting and useful data displays
• Generating intelligence to support library decision-making


Know Your Audience!

Holdings represent selection decisions by librarians … implies there are about 1 billion individual selection decisions in the WorldCat holdings file

Selections are made to serve the interests of a library’s target community …
• Associate target community (audience level) with particular library profiles - e.g., ARL, non-ARL academic, public, K-12 school …

Implies: we can infer materials’ audience level from holdings patterns, which in turn can support:
• Collection management
• Readers’ advisory services
• Reference services
• Information retrieval

Paper forthcoming!
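One way to picture the inference from holdings patterns to audience level is a holdings-weighted average over library profiles. The sketch below is only an illustration: the profile names, the 0-to-1 scores, and the weighting scheme are all hypothetical and are not taken from the OCLC study:

```python
# Hypothetical audience scores per library profile (0 = juvenile, 1 = research).
PROFILE_SCORE = {
    "k12_school": 0.2,
    "public": 0.4,
    "academic_non_arl": 0.7,
    "arl": 0.9,
}


def audience_level(holdings_by_profile: dict[str, int]) -> float:
    """Return the holdings-weighted mean of the profile scores for one title."""
    total = sum(holdings_by_profile.values())
    if total == 0:
        raise ValueError("title has no holdings")
    weighted = sum(PROFILE_SCORE[p] * n for p, n in holdings_by_profile.items())
    return weighted / total


# A title held mostly by public and school libraries scores low;
# one held mostly by research libraries scores high.
print(audience_level({"public": 30, "k12_school": 10}))
print(audience_level({"arl": 40, "academic_non_arl": 10}))
```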


The Implications of Google Libraries …

Potentially covers about one third of print books in WorldCat

~60 percent of total G5 books held by only one of the Google 5

Less than 5 percent held by all of the Google 5

~20 percent of total G5 print books out of copyright

Paper forthcoming …
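Figures such as "held by only one" and "held by all five" come from counting, for each title, how many of the libraries in the group hold it. A minimal sketch of that calculation, assuming holdings are available as a mapping from record id to the set of holding libraries; the record ids and library symbols are made up:

```python
from collections import Counter


def overlap_profile(holdings: dict[str, set[str]], n_libraries: int) -> dict[int, float]:
    """Map each possible number of holding libraries to the share of titles at that count."""
    if not holdings:
        raise ValueError("no titles to profile")
    counts = Counter(len(libs) for libs in holdings.values())
    total = len(holdings)
    return {k: counts.get(k, 0) / total for k in range(1, n_libraries + 1)}


# Hypothetical toy data: four titles across five libraries A to E.
holdings = {
    "rec1": {"A"},
    "rec2": {"A", "B"},
    "rec3": {"C"},
    "rec4": {"A", "B", "C", "D", "E"},
}
profile = overlap_profile(holdings, n_libraries=5)
print(profile[1], profile[5])  # share held by exactly one library, share held by all five
```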


“Last Copy”: Identifying At-Risk Materials

~23 million WorldCat records have only a single holding attached

Libraries need to know what portions of their collections are:
• Rare …
• Rare and valuable …
• “Last copy” (artifact and/or content)

Identification of rare materials is essential intelligence in support of storage, digitization, and preservation decision-making

Data-mining study of Vanderbilt holdings in WorldCat:
• Identified 23,000 items held uniquely by Vanderbilt
• ~60% are print books
• ~60% produced prior to 1950; ~25% produced after 1970

Paper forthcoming!
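Finding "last copy" candidates amounts to filtering for records with exactly one holding and, for a given institution, records where that single holder is the institution itself. A minimal sketch under the assumption that holdings are a mapping from record id to the set of holding library symbols; the ids and symbols below are hypothetical:

```python
def single_holding_count(holdings: dict[str, set[str]]) -> int:
    """Count records that have exactly one holding attached."""
    return sum(1 for libs in holdings.values() if len(libs) == 1)


def uniquely_held(holdings: dict[str, set[str]], library: str) -> list[str]:
    """Return the ids of records whose only holding library is `library`."""
    return sorted(rec for rec, libs in holdings.items() if libs == {library})


# Hypothetical toy data with made-up library symbols.
holdings = {
    "rec1": {"LIB_A"},
    "rec2": {"LIB_A", "LIB_B"},
    "rec3": {"LIB_B"},
    "rec4": {"LIB_A"},
}
print(single_holding_count(holdings))
print(uniquely_held(holdings, "LIB_A"))
```

A single holding in the aggregate file is only a signal of rarity; it flags candidates for review and does not by itself establish that an item is the last surviving copy.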


Looking at Library Print Book Collections … Systematically

32 million print books, representing 26 million distinct works

Half of print books published after 1977; more than 80% still “in copyright”

Rareness is common! Only a third of print books have more than five holdings; half have two or fewer

OCLC/Ithaka collaboration: Use WorldCat to characterize the “system-wide” print book collection – i.e., aggregate print book holdings in WorldCat

Intelligence of this kind can help establish digitization priorities and inform preservation planning

More information: http://www.oclc.org/research/presentations/lavoie/cni2005.ppt

Only about 120,000 works had both print book and e-book manifestations


Thank you!

OCLC Research:

http://www.oclc.org/research/