Google Confidential Daniel Clancy Engineering Director, Google Print 18-July-05

Making Information Accessible

Arpanet Team

Google’s Mission

Online Content: Billions of web pages

Offline Content: Billions of items still unindexed

To organize the world’s information and make it universally accessible and useful.

Two Initiatives

Library Program: ~85% of books are out of print and/or out of copyright; these books are found only in libraries

Publisher Program: Only ~15% of books are in print

GOAL: Create a comprehensive virtual card catalog of all books in all languages, while respecting publishers’ rights

Google Print Library Project

Really, how many books?

Library Holdings

Library of Congress              24,616,867
Harvard University               14,437,361
Chicago Public Library           10,994,943
New York Public Library          10,608,570
Yale University                  10,492,812
Queens Borough Public Library    10,357,159
Oxford University                10,000,000
….                               ….
University of Michigan            7,348,360
Stanford University               7,286,437

The value is in the middle

92% of the world's books are neither generating revenue for the copyright holder nor easily accessible to potential readers.*

A Typical Library Collection

Public Domain (books published before 1923): Less than 20%**

Unclear copyright status (books after 1923 but…): ~65% or more
• May be in copyright, but not for sale
• Rights may have reverted to author
• May be in the public domain

In-Print (Partner Program): ~15%

*Source: Covey, Denise Troll. "Global Cooperation for Global Access: The Million Book Project"
**OCLC analysis of the Google Books Library Project: http://www.dlib.org/dlib/september05/lavoie/09lavoie.html

Three User Experiences

Full Book View: 20%

Snippet View: 65% or more*

Sample Pages View: ~15%

A Closer Look at the Snippet View

For books we scan that are still in copyright

• User can view:
  • Bibliographic info
  • A few sentences around the query
• Restricted searching
  • Same 3 snippets, never more
• Links to purchase
  • In-print – online bookstores
  • Out of print – used bookstores
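
The snippet rules on this slide (a few sentences around the query, and the same 3 snippets, never more) can be illustrated with a minimal Python sketch. This is not Google's implementation: the function name `snippets`, the naive sentence splitter, and the choice to return only the sentences that contain the query are all assumptions made for illustration.

```python
import re
from typing import List

MAX_SNIPPETS = 3  # from the slide: "Same 3 snippets, never more"


def snippets(text: str, query: str, limit: int = MAX_SNIPPETS) -> List[str]:
    """Return at most `limit` sentences that contain `query` (hypothetical helper)."""
    # Assumption: sentences end with ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    needle = query.lower()
    hits = [s for s in sentences if needle in s.lower()]
    # Always take the first `limit` matches in document order, so repeating
    # the same query returns the same snippets rather than new ones.
    return hits[:limit]
```

Calling `snippets(book_text, "orchids")` on any text returns no more than three matching sentences, and repeating the call returns the same three, which is the restriction the slide describes.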

Ego-searching and discovering your past

“Never before could I have found such an obscure and wonderful gem. Google Book Search prompted me to buy two copies of a book that I never would have known about, otherwise.”

- Bernie Robichau, Columbia, SC

It’s Not Just About Our Most Popular Searches…

What are people searching for? Everything

[Chart: Number of Search Queries/Keyword (vertical axis) against Keywords (horizontal axis), with four numbered example keywords: Harry Potter, Wireless Home Networking, Peruvian Orchids, Jersey City]

Web of Off-line Content

• How do you create an ontology of objects from the off-line world, with the myriad of links that connect these objects?

Some Relationships that exist:
• FRBR Hierarchy
  • Work, manifestation, hierarchy
• References
• Authorship
• Criticism and review
• Inclusion
• Individuals
• Events
• Temporal relationships
• Different perspectives
• Topical similarity
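
One way to picture such a web of off-line content is as a graph of typed entities joined by named relationships. The Python sketch below is only an illustration of that idea, not anything described in the talk: the `Entity` and `OfflineGraph` classes and relation names such as `"has_manifestation"` and `"authored_by"` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # e.g. "work", "manifestation", "person", "event" (assumed labels)
    name: str


class OfflineGraph:
    """A tiny labeled graph: (source, relation) -> set of target entities."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, source: Entity, relation: str, target: Entity) -> None:
        # Record one directed, named relationship between two entities.
        self._edges[(source, relation)].add(target)

    def related(self, source: Entity, relation: str) -> set:
        # Return a copy so callers cannot mutate the graph by accident.
        return set(self._edges.get((source, relation), set()))
```

For example, a work such as Moby-Dick could be linked to an 1851 edition with `"has_manifestation"` and to Herman Melville with `"authored_by"`; reviews, events, and topical similarity would be further relation names on the same structure.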

Google and the Textbook

Purpose of a textbook

Relation to the Web

Relation to the Library and Books

Creating Dynamic Links that make content come alive

Personalization and Customization

Google Print Future Developments

• Google Print is intended to be a catalyst for more digitization efforts

• We will work to include books from other digitization efforts

• We will strive to create products that all libraries can leverage (e.g., OPAC integration, restricted library search)

Back Up