Google Confidential
Daniel Clancy, Engineering Director, Google Print
18-July-05
Making Information Accessible
Arpanet Team
Google’s Mission
Online Content: Billions of web pages
Offline Content: Billions of items still unindexed
To organize the world’s information and make it universally accessible and useful.
Two Initiatives
Library Program: ~85% of books are out of print and/or out of copyright; these books are only found in libraries
Publisher Program: Only ~15% of books are in print
GOAL: Create a comprehensive virtual card catalog of all books in all languages, while respecting publishers’ rights
Google Print Library Project
Really, how many books?
Library                          Holdings
Library of Congress              24,616,867
Harvard University               14,437,361
Chicago Public Library           10,994,943
New York Public Library          10,608,570
Yale University                  10,492,812
Queens Borough Public Library    10,357,159
Oxford University                10,000,000
….                               ….
University of Michigan            7,348,360
Stanford University               7,286,437
92% of the world's books are neither generating revenue for the copyright holder nor easily accessible to potential readers.*
The value is in the middle
A Typical Library Collection
Public Domain (less than 20%**): Books published before 1923
Unclear copyright status (~65% or more): Books after 1923 but…
• May be in copyright, but not for sale
• Rights may have reverted to author
• May be in the public domain
In-Print (~15%): Partner Program
*Source: Covey, Denise Troll. “Global Cooperation for Global Access: The Million Book Project”
**OCLC analysis of the Google Books Library Project: http://www.dlib.org/dlib/september05/lavoie/09lavoie.html
Three User Experiences
Full Book View: 20%
Snippet View: 65% or more*
Sample Pages View: ~15%
A Closer Look at the Snippet View
For books we scan that are still in copyright
• User can view:
  • Bibliographic info
  • A few sentences around the query
• Restricted searching
  • Same 3 snippets, never more
• Links to purchase
  • In-print – online bookstores
  • Out of print – used bookstores
Ego-searching and discovering your past
“Never before could I have found such an obscure and wonderful gem. Google Book Search prompted me to buy two copies of a book that I never would have known about, otherwise.”
- Bernie Robichau, Columbia, SC
It’s Not Just About Our Most Popular Searches…
[Chart: number of search queries per keyword (vertical axis) across keywords (horizontal axis). Labeled example keywords: Harry Potter, Wireless Home Networking, Peruvian Orchids, Jersey City]
What are people searching for? Everything
Web of Off-line Content
• How do you create an ontology of objects from the off-line world, with the myriad links that connect these objects?
Some relationships that exist:
• FRBR Hierarchy
  • Work, manifestation, hierarchy
• References
• Authorship
• Criticism and review
• Inclusion
• Individuals
• Events
• Temporal relationships
• Different perspectives
• Topical similarity
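One way to picture the relationships listed above is as labeled edges in a graph of off-line objects. The sketch below is only an illustration under stated assumptions: the class name OfflineGraph, the relation identifiers, and the placeholder object IDs are all hypothetical and do not describe how Google Print actually stores these links.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Relation labels borrowed from the slide; the identifier spellings are assumptions.
RELATIONS = {
    "references",
    "authorship",
    "criticism_and_review",
    "inclusion",
    "topical_similarity",
}


@dataclass
class OfflineGraph:
    """Hypothetical labeled graph: edges[source][relation] -> set of targets."""

    edges: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(set))
    )

    def link(self, source: str, relation: str, target: str) -> None:
        # Reject labels outside the illustrative vocabulary above.
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[source][relation].add(target)

    def related(self, source: str, relation: str) -> list:
        # Return targets in sorted order so results are deterministic.
        return sorted(self.edges[source][relation])


# Placeholder object IDs, not real catalog records.
g = OfflineGraph()
g.link("book:A", "references", "book:B")
g.link("book:A", "references", "book:C")
g.link("person:X", "authorship", "book:A")
```

Calling g.related("book:A", "references") then lists the objects that book:A cites. Keeping the relation label on each edge, rather than one generic "link" type, is what lets the same pair of objects be connected in several ways at once (for example, a book that both references and reviews another).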
Google and the Textbook
Purpose of a textbook
Relation to the Web
Relation to the Library and Books
Creating Dynamic Links that make content come alive
Personalization and Customization
Google Print Future Developments
• Google Print is intended to be a catalyst for more digitization efforts
• We will work to include books from other digitization efforts
• We will strive to create products that all libraries can leverage (e.g., OPAC integration, restricted library search)
Back Up