CLiMB: Computational Linguistics for
Metadata Building
Center for Research on Information Access
Columbia University Libraries
CLiMB - Columbia University
CLiMB: Interdisciplinary Research Project at Columbia University
Funded by Mellon Foundation 2002-2004
• Center for Research on Information Access (CRIA)
• Libraries
• Computer Science Department
Problems in Image Access
Cataloging digital images
Traditional approach:
• manual expertise
• labor intensive
• expensive
Can automated techniques help?
Can we harvest image descriptors?
CLiMB Technical Contribution
CLiMB will identify and extract
• proper nouns
• terms and phrases
from text related to an image:
September 14, 1908, the basis of the Greenes' final design had been worked out. It featured a radically informal, V-shaped plan (that maintained the original angled porch) and interior volumes of various heights, all under a constantly changing roofline that echoed the rise and fall of the mountains behind it. The chimneys and foundation would be constructed of the sandstone boulders that comprised the local geology, and the exterior of the house would be sheathed in stained split-redwood shakes. —Edward R. Bosley. Greene & Greene. London : Phaidon, 2000. p. 127
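As a rough illustration of harvesting descriptors from a passage like the one above, here is a minimal sketch in Python. It is not the CLiMB extractor: it assumes a simple capitalization heuristic for proper nouns and treats hyphenated modifiers as candidate terms. The function name `candidate_descriptors` and both patterns are hypothetical.

```python
import re

# Assumption: a run of capitalized words approximates a proper noun.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")
# Assumption: hyphenated modifiers ("V-shaped") are useful candidate terms.
HYPHENATED_TERM = re.compile(r"\b\w+(?:-\w+)+\b")


def candidate_descriptors(text):
    """Return (proper-noun candidates, hyphenated terms) found in text."""
    proper = []
    # Split into sentences after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in CAPITALIZED_RUN.finditer(sentence):
            # Skip sentence-initial runs: capitalization there says little.
            if match.start() == 0:
                continue
            proper.append(match.group())
    terms = HYPHENATED_TERM.findall(text)
    return proper, terms


sample = ("It featured a V-shaped plan. The house, which the Greenes "
          "designed in Pasadena, was sheathed in split-redwood shakes.")
proper, terms = candidate_descriptors(sample)
print(proper)
print(terms)
```

A heuristic like this misses proper nouns that open a sentence and over-generates on titles, which is one reason a human selection step remains useful.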
Overall Goals
• Research: Development of richer retrieval through increased numbers of descriptors
• Practice: Development of suite of CLiMB tools
• Resources: Vocabulary list which can be used by other visual resource professionals
The essence of CLiMB:
• Use scholars themselves as “catalogers” by utilizing scholarly publications
• Enhance existing descriptive metadata
CLiMB Project Teams
Coordinating
Collections (Curatorial)
Technical
External Advisory
CLiMB Committees
Coordinating
• Judith Klavans
• Stephen Davis
• Angela Giral
• Patricia Renfro
• Bob Wolven
Curatorial
• Judith Klavans
• Stephen Davis
• Angela Giral
• Amy Heinrich
• David Magier
• Bob Scott
• Bob Wolven
• Roberta Blitz
Technical
• Stephen Davis
• Judith Klavans
• Vera Horvath
• David Elson
• Roberta Blitz
Squeezing Metadata out of Scholarly Texts
• Image collection
• Associated text
• Target object identification (TOI)
• CLiMB suite of tools
• Evaluation
[Diagram: CLiMB workflow by phase]
Phases:
• I: process texts
• II: select metadata from texts
• III: use CLiMB metadata in image search platform
Inputs: Image Collections; Source TEXT; TOIs; AAT / BBIs / etc.; Other Texts; Test Records; Core Descriptive Records
CLiMB Processes: Generate TEI Markup; Run CLiMB Suite of Tools; Result: Enriched XML; Select words & phrases to include in Core Descriptive Records; CLiMB Enriched Descriptive Records; Image Search Platform; Image Search Platform with CLiMB Metadata
User Evaluation: Art Librarians; Subject Specialists; Catalogers; Search & Retrieval Experts; end-users
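The workflow above ends with selected words and phrases being added to core descriptive records as enriched XML. The sketch below shows one way that step could look, using Python's standard `xml.etree.ElementTree`. The element names (`record`, `climbDescriptors`, `descriptor`), the record ID, and the `source` attribute are all hypothetical; the slides do not specify the actual schema.

```python
import xml.etree.ElementTree as ET


def enrich_record(core_xml, descriptors, source):
    """Return the core record XML with selected descriptors appended."""
    record = ET.fromstring(core_xml)
    # Hypothetical container element; the real CLiMB schema is not shown.
    enriched = ET.SubElement(record, "climbDescriptors", {"source": source})
    for phrase in descriptors:
        ET.SubElement(enriched, "descriptor").text = phrase
    return ET.tostring(record, encoding="unicode")


# Illustrative record only; the ID and layout are made up for this sketch.
core = '<record id="EXAMPLE-001"><title>Example house</title></record>'
selected = ["sandstone boulders", "split-redwood shakes"]
print(enrich_record(core, selected, "Bosley 2000, p. 127"))
```

Keeping the harvested descriptors in their own container, with the source noted, leaves the original catalog fields untouched and makes the additions easy to review or remove.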
CLiMB Collections
• Greene & Greene Architectural Drawings, Avery Architectural and Fine Arts Library
• Chinese Paper Gods, C.V. Starr East Asian Library
• Photographs from the Archives, American Institute of Indian Studies
Greene & Greene Architectural Records and Papers Collection
Drawings and Archives
Avery Architectural and Fine Arts Library
Columbia University Libraries
Charles Sumner Greene (1868-1957)
Henry Mather Greene (1870-1954)
NYDA.1960.001.00023
All Saints Episcopal Church (Pasadena, Calif.). Alterations, 1902-1903
Greene & Greene Catalog Record
Author: Greene & Greene.
Title: [Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.). Alterations.] Residence of Mrs. Dudley P. Allen, 1188 Hillcrest Ave., Pasadena, Cal. [graphic] : Alteration / Greene & Greene, Architects.
Published: [1917]
Physical Details: 4 sheets : various media ; 87.8 x 57.3 cm. (34 5/8 x 22 5/8 in.)
Location: Columbia University, Avery Architectural Drawings
Other Authors: Greene, Charles Sumner, 1868-1957.
Greene, Henry Mather, 1870-1954.
Subjects: Houses
Alterations
Architecture--Designs and plans--United States.
Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.)
Component Item: [1] Item no. NYDA.1960.001.03224. [AVERYimage]. Electric lighting -- floor plan, part plan of basement : Sheet no.
Component Item: [2] Item no. NYDA.1960.001.00073. [AVERYimage]. [Electric lighting] -- floor plan, part plan of basement.
Greene & Greene Bibliography
• Bosley, Edward R. Greene & Greene. London : Phaidon, 2000.
• Current, William R. Greene & Greene: architects in the residential style. Fort Worth [Tex.] : Amon Carter Museum of Western Art, [1974]
• Makinson, Randell L. Greene & Greene: architecture as fine art. Salt Lake City : Peregrine Smith, c1977.
• Makinson, Randell L. Greene & Greene: the passion and the legacy. Salt Lake City : Gibbs and Smith, c1998.
• Smith, Bruce. Greene & Greene masterworks. San Francisco : Chronicle Books, c1998.
• Strand, Janann. A Greene & Greene guide. [Pasadena, Calif. : G. Dahlstrom, 1974]
C.V. Starr East Asian Library, Columbia University
Chinese Paper Gods
Anne S. Goodrich Collection
Pan-hu chih-shen
God of tigers
Chinese Paper Gods Catalog Record
Title: Chuang gong chuang mu [graphic].
Published: [193-]
Physical Details: 1 print : wood-engraving, color ; 34 x 30 cm.
In: Anne S. Goodrich Collection.
Location: Columbia University, C.V. Starr East Asian Library (CJK)
EAX GAC 1 no. 16
Subjects: Gods, Chinese, in art.
Folk art--China.
Genre Or Form: Woodcuts--Chinese.
Notes: Date according to time period Anne S. Goodrich collected prints in Beijing.
Record ID: NYCP02-F20
Chinese Paper Gods Bibliography
• Day, Clarence Burton. Chinese peasant cults : being a study of Chinese paper gods. Taipei : Ch'eng Wen Pub. Co., 1974.
• Goodrich, Anne Swann. Peking paper gods : a look at home worship. Nettetal : Steyler Verlag, 1991.
• Laing, Ellen Johnston. Art and aesthetics in Chinese popular prints: selections from the Muban Foundation collection. Ann Arbor, MI : Center for Chinese Studies, University of Michigan, c2002
Chinese gods: selection from LC Authority File
HEADING: Nezha (Chinese deity)
Used For/See From: Daluoxian (Chinese deity)
Jinhuan Yuanshuai (Chinese deity)
Jinkang Yuanshuai (Chinese deity)
Li Nezha (Chinese deity)
Luoche Taizi (Chinese deity)
Ne Zha (Chinese deity)
Nezhataizi (Chinese deity)
No-cha (Chinese deity)
Nuozha (Chinese deity)
Tailuoxian (Chinese deity)
Taizi Yuanshuai (Chinese deity)
Taiziyeh (Chinese deity)
Yühuang Taizi (Chinese deity)
Zhongtan Yuanshuai (Chinese deity)
Search Also Under: Gods, Chinese
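An authority record like the Nezha entry above can be read as a lookup table from variant names to one preferred heading. The following is a minimal sketch of that idea, assuming exact (case-insensitive) matches only. The helper names `build_variant_index` and `preferred_heading` are hypothetical, and only a few of the listed variants are included.

```python
def build_variant_index(heading, variants):
    """Map the heading and each variant (case-folded) to the heading."""
    index = {heading.casefold(): heading}
    for variant in variants:
        index[variant.casefold()] = heading
    return index


def preferred_heading(name, index):
    """Return the preferred heading for a name, or None if unknown."""
    return index.get(name.strip().casefold())


# A subset of the "Used For/See From" forms from the record above.
nezha_index = build_variant_index(
    "Nezha (Chinese deity)",
    ["Li Nezha", "Ne Zha", "No-cha", "Nuozha", "Taizi Yuanshuai"],
)

print(preferred_heading("no-cha", nezha_index))
print(preferred_heading("Guanyin", nezha_index))
```

Names missing from the index return `None`, which marks them as candidates for manual review rather than silent mismatches.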
Three Testbed Collections
• Greene & Greene
– detailed records
– more difficult to associate text with image
• Chinese Paper Gods
– strong associations
– problems with transliteration and variants
• South Asian Temples
– large set of digital images
– diacritics and variants
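One common way to reduce the diacritics problem noted above is to compare names after folding away combining marks. The sketch below uses Python's standard `unicodedata` module; it is an assumption about how matching could be made more tolerant, not a description of the CLiMB tools. The name "Viṣṇu" is an illustrative input, not taken from the collections.

```python
import unicodedata


def fold_diacritics(name):
    """Lowercase a name and strip combining marks for loose comparison."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return stripped.casefold()


print(fold_diacritics("Yühuang Taizi"))
print(fold_diacritics("Viṣṇu"))
```

Folding is lossy, so it suits matching keys only; the displayed form of a name should keep its original diacritics.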
CLiMB Collections: Future
• Additional collection of digital images
• Close association between image and text
• Regularized metadata
Suggestions:
• Catalogue raisonné
• Museum collection catalog
• Exhibition catalog
Target Object Identification (TOI)
• Define based on institutional needs
• Varies from collection to collection
– Greene & Greene: Project
– Chinese Paper Gods: Deity
– South Asian Temples: Location & Temple
• Compile authority list
Project Name Matching
• Locate project names in Greene & Greene
• Challenge: finding variant name forms
– Robert R. Blacker house (TOI)
– Blacker estate
– The house
• Possible techniques to improve matching
– Developing a semi-automatic technique
– Use existing information to label text
– An iterative platform for manual intervention
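To make the variant-name challenge above concrete, here is a minimal sketch of using an existing TOI name to label mentions in text. It assumes the surname is the word just before the building noun and that variants pair that surname with "house", "estate", or "residence". Both assumptions and the function names are hypothetical, and the sample sentence is invented for illustration.

```python
import re


def surname_pattern(toi_name):
    """Build a regex from a TOI such as 'Robert R. Blacker house'."""
    # Assumption: the surname is the last word before the building noun.
    surname = toi_name.split()[-2]
    return re.compile(
        rf"\b{re.escape(surname)}\s+(?:house|estate|residence)\b",
        re.IGNORECASE,
    )


def label_mentions(text, toi_names):
    """Return sorted (offset, matched text, TOI) triples found in text."""
    labels = []
    for toi in toi_names:
        for match in surname_pattern(toi).finditer(text):
            labels.append((match.start(), match.group(), toi))
    return sorted(labels)


text = ("The Blacker estate was large. "
        "The Robert R. Blacker house still stands.")
for offset, mention, toi in label_mentions(text, ["Robert R. Blacker house"]):
    print(offset, mention, "->", toi)
```

A bare reference such as "The house" carries no surname, so this sketch cannot resolve it. That kind of mention is where an iterative, manually assisted step would come in.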
CLiMB Suite of Tools
http://www.columbia.edu/cu/cria/climb/presentations.html
Next Steps – CLiMB Evaluation
Current Developments
• Meeting with experts – October 17th
• Survey with experienced image searchers
Long Term Goal
• Test CLiMB tools and data in an image search platform
CLiMB: Computational Linguistics for Metadata Building
• Image collection
• Associated text
• Target object identification (TOI)
• CLiMB suite of tools
• Evaluation
Thank you!
Any questions?
www.columbia.edu/cu/cria/climb