Design of a Search Engine for Metadata Search Based on Metalogy

Ing-Xiang Chen, Che-Min Chen, and Cheng-Zen Yang

Dept. of Computer Engineering and Science

Yuan-Ze University

http://syslab.cse.yzu.edu.tw/

ICADL 2001 - 2001/12/11

[email protected] YZU, Taiwan - ICADL2001 2

Outline

• Introduction

• Related Technologies

• System Architecture

• An Experimental Prototype

• Conclusions

• Future work

Introduction

• Metadata management is not an easy task:
  – It requires specific domain knowledge for appropriate data categorization.
  – It must deal with the complicated relationships between metadata items.
  – A good management tool that eases metadata construction and manipulation is necessary.

Introduction

• Metalogy
  – Metalogy is a metadata management system developed by the ROSS project group in Taiwan.
  – It can manipulate various digitized items and export/import XML records.
  – It is mainly designed for managing the metadata within each digital library.

Introduction

• Search across digital libraries:
  – Metalogy does not address how to search information across digital libraries.
  – As digital libraries are widely deployed, searching information across several of them becomes important.
  – We design a search engine that helps users find resources without connecting to each digital library and entering the same query terms again.

Introduction

• We base this search engine on the XML data exported from Metalogy for two reasons:
  – XML/Metalogy provides comprehensive metadata descriptions and DTD information for metadata search.
  – The quality of a distributed service depends heavily on the quality of its data source.

Related Technologies

• Z39.50
  – It was proposed for searching and retrieving information from heterogeneous databases over networks.
  – It provides an abstract search capability.
  – It is difficult to implement because of its extensive functionality.
• OAI – Arc
  – Arc is developed for cross-archive searching.
  – It adopts the OAI protocol to harvest digital archives.
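Harvesting with the OAI protocol amounts to issuing HTTP requests that name a verb and a metadata format. The Java sketch below only builds such a request URL. The base URL is a made-up placeholder, the class and method names are hypothetical, and it does not represent how Arc itself is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds an OAI harvesting request URL (illustrative sketch only). */
final class OaiRequestSketch {

    /** Returns a ListRecords request for the given repository and metadata format. */
    static String listRecordsUrl(String baseUrl, String metadataPrefix) {
        return baseUrl + "?verb=ListRecords&metadataPrefix=" + encode(metadataPrefix);
    }

    /** URL-encodes a query parameter value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // "archive.example.org" is a placeholder, not a real repository.
        System.out.println(listRecordsUrl("http://archive.example.org/oai", "oai_dc"));
    }
}
```

The `oai_dc` prefix requests unqualified Dublin Core records, the format every OAI repository is expected to support.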

Related Technologies

• Harp
  – Harp provides a uniform query interface across legacy public libraries through HarpSQL.
  – A HarpSQL server acts as a query agent that stores and handles intermediate query results, not as a search engine that collects and stores all metadata.
• METALICA
  – It adopts a meta-search engine like MetaCrawler to provide a uniform user interface for cross-archive search.

System Architecture

[Figure: system architecture diagram. Components: Digital Library 1, 2, ..., n; XML Parser (Java application); Index Database; Search Engine (Java servlet); DTD Manager (Java servlet); User Interface; Manager Interface; Browser. Flow labels: XML, Metadata, DTD, Query, Request.]

System Architecture

• The search engine is constructed with three modules:
  – Search engine module
    • Provides an integrated user interface.
    • Adopts Java servlets to provide search services.
  – Index database module
    • Provides a metadata repository for digital library sources.
    • Adopts the simple Dublin Core set as the default metadata.
    • Stores DTD mapping relationships.

System Architecture

  – Metadata/DTD manager
    • Provides an administration interface to manage XML/DTD mapping relationships.
    • Parses and translates the XML/DTD documents provided by remote digital libraries.
    • Gathers information from remote digital libraries and updates the index database repeatedly.
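One way to picture the mapping relationships is as a lookup from each library's own element names to Dublin Core elements. The Java sketch below is a minimal illustration under assumptions: the source element names (`ItemName`, `Author`, `CreationDate`, `ShelfCode`) and the sample values are hypothetical, and an in-memory map stands in for the index database. The prototype's actual mapping format is not shown in these slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Translates library-specific metadata fields to Dublin Core element names. */
final class DtdMappingSketch {

    /** Hypothetical mapping for one digital library: source element -> Dublin Core element. */
    static Map<String, String> exampleMapping() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("ItemName", "title");
        mapping.put("Author", "creator");
        mapping.put("CreationDate", "date");
        return mapping;
    }

    /** Renames mapped fields to their Dublin Core names; unmapped fields are dropped. */
    static Map<String, String> toDublinCore(Map<String, String> sourceRecord,
                                            Map<String, String> mapping) {
        Map<String, String> dcRecord = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : sourceRecord.entrySet()) {
            String dcElement = mapping.get(field.getKey());
            if (dcElement != null) {
                dcRecord.put(dcElement, field.getValue());
            }
        }
        return dcRecord;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("ItemName", "Sample scroll");
        source.put("Author", "Unknown");
        source.put("ShelfCode", "A-17"); // no Dublin Core counterpart in this mapping
        System.out.println(toDublinCore(source, exampleMapping()));
    }
}
```

Keeping one such mapping per digital library would let new sources be added by registering a mapping rather than changing the search code, which is the role the slides assign to the manager interface.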

An Experimental Prototype

• Development tools:
  – The search engine is implemented in Java for platform independence.
  – XML information is parsed with the JAXP (Java API for XML Parsing) package.
  – The database is built on MySQL, an open-source database.
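Parsing an exported record with JAXP might look like the following minimal sketch. The record layout (`<record>` with `ItemName` and `Author` children) and the class name are assumptions for illustration; the actual Metalogy export format is not shown in these slides.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Reads the child elements of one XML metadata record using the JAXP DOM API. */
final class JaxpParseSketch {

    /** Returns the element names and trimmed text values directly under the root element. */
    static Map<String, String> parseRecord(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element) { // skip text and comment nodes
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML record", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical record; element names are placeholders.
        String xml = "<record><ItemName>Sample scroll</ItemName>"
                   + "<Author>Unknown</Author></record>";
        System.out.println(parseRecord(xml));
    }
}
```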

An Experimental Prototype

• XML/DTD manager

[Screenshot: XML/DTD manager interface, with the management functions highlighted]

An Experimental Prototype

• A mapping example

[Screenshot: mapping example, with the mapping information highlighted]

An Experimental Prototype

• A search example

[Screenshot: search form; the query is a famous calligrapher, Hsi-Chih Wang (303-361 AD)]

An Experimental Prototype

• Search results

[Screenshot: search results page, with callouts for the matched metadata and the link to the resource file]
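A search that returns matched metadata together with a link to the resource could be sketched as follows. This is an illustration under assumptions, not the prototype's code: records are held in memory rather than in MySQL, matching is a case-insensitive substring test on one Dublin Core field, and all names, values, and URLs are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Keyword search over indexed Dublin Core records (illustrative sketch only). */
final class MetadataSearchSketch {

    /** One indexed record: Dublin Core fields plus a link back to the source library. */
    static final class IndexedRecord {
        final Map<String, String> dcFields;
        final String resourceUrl;

        IndexedRecord(Map<String, String> dcFields, String resourceUrl) {
            this.dcFields = dcFields;
            this.resourceUrl = resourceUrl;
        }
    }

    /** Convenience factory for a record with a creator and a title. */
    static IndexedRecord record(String creator, String title, String resourceUrl) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("creator", creator);
        fields.put("title", title);
        return new IndexedRecord(fields, resourceUrl);
    }

    /** Returns the records whose given Dublin Core field contains the query term. */
    static List<IndexedRecord> search(List<IndexedRecord> index, String dcElement, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<IndexedRecord> hits = new ArrayList<>();
        for (IndexedRecord candidate : index) {
            String value = candidate.dcFields.get(dcElement);
            if (value != null && value.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<IndexedRecord> index = new ArrayList<>();
        index.add(record("Example Calligrapher", "Calligraphy sample",
                "http://library1.example.org/item/1"));
        index.add(record("Unknown", "Landscape painting",
                "http://library2.example.org/item/7"));
        for (IndexedRecord hit : search(index, "creator", "calligrapher")) {
            System.out.println(hit.dcFields + " -> " + hit.resourceUrl);
        }
    }
}
```

Because every record carries its own resource link, one query against the shared index can return results from several digital libraries without contacting them at search time.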

Conclusions

• We present the design of a search engine for searching information across digital libraries based on metadata/XML.
• The design of the search engine has three advantages:
  – First, the system architecture is simple and the cost is low.

Conclusions

  – Second, the system is highly extensible to newly required services.
  – Third, with the uniform user interface, users need not know how or where to search for information.

Future Work

• Quality control of the metadata provided by the original digital library sources needs to be addressed.
• The mapping scheme needed to support more heterogeneous digital archives should be discussed further.

Future Work

• Performance should be addressed further for large-scale environments.
• How to update information from the remote digital libraries effectively is another important task.