Grid-enabled for Digital Archives: The Development and Applications of SRM-SRB interface for DataGrid Services
Wei-Long UENG
Academia Sinica Grid Computing
Taiwan

Grid-enabled for Digital Archives: The Development and ...event.twgrid.org/isgc2008/Presentation Meterial... · Grid-enabled for Digital Archives: The Development and Applications

Outline

• Introduction to the National Digital Archive Program Data Grid Services
• Deployment of Data Grid as the Information Infrastructure
• Interoperation
  – SRM-SRB interface implementation and application
• Summary

Goals of NDAP

• Preserve national cultural collections
• Popularize fine cultural holdings
• Revitalize cultural heritage and cultural development
• Invigorate cultural, content and value-added industries
• Enhance research, education and learning
• Promote knowledge and information sharing
• Improve literacy, creativity and quality of life
• Embrace international communities and collaboration

Cultural, Academic, Socio-economic & Educational (CASE) Values

Digital Archives in Taiwan

• Three Levels
  – Archive Level
    • high resolution, for preservation purposes
    • accessible on a case-by-case basis
  – Open-Market Level
    • medium resolution, for value-added, commercial purposes
    • accessible for a fee by membership or by licensing agreement
  – Public Information Access Level
    • low resolution, for educational purposes
    • accessible to the public free of charge

Information Infrastructure

• Digital archives/libraries are widely recognized as a crucial component of a global information infrastructure
• Pursue research and development efforts on many aspects of digital archive/library technologies:
  – (1) the establishment of standardized information reference guidelines for digital content creation, storage, and processing
  – (2) the development of common and application-specific information processing infrastructure and tools

Requirements

• Long-Term Preservation and Data Creation
  – preserving ability to read (physically) and understand (logically)
• Full Spectrum and Precise Metadata in Collection, Object and Management Level
• Workflow Support: Digital Information Life-Cycle
  – Create --> Content Analysis & Annotation --> IPR Protection --> Repurposing --> Multi-modal/Integrative Search --> Archive
• Data Exploration across Institutional and Disciplinary Domains
• Petabyte Scale Storage Management with Performance

A New Information Infrastructure is Required!

Why Grid is needed in multiple projects

• R&D and integration work is needed to help digitize and network the collections and resources of different institutes and multiple projects.

• A Data Grid service is necessary to provide virtualized middleware services for sharing data across distributed, heterogeneous data resources in different administrative and security domains.

Why Grid infrastructure for NDAP in general is required

• Digital Archives in Taiwan demand reliable storage systems for persistent digital objects, a well-organized information structure for effective content management, efficient and accurate information retrieval mechanisms, and flexible services for varied user needs.

• Grid technology offers a viable solution for the long-term preservation and processing of diverse, heterogeneous, petabyte-scale digital archives.

• A Data Grid aims to set up a computational and data-intensive grid of resources for data analysis. It requires coordinated resource sharing and collaborative processing and analysis of huge amounts of data produced and stored by many institutions.

Workflow of Digital Archives

Middleware Deployment

• The SRB system in Taiwan is used for the long-term preservation of the digital contents produced by the digital archives projects.

• The system was deployed by the Academia Sinica Grid Computing Centre (ASGC) in early 2004.

• It comprises 8 sites at different institutes.

System Architecture

Interoperation

• Given the nature of the Grid, the most effective way to share data resources is to integrate the data sources and data grids.

• The SRM interface for SRB aims to make the popular SRB Data Grid system interoperable with the EGEE infrastructure.

• It supports the standard SRM services for SRB.

Why SRM

• SRM is a single interface through which different middleware can access different back-end storage systems.
• It makes it easy to develop applications that adapt to different back-end storage systems.
• It provides space and file management on the storage system.
• SRM is a web service interface; the implementation usually depends on the back-end storage technology.
• Grid middleware needs to access files through a uniform interface.
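
The idea of one interface in front of several storage back ends can be sketched in a few lines of Python. This is only an illustration of the design idea, not the SRM web service itself: the names `StorageBackend`, `InMemoryBackend` and `copy_file` are hypothetical, and the in-memory class merely stands in for a real back end such as SRB.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical uniform interface (real SRM is a web service, not a class)."""

    @abstractmethod
    def put(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, path: str) -> bytes: ...

    @abstractmethod
    def ls(self, prefix: str) -> list[str]: ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real back end; keeps files in a dict."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def put(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def get(self, path: str) -> bytes:
        return self._files[path]

    def ls(self, prefix: str) -> list[str]:
        return sorted(p for p in self._files if p.startswith(prefix))


def copy_file(src: StorageBackend, dst: StorageBackend, path: str) -> None:
    """Client code depends only on the interface, never on the back end."""
    dst.put(path, src.get(path))
```

Because `copy_file` only sees `StorageBackend`, swapping one back end for another would not require any change on the client side, which is the property the bullets above attribute to SRM.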

Concept

Architecture Overview

Diagram labels: Core, SRB+DSI, Auxiliary Filecatalog, Gridftp/management API, SRM API, File transfer (gridftp), Web Service, Data server management, Users/applications

User Interface (diagram)

Hosts shown:
• Hostname: t-ap20.grid.sinica.edu.tw
  – Info: SRB server (SRB-DSI installed)
• Hostname: fct01.grid.sinica.edu.tw
  – The end point: httpg://fct01.grid.sinica.edu.tw:8443/axis/services/srm
  – Info: SRM interface
• Hostname: t-ap51.grid.sinica.edu.tw
  – Info: AMGA server

Other labels in the diagram: SURL, TURL, Gridftp/management commands, Host information, Return some information
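
The relationship between a SURL (sent to the SRM interface) and a TURL (a GridFTP location on the SRB server) can be illustrated with a small Python sketch. In the real system the TURL is returned by the SRM service; the functions `srb_path_from_surl` and `turl_for` below are hypothetical helpers that only reproduce the string shapes seen in this deck's demo sessions, with the GridFTP host and port 2811 taken from those sessions as default values.

```python
from urllib.parse import urlsplit


def srb_path_from_surl(surl: str) -> str:
    """Extract the SRB path from a SURL of the form srm://host:port/endpoint?[SFN=]/path."""
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError(f"not an SRM URL: {surl}")
    query = parts.query
    # The demo sessions show the path both with and without an "SFN=" prefix.
    if query.startswith("SFN="):
        query = query[len("SFN="):]
    if not query.startswith("/"):
        raise ValueError(f"no SRB path found in: {surl}")
    return query


def turl_for(surl: str,
             gridftp_host: str = "t-ap20.grid.sinica.edu.tw",
             gridftp_port: int = 2811) -> str:
    """Build a gsiftp TURL for illustration; a real TURL comes from the SRM service."""
    return f"gsiftp://{gridftp_host}:{gridftp_port}{srb_path_from_surl(surl)}"
```

For example, `turl_for("srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?SFN=/AS/home/sary357.ASGC/111.jpg")` yields `gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/111.jpg`.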

List the contents of an SRB directory

[sary357@fct01 isgc2008]$ sh SrmLs.sh

Please input the SURL you'd like to list the content:srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?/AS/home/sary357.ASGC/sary3571/

*************** SrmLs **********************

Status code: SRM_SUCCESS

Explanation: null

==========================================================

===== The individual status and result =========

The status code:SRM_SUCCESS

The explanation:null

File name:/AS/home/sary357.ASGC/sary3571

The Size:0

The File Type:DIRECTORY

The owner of this file/directory:sary357

The owner permission of this file/directory:RW

************** The sub dir and files *****************

File name:/AS/home/sary357.ASGC/sary3571/steps2.html1

CheckSumType:

The Size:9

The File Type:FILE

The owner of this file/directory:sary357

The owner permission of this file/directory:RW

File name:/AS/home/sary357.ASGC/sary3571/20080409

The Size:0

The File Type:DIRECTORY

The owner of this file/directory:sary357
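
A listing like the one above is easy to post-process. The following Python sketch assumes the "Key:Value" line format printed by the presenter's `SrmLs.sh` demo script exactly as shown in this session; `parse_srmls` and `SAMPLE` are hypothetical names, and the parser is not part of any SRM client.

```python
from __future__ import annotations

SAMPLE = """\
File name:/AS/home/sary357.ASGC/sary3571
The Size:0
The File Type:DIRECTORY
************** The sub dir and files *****************
File name:/AS/home/sary357.ASGC/sary3571/steps2.html1
CheckSumType:
The Size:9
The File Type:FILE
"""


def parse_srmls(text: str) -> list[dict[str, str]]:
    """Group 'Key:Value' lines into one dict per 'File name:' entry."""
    entries: list[dict[str, str]] = []
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # banner or blank line, no key/value pair
        key, value = key.strip(), value.strip()
        if key == "File name":
            entries.append({"name": value})
        elif entries and key == "The Size":
            entries[-1]["size"] = value
        elif entries and key == "The File Type":
            entries[-1]["type"] = value
    return entries
```

Calling `parse_srmls(SAMPLE)` gives one record for the directory itself and one for the file `steps2.html1`, each with its name, size and type as strings.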

Putting a file

[sary357@fct01 isgc2008]$ sh SrmPut.sh

Please input the local file name you'd like to put:/tmp/testFile1.txt

Please input the SURL you'd like to put:srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?/AS/home/sary357.ASGC/sary3571/testFile1.txt

***************SrmPrepareToPut***********************

Token:1207660686127

Status code: SRM_SUCCESS

Explanation: null

===============================================================

URI: srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?/AS/home/sary357.ASGC/sary3571/testFile1.txt | The status code: SRM_SPACE_AVAILABLE | TURL: gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/sary3571/testFile1.txt

**********Try to upload the data******************

Upload TURL: gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/sary3571/testFile1.txt start....

Upload TURL: gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/sary3571/testFile1.txt end......

*************** SrmPutDone *********************

Status code: SRM_SUCCESS

Explanation: null
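
The three steps visible in this session (SrmPrepareToPut, upload to the returned TURL, SrmPutDone) can be sketched as a short Python function. Everything here is a stand-in under stated assumptions: `FakeSrm` is a hypothetical in-memory object, not a real SRM client; the host `gridftp.example.invalid` is a placeholder; and `SRM_INVALID_REQUEST` is a status code from the SRM v2.2 specification that does not appear in the session above.

```python
class FakeSrm:
    """In-memory stand-in for an SRM endpoint; records the order of calls."""

    def __init__(self):
        self.calls = []
        self.stored = {}      # TURL -> uploaded bytes
        self._pending = {}    # request token -> TURL

    def prepare_to_put(self, surl):
        self.calls.append("SrmPrepareToPut")
        token = str(len(self.calls))
        path = surl.split("?", 1)[1]
        turl = "gsiftp://gridftp.example.invalid:2811" + path
        self._pending[token] = turl
        return token, turl, "SRM_SPACE_AVAILABLE"

    def upload(self, turl, data):
        # Stands in for the GridFTP transfer to the TURL.
        self.calls.append("upload")
        self.stored[turl] = data

    def put_done(self, token):
        self.calls.append("SrmPutDone")
        if self._pending.pop(token, None) is None:
            return "SRM_INVALID_REQUEST"
        return "SRM_SUCCESS"


def put_file(srm, surl, data):
    """Reserve a TURL, transfer the data, then tell SRM the put is finished."""
    token, turl, status = srm.prepare_to_put(surl)
    if status != "SRM_SPACE_AVAILABLE":
        raise RuntimeError(f"no TURL granted: {status}")
    srm.upload(turl, data)
    return srm.put_done(token)
```

The point of the sketch is the ordering: the data transfer happens outside SRM, between the prepare call that hands out the TURL and the done call that closes the request.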

Getting a file

[sary357@fct01 isgc2008]$ sh SrmGet.sh

Please input the SURL you'd like to get:srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?SFN=/AS/home/sary357.ASGC/111.jpg

Please input the local file name you'd like to store:/tmp/111.jpg

*************** SrmPrepareToGet **********************

The status:

Status code: SRM_SUCCESS

Explanation: null

========================================================

The individual result:

The SURL: srm://fct01.grid.sinica.edu.tw:8443/axis/services/srm?SFN=/AS/home/sary357.ASGC/111.jpg

the status code:SRM_FILE_PINNED

the explanation of this SURL:null

The TURL:gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/111.jpg

************* Got the TURL ****************************

************* Download file ***************************

download from TURL: gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/111.jpg start....

download from TURL: gsiftp://t-ap20.grid.sinica.edu.tw:2811/AS/home/sary357.ASGC/111.jpg end......

And the file name after downloading is /tmp/111.jpg.

*************** SrmReleaseFiles *******************

Status code: SRM_SUCCESS

Explanation: null

==========================================================
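
In this session the file is pinned by SrmPrepareToGet (status SRM_FILE_PINNED) and unpinned again by SrmReleaseFiles. One way a client could make sure the release always happens, even when the download fails, is a context manager. The Python sketch below is an assumption about client-side structure, not the presenter's script: `FakeSrm`, `pinned` and the placeholder host `gridftp.example.invalid` are all hypothetical.

```python
from contextlib import contextmanager

TURL_PREFIX = "gsiftp://gridftp.example.invalid:2811"  # placeholder host


class FakeSrm:
    """In-memory stand-in for an SRM endpoint that tracks pinned SURLs."""

    def __init__(self, files):
        self.files = files    # SRB path -> bytes
        self.pinned = set()   # SURLs currently pinned

    def prepare_to_get(self, surl):
        path = surl.split("?SFN=", 1)[1]
        self.pinned.add(surl)
        return TURL_PREFIX + path

    def release_files(self, surl):
        self.pinned.discard(surl)

    def download(self, turl):
        # Stands in for the GridFTP transfer from the TURL.
        return self.files[turl[len(TURL_PREFIX):]]


@contextmanager
def pinned(srm, surl):
    """Pin a file for reading and always release it afterwards."""
    turl = srm.prepare_to_get(surl)
    try:
        yield turl
    finally:
        srm.release_files(surl)
```

A caller would write `with pinned(srm, surl) as turl: data = srm.download(turl)`; the `finally` clause plays the role of the SrmReleaseFiles step at the end of the session.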

Summary

• More interaction and sharing of long-term preservation (LTP) and data curation experience is needed, within and between academia and industry.

• Provides a production-quality infrastructure for distributed access to centrally stored data and replicas.

• Make SRB an archival system of the gLite-based e-infrastructure.

• Support lifetime policies for files (volatile, durable, permanent) in SRB.

• Impose the same VO and security control on SRB as on the Grid infrastructure.
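
The three file lifetime types named above could be modelled as follows. This Python sketch is an assumption, not the presenter's design: the names `FileStorageType` and `action_on_expiry` are hypothetical, and the behaviour attached to each type follows a reading of the SRM v2.2 specification (volatile files may be removed once their lifetime expires, durable files are kept but flagged, permanent files have no expiry).

```python
from enum import Enum


class FileStorageType(Enum):
    VOLATILE = "volatile"
    DURABLE = "durable"
    PERMANENT = "permanent"


def action_on_expiry(kind: FileStorageType) -> str:
    """Return what a storage system might do when a file's lifetime runs out."""
    if kind is FileStorageType.VOLATILE:
        return "remove"   # storage may delete the file
    if kind is FileStorageType.DURABLE:
        return "notify"   # keep the file, raise an error or alert instead
    return "keep"         # permanent: no expiration applies
```

A policy engine on the SRB side would then only need to look up the type stored with each file to decide how to treat it.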

Many thanks for your attention.