INFN Experience With Globus GIS

A. Cavalli - F. Semeria INFN First INFN Grid Workshop Catania, 9-11 April 2001

Introduction

› In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information.

› Within the Globus model, the Grid Information Service (GIS) is the way of making information available to Grid applications.

The Globus GIS

› Based on LDAP directory services.

› A directory is similar to a database, but specialized in hierarchical information storage/retrieval.

Why LDAP?

› PROs:
  • it is a standard way to describe and collect data.
  • it provides a distributed topological model for the data.

› CONs:
  • directories are designed more for reading than for writing. Good for a DNS, but not for storing dynamic data like the CPU load of a machine.

Data & Schema Design

› Define useful data to publish on GIS.

› Manage dynamic & static data in different ways.

› Take advantage of the flexibility of the LDAP objectclass schema.

› Complete the description of resources in Globus schema.

GIS=GRIS+GIIS

Globus 1.1.3 implements the GIS by using two kinds of LDAP servers:

› GRIS (Grid Resource Information Service): runs on each resource (machine). Its LDAP server uses a shell backend to gather the resource configuration and status. It registers itself with a GIIS, providing information about itself.

› GIIS (Grid Index Information Service): an LDAP server that runs on an organizational server and collects and caches the information provided by the GRISes registered under it.

Registration & Data Collecting

› GRIS & GIIS send a registration (LDAP) to their upper Index Server every 5 mins.

› When queried, a GIIS:
  • scans its registrations
  • drops expired ones
  • collects data from registered resources
  • returns collected & valid cached data
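
The query steps above can be sketched in a few lines of Python. This is only an illustration, not the Globus implementation: the class names, the `fetch` callable, the registration lifetime and the cache TTL are all assumptions (the slides give only the 5-minute registration interval).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

REGISTRATION_LIFETIME = 600.0  # assumed: expire after two missed 5-minute registrations
CACHE_TTL = 30.0               # assumed cache lifetime, in seconds


@dataclass
class Registration:
    fetch: Callable[[], dict]  # hypothetical callable that pulls data from the resource
    last_seen: float           # time of the most recent registration message


@dataclass
class Giis:
    registrations: Dict[str, Registration] = field(default_factory=dict)
    cache: Dict[str, Tuple[float, dict]] = field(default_factory=dict)

    def register(self, name: str, fetch: Callable[[], dict],
                 now: Optional[float] = None) -> None:
        """Record (or refresh) a registration from a GRIS or lower GIIS."""
        now = time.time() if now is None else now
        self.registrations[name] = Registration(fetch, now)

    def query(self, now: Optional[float] = None) -> Dict[str, dict]:
        now = time.time() if now is None else now
        # Scan the registrations and drop the expired ones.
        for name in list(self.registrations):
            if now - self.registrations[name].last_seen > REGISTRATION_LIFETIME:
                del self.registrations[name]
                self.cache.pop(name, None)
        # Collect fresh data only where the cache is missing or stale,
        # then return collected plus still-valid cached data.
        result = {}
        for name, reg in self.registrations.items():
            cached = self.cache.get(name)
            if cached is None or now - cached[0] > CACHE_TTL:
                cached = (now, reg.fetch())
                self.cache[name] = cached
            result[name] = cached[1]
        return result
```

Passing `now` explicitly makes the expiry logic easy to exercise without waiting for real time to pass.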

Caching

› Information is pulled by higher-level GIISes from lower GIIS/GRIS resources upon a request.

› Information is stored in the cache for a period of time (TTL = Time To Live).

› The higher the level of the GIIS, the higher the TTL and the lower the level of detail.

› Access control is needed so that only static data is stored at the higher levels.
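
One way to picture the "higher level, higher TTL, fewer details" rule is the small Python sketch below. The attribute names, the level numbering and the TTL values are hypothetical, chosen only for illustration; the slides do not specify them.

```python
# Hypothetical attribute names and TTL values, for illustration only.
STATIC_ATTRIBUTES = {"hostname", "cpu_count", "os"}
TTL_BY_LEVEL = {0: 30, 1: 300, 2: 3600}  # seconds; level 0 = site GIIS, 2 = top level


def entry_for_level(entry: dict, level: int) -> tuple:
    """Return (data_to_cache, ttl_seconds) for a GIIS at the given level."""
    ttl = TTL_BY_LEVEL[level]
    if level == 0:
        # The lowest index level keeps everything, including dynamic data.
        return dict(entry), ttl
    # Higher levels keep only static attributes, and keep them for longer.
    static_only = {k: v for k, v in entry.items() if k in STATIC_ATTRIBUTES}
    return static_only, ttl
```

For example, an entry with a `load` attribute would keep it at level 0 but lose it at levels 1 and 2.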

Extending the GRIS

› The GRIS uses programs called information providers to collect information from the machine.

› The requirements for an information provider are:
  • the program must emit LDIF objects to stdout
  • the objects generated must respect the Globus schema
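
A minimal information provider might look like the Python sketch below. It only shows the "emit LDIF on stdout" requirement: the DN, the objectclass and the attribute names are hypothetical placeholders, not taken from the Globus schema, and values that would need base64 encoding in LDIF are not handled.

```python
import os
import socket
import sys


def ldif_entry(dn: str, attributes: dict) -> str:
    """Format one LDIF record: a dn line followed by 'name: value' lines."""
    lines = [f"dn: {dn}"]
    for name, values in attributes.items():
        if not isinstance(values, (list, tuple)):
            values = [values]
        for value in values:
            lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


def main() -> None:
    host = socket.gethostname()
    entry = ldif_entry(
        f"hn={host},dc=example,o=grid",                   # hypothetical DN
        {
            "objectclass": ["ExampleComputeResource"],    # hypothetical objectclass
            "hn": host,
            "cpucount": os.cpu_count() or 1,              # hypothetical attribute name
        },
    )
    # A blank line terminates the LDIF record.
    sys.stdout.write(entry + "\n")


if __name__ == "__main__":
    main()
```

A real provider would replace the placeholder names with objectclasses and attributes defined in the schema the GRIS is configured with.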

Resource Discovery (proposal)

1. Top Level: get possible candidates using static data.

2. Mid Level: narrow the search on local Index Servers.

3. Resource Level: finalize the search using data available only at this level.
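
The three-step narrowing can be sketched with in-memory dictionaries standing in for the top-level GIIS, the site Index Servers and the resources themselves. All names and values below are made up for illustration; a real search would issue LDAP queries at each level.

```python
# Level 1: static summaries held at the top level (site -> static data).
TOP_LEVEL = {
    "mi": {"os": "linux"},
    "bo": {"os": "linux"},
    "ct": {"os": "solaris"},
}

# Level 2: per-site index servers (site -> host -> cached data).
SITE_INDEX = {
    "mi": {"mi-01": {"cpu_count": 2}, "mi-02": {"cpu_count": 8}},
    "bo": {"bo-01": {"cpu_count": 4}},
    "ct": {"ct-01": {"cpu_count": 8}},
}

# Level 3: dynamic data available only from the resource itself.
RESOURCE_STATE = {
    "mi-01": {"load": 0.1},
    "mi-02": {"load": 0.9},
    "bo-01": {"load": 0.2},
    "ct-01": {"load": 0.0},
}


def discover(os_name: str, min_cpus: int, max_load: float) -> list:
    # 1. Top level: candidate sites from static data.
    sites = [s for s, info in TOP_LEVEL.items() if info["os"] == os_name]
    # 2. Mid level: narrow the search on the local index servers.
    hosts = [
        h
        for s in sites
        for h, info in SITE_INDEX[s].items()
        if info["cpu_count"] >= min_cpus
    ]
    # 3. Resource level: finalize using data only the resource can provide.
    return [h for h in hosts if RESOURCE_STATE[h]["load"] <= max_load]
```

With this sample data, `discover("linux", 4, 0.5)` contacts only two hosts at the resource level instead of all four.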

Performance

› In the worst case the whole set of machines must be queried.

› Some indexing techniques should be used to implement search space pruning (currently the GIIS backend always fetches data for every registered host).

› A periodic information update mechanism could also be investigated.
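
As an illustration of search space pruning, the Python sketch below builds an inverted index over static attributes so that only matching hosts need to be contacted. This is one possible technique, not something the current GIIS backend does; the function and attribute names are hypothetical.

```python
from collections import defaultdict


def build_index(static_data: dict) -> dict:
    """Map each (attribute, value) pair to the set of hosts that have it."""
    index = defaultdict(set)
    for host, attrs in static_data.items():
        for name, value in attrs.items():
            index[(name, value)].add(host)
    return index


def hosts_to_query(index: dict, all_hosts, **required) -> set:
    """Return only the hosts whose static data matches every required pair."""
    candidates = set(all_hosts)
    for item in required.items():
        candidates &= index.get(item, set())
    return candidates
```

A query for `os="linux"` would then fetch dynamic data only from the Linux hosts, rather than from every registered host.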

Fault Tolerance

› Replication of Index Server data must be implemented (for now only at the root level, with Netscape LDAP server).

› Replica servers could be available through a DNS based mechanism.

Security and access policies

› In the current implementation any machine can register itself with a GIIS.

› No access control when searching the GIIS. From any LDAP client I can run:
  ldapsearch -h mds.infn.it -p 389 -s sub -b "o=grid" "objectclass=*"
  and get all the information from the GIIS.

INFN implementation

› INFN has implemented a hierarchical structure of GIISes based on INFN departments (about 25).

› Each GRIS registers itself with the site GIIS, which in turn registers itself with the top-level INFN GIIS.

Top level GIIS: dc=infn,dc=it,o=grid
  • GIIS Milano: dc=mi,dc=infn,dc=it,o=grid
    • GRIS
  • GIIS Bologna: dc=bo,dc=infn,dc=it,o=grid
    • GRIS
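
The hierarchy is encoded in the DNs themselves: dropping the leftmost component of a site suffix gives the suffix of the level above. The Python sketch below shows this with two hypothetical helper functions; the parsing is deliberately naive and assumes the DNs contain no escaped commas.

```python
def parent_suffix(dn: str) -> str:
    """Drop the leftmost RDN. Naive: assumes no escaped commas in the DN."""
    parts = [p.strip() for p in dn.split(",")]
    return ",".join(parts[1:])


def is_under(dn: str, suffix: str) -> bool:
    """True if dn equals suffix or lies below it (case-insensitive comparison)."""
    d = [p.strip().lower() for p in dn.split(",")]
    s = [p.strip().lower() for p in suffix.split(",")]
    return len(d) >= len(s) and d[len(d) - len(s):] == s
```

For instance, `parent_suffix("dc=mi,dc=infn,dc=it,o=grid")` returns the top-level suffix `dc=infn,dc=it,o=grid`.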

MDS Browser

http://bond.cnaf.infn.it/cgi-bin/mdsbrowse1.pl

Experiments resources

› Each GRIS can register itself with several different GIISes.

› This allows resources to be partitioned by experiment.

Experiment resources: topology

[Diagram; nodes shown:]
  • INFN GIIS: “dc=infn, dc=it, o=Grid”
  • MILANO GIIS: “dc=mi, dc=infn…”
  • BOLOGNA GIIS: “dc=bo, dc=infn…”
  • INFN CMS EXPERIMENT GIIS: “dc=infn, dc=it, ou=cms, o=Grid”
  • CERN CMS EXPERIMENT GIIS (at CERN): “ou=cms, o=Grid”
  • GRIS

Some tests

› We have tested how performance depends on caching and CPU load.

› Tests were run over a WAN.

› The same queries on a GIIS take:
  • ~ 1 sec. when the cache is valid
  • ~ 10 sec. or more when it has expired.

Some tests (2)

› When a GRIS has a loaded CPU the response time from its own GIIS is much longer when the cache is expired (> 1 min. vs 1 sec.)

› Also, when a GIIS has a loaded CPU and the cache is not expired, the response time is longer (6-7 sec.): this happens when the GIIS machine is also used for computation.

Some tests (3)

We have also compared the Globus MDS with the OpenLDAP/LDBM server: with the same set of data, LDAP/LDBM response times are longer.

Conclusions

› The Globus Information Service is based on a standard protocol (LDAP).

› It provides flexibility and a potentially good distributed data model.

› But...

Improvements & further studies

have to be done, mainly:

› Performance:
  • more efficient backend
  • push/pull model

› Security: authentication & ACLs

› Data: more resource attributes & data typing

› Topology: more flexible than the current hierarchical-geographical one