GIIS Implementation and Requirements F. Semeria INFN European Datagrid Conference Amsterdam, 7 March 2001


GIIS Implementation and Requirements

F. Semeria, INFN

European Datagrid Conference, Amsterdam, 7 March 2001


Introduction

› In a distributed environment like a Grid, one of the primary needs is to collect and retrieve information about resources.

› The Grid Information Service (GIS) is the means of making this information available to Grid applications.


GIS = GRIS + GIIS

› Globus implements the GIS by using two kinds of LDAP servers:

  GRIS (Grid Resource Information Service) runs on each resource (machine). It registers itself with a GIIS, providing information about itself.

  GIIS (Grid Index Information Service) usually runs on a few machines per organization and acts as a search engine for a set of GRISes.


Why LDAP?

› LDAP is a protocol for accessing directories.

› A directory is like a database, but easier to implement.


Why LDAP?

› PROs:

  it is a standard way to describe and collect data.

  it provides a distributed topological model for the data.

› CONs:

  directories are designed more for reading than for writing. Good for a DNS, but not for storing dynamic data like the CPU load of a machine.


General implementation

› The proposed implementation of the GIS is a hierarchical structure of GIISes with a root server at CERN.

› Each organization has its top level GIIS registered with the root server, but can choose its own low level topology.


EU GIIS: o=grid

  INFN (Italy): dc=infn,dc=it,o=grid

  IN2P3 (France): dc=in2p3,dc=fr,o=grid

  LIP (Portugal): dc=lip,dc=pt,o=grid

  ...


INFN implementation

› INFN has implemented a hierarchical structure of GIISes based on INFN departments (about 25).

› Each GRIS registers itself with the site GIIS, which in turn registers itself with the top level INFN GIIS.


Top level GIIS: dc=infn,dc=it,o=grid

  GIIS Milano: dc=mi,dc=infn,dc=it,o=grid

    GRIS

  GIIS Bologna: dc=bo,dc=infn,dc=it,o=grid

    GRIS
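
Because each GIIS in this tree is named by a DN suffix, the servers that cover a given entry can be found by suffix matching. The following is a minimal Python sketch of that idea, not Globus code: the suffix table is copied from the diagrams, while the function names and the `hn=` host entry used in the usage note are assumptions made for illustration.

```python
# Hypothetical table: DN suffix -> GIIS label, copied from the slide diagrams.
GIIS_SUFFIXES = {
    "o=grid": "EU GIIS",
    "dc=infn,dc=it,o=grid": "Top level INFN GIIS",
    "dc=mi,dc=infn,dc=it,o=grid": "GIIS Milano",
    "dc=bo,dc=infn,dc=it,o=grid": "GIIS Bologna",
}


def split_dn(dn):
    """Split a DN into lower-cased RDN strings (escaped commas are not handled)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def chain_to_root(dn):
    """List every known GIIS covering dn, from the most specific up to the root."""
    rdns = split_dn(dn)
    chain = []
    for start in range(len(rdns)):
        candidate = ",".join(rdns[start:])
        if candidate in GIIS_SUFFIXES:
            chain.append(GIIS_SUFFIXES[candidate])
    return chain


def closest_giis(dn):
    """Return the most specific GIIS covering dn, or None if no suffix matches."""
    chain = chain_to_root(dn)
    return chain[0] if chain else None
```

For a hypothetical entry `hn=pc1.mi.infn.it,dc=mi,dc=infn,dc=it,o=grid`, `chain_to_root` walks from GIIS Milano through the top level INFN GIIS to the EU GIIS, which mirrors the registration path described for the INFN hierarchy.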


INFN top level GIIS

› 11 GIIS’s registered

› More than 40 GRIS’s


GIS Requirements

› Each experiment needs to be able to select its own set of machines (with its own name space?)

› We need more attributes to describe the status of jobs and machines.

› Superior knowledge: referral to upper GIIS

› Data replication for backup and mirroring


Experiments resources

› Each GRIS can register itself with several GIISes.

› This allows resources to be partitioned by experiment.
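
As a rough illustration of how multiple registrations partition resources, the sketch below inverts a GRIS-to-GIIS registration table into a per-GIIS view. The host names and the registration table are invented for the example; only the GIIS labels echo the slides.

```python
from collections import defaultdict

# Hypothetical registrations: GRIS host -> GIISes it registers with.
REGISTRATIONS = {
    "pc1.mi.infn.it": ["GIIS Milano", "CMS GIIS"],
    "pc2.mi.infn.it": ["GIIS Milano"],
    "pc1.bo.infn.it": ["GIIS Bologna", "CMS GIIS"],
}


def resources_by_giis(registrations):
    """Invert the GRIS -> GIIS mapping into GIIS -> sorted list of GRIS hosts."""
    view = defaultdict(set)
    for gris, giis_list in registrations.items():
        for giis in giis_list:
            view[giis].add(gris)
    return {giis: sorted(hosts) for giis, hosts in view.items()}
```

In this toy table the experiment GIIS sees one machine from each site, while each site GIIS still sees all of its own machines.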


EU CMS GIIS exp=cms,o=grid

GIIS Milano

GRIS

GIIS Bologna

GRIS

Top level INFN GIIS dc=infn,dc=it,o=grid


Jobs and machines info

› The underlying resource management systems, such as Condor, LSF and PBS, provide useful information about machines and jobs that should be published in the GIS.


Examples of job info

› job id
› current status of the job
› the size of the executable
› the name of the user
› the submitting and the executing host
› why the job is not running
› etc.


Examples of machine info

› the total and available physical memory and swap space
› the speed of the machine in MIPS
› the number of CPUs
› the CPU load average
› etc.


Extending the GRIS

› The GRIS uses programs called information providers to collect information from the machine.

› The requirements for an information provider are:

  the program must emit LDIF objects to stdout

  the objects generated must respect the GLOBUS schema
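
To make the first requirement concrete, here is a minimal Python sketch of a program that writes one LDIF entry to stdout. It is only an illustration: the object class, attribute names and the `hn=` naming attribute are placeholders and do not claim to follow the real Globus schema, and values that would need base64 encoding in LDIF are not handled.

```python
import os
import sys


def to_ldif(dn, attributes):
    """Format one entry as LDIF: a dn line, attribute lines, then a blank line."""
    lines = ["dn: " + dn]
    for name, values in attributes.items():
        if not isinstance(values, (list, tuple)):
            values = [values]
        for value in values:
            lines.append("{}: {}".format(name, value))
    return "\n".join(lines) + "\n\n"


def machine_entry(hostname, suffix):
    """Build a (dn, attributes) pair; all attribute names are placeholders."""
    attrs = {
        "objectclass": ["ExampleComputeResource"],  # placeholder, not the Globus schema
        "hostname": hostname,
        "cpucount": os.cpu_count() or 1,
    }
    try:
        # One-minute load average; not available on every platform.
        attrs["loadaverage1min"] = "{:.2f}".format(os.getloadavg()[0])
    except (AttributeError, OSError):
        pass
    return "hn={},{}".format(hostname, suffix), attrs


if __name__ == "__main__":
    dn, attrs = machine_entry("pc1.mi.infn.it", "dc=mi,dc=infn,dc=it,o=grid")
    sys.stdout.write(to_ldif(dn, attrs))
```

A real provider would replace the placeholder names with the object classes and attributes defined by the schema, which is the second requirement listed above.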


Caching

› Information is not pushed periodically from a GRIS to a GIIS; instead, the GIIS queries the GRISes when an application needs information.

› Information is stored in a cache for a period of time (TTL = Time To Live).

› The higher the level of the GIIS, the longer the TTL and the lower the level of detail.
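
The pull-and-cache behaviour described here can be sketched in a few lines of Python. This is an illustrative model rather than the Globus implementation: the class name, the `fetch` callable and the injectable clock are assumptions made so the example is self-contained.

```python
import time


class PullCache:
    """Fetch a value on demand and reuse it until its TTL expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable(key) -> value, e.g. a query to a GRIS
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries = {}         # key -> (expiry_time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh: answer from the cache
        value = self._fetch(key)   # missing or expired: pull from the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

Layering two such caches, for example a site-level `PullCache(query_gris, 30)` wrapped by a top-level `PullCache(site.get, 300)`, models the rule that higher levels keep data longer. Here `query_gris` and both TTL values are hypothetical.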


Performance

› In the worst case the whole set of machines must be queried.

› Some indexing technique should be used to prune the search space.

› A periodic information update mechanism could also be investigated.
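
One possible reading of search space pruning is an index over slowly changing attributes, so that a query contacts only the GRISes that can match. The Python sketch below illustrates that general idea under stated assumptions; it is not a description of any existing GIIS feature, and the class, method and attribute names are hypothetical.

```python
from collections import defaultdict


class AttributeIndex:
    """Map (attribute, value) pairs to the GRISes that advertised them."""

    def __init__(self):
        self._index = defaultdict(set)
        self._all = set()

    def register(self, gris, attributes):
        """Record the static attributes a GRIS reported when it registered."""
        self._all.add(gris)
        for name, value in attributes.items():
            self._index[(name, value)].add(gris)

    def candidates(self, **required):
        """Return the GRISes matching every required attribute=value pair."""
        result = set(self._all)
        for name, value in required.items():
            result &= self._index.get((name, value), set())
        return result
```

With no requirements, `candidates()` returns every registered GRIS, which corresponds to the worst case above. Such an index only suits attributes that rarely change, such as operating system or experiment, and not dynamic values like CPU load.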


Some tests

› We tested how performance depends on caching and CPU load.

› Tests were run over a WAN.

› The same queries on a GIIS take < 1 sec. when the cache is on and > 10 sec. when it is off.


Some tests (cont.)

› When a GRIS has a loaded CPU, the response time from its own GIIS is much higher when the cache has expired (> 1 min. vs 1 sec.).

› When a GIIS has a loaded CPU, the response time is also higher (6-7 sec.) even if the cache has not expired: better not to use a GIIS machine for computation!


Security and access policies

› In the current implementation any machine can register itself with a GIIS.

› There is no access control when searching the GIIS. From any LDAP client I can run:

ldapsearch -p 389 -h mds.infn.it -b "o=grid" -s sub "*=*"

and get all the information from the GIIS.


Documentation

› The documentation is currently at www.infn.it/grid, where there are also pointers to:

  INFN Globus documentation

  INFN Globus toolkits distribution

  INFN testbed (www.infn.it/testbed-grid)

› For testbed GIS support, mailing list: [email protected]

› More general documentation for Datagrid will be available soon.


Conclusions

› The Globus Information Service is based on a standard protocol (LDAP).

› It provides flexibility and a potentially good distributed data model.

› But...


Conclusions (cont.)

› A good topology for the HEP experiments must still be implemented.

› The GRIS must be extended with new information providers.

› Lack of data replication.

› Some new mechanism should be introduced to improve performance and security.