Enabling Grids for E-sciencE

The lightweight Grid-enabled Disk Pool Manager (DPM)

Sophie Lemaitre – Jean-Philippe Baud
EGEE-OSG workshop, 25 June 2007

Agenda

• DPM architecture
• SRMv2.2
• VOMS and virtual ids
• What’s next ?
• Issues

DPM architecture

Functionality offered

• Management of disk space on geographically distributed disk servers

• Management of name space (including ACLs)

• Control interfaces
  – socket, SRM v1.1, SRM v2.1, SRM v2.2 (no srmCopy)

• Data access protocols
  – secure RFIO, gsiFTP, HTTPS, and HTTP to come

DPM architecture

• DPM Name Server
  – Namespace
  – Authorization
  – Physical files location

• DPM Server
  – Requests queuing and processing
  – Space management

• SRM Servers (v1.1, v2.1, v2.2)

• Disk Servers
  – Physical files

• Direct data transfer from/to disk server (no bottleneck)

[Diagram: clients (CLI, C API, SRM-enabled client, etc.) contact the DPM head node, which holds the /dpm/domain/home/vo namespace; data transfer goes directly between the client and the DPM disk servers.]
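
The split between head node and disk servers can be illustrated with a toy model. This is only a sketch in Python, not DPM code: the class, the function, the host names and the layout of the transfer URL are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NameServer:
    """Toy stand-in for the DPM Name Server: logical name -> physical replica."""
    replicas: dict = field(default_factory=dict)

    def register(self, logical_path, disk_server, physical_path):
        self.replicas[logical_path] = (disk_server, physical_path)

    def locate(self, logical_path):
        return self.replicas[logical_path]


def get_turl(name_server, logical_path, protocol="rfio"):
    """Head-node step: resolve the logical name and hand back a transfer URL.

    The data itself never passes through the head node; the client uses the
    returned URL to talk to the disk server directly.
    """
    disk_server, physical_path = name_server.locate(logical_path)
    return f"{protocol}://{disk_server}/{physical_path}"


# Hypothetical hosts and paths, for illustration only.
ns = NameServer()
ns.register("/dpm/example.org/home/atlas/file1",
            "disk01.example.org", "/data01/atlas/file1")
turl = get_turl(ns, "/dpm/example.org/home/atlas/file1")
```

The point of the sketch is that only the small metadata lookup touches the head node, which is why adding disk servers does not create a transfer bottleneck there.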

DPM administration

• Feedback from DPM administrators
  – “Easy to install and configure”
  – “It works for us !”
  – “Our DPM has been running untouched for months”
  – “Very good online documentation”

• Intuitive commands
  – As similar to UNIX commands as possible
  – Ex: dpns-ls, dpns-mkdir, dpns-getacl, etc.

• DPM architecture is database centric
  – No configuration file
  – Support for MySQL and Oracle

• Scalability
  – All servers (except the DPM one) can be replicated if needed (DNS load balancing)

Platforms

• Supported platforms
  – SL(C)3
  – SL(C)4
  – Mac OS X

• From next release onwards
  – GridFTP 2 instead of GridFTP 1

• GridFTP 2 plugin
  – Allows a cleaner implementation
  – Much simpler to interface to than GridFTP 1

SRMv2.2

What’s new ?

• SRMv2.2
  – Biggest effort of last year
  – Required significant changes in DPM server code

• 5 new method types
  – Space reservation: srmReserveSpace, srmReleaseSpace, …
  – Namespace operations: srmMkdir, srmLs, …
  – Permissions and ACLs: srmSetPermission, srmGetPermission, …
  – Transfer functions: srmPrepareToPut, srmPrepareToGet, …
  – Admin functions: srmPing

What’s new ?

• Retention policies
  – Given quality of disks, admin defines quality of service
  – Replica, Output, Custodial

• Access latency
  – Online, Nearline
  – Nearline will be used for BIOMED DICOM integration

• File storage type
  – Volatile, Permanent

• File pinning
  – Extend TURL lifetime (srmPrepareToGet, srmPrepareToPut)
  – Extend file lifetime in space (srmBringOnline)

Space reservation

• Static space reservation (admin)
    $ dpm-reservespace --gspace 20G --lifetime Inf --group atlas --token_desc Atlas_ESD
    $ dpm-reservespace --gspace 100M --lifetime 1h --group dteam/Role=lcgadmin --token_desc LcgAd
    $ dpm-updatespace --token_desc myspace --gspace 5G
    $ dpm-releasespace --token_desc myspace

• Dynamic space reservation (user)
  – Defined by user on request
      dpm-reservespace
      srmReserveSpace
  – Limitation on duration and size of space reserved
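
The size and lifetime values used in the commands above (20G, 100M, 1h, Inf), and the limits placed on user reservations, can be sketched as follows. This is a hedged Python illustration, not the parser used by dpm-reservespace: only suffixes that appear in the examples are handled, decimal multiples are an assumption, and the function names are hypothetical.

```python
import math

# Only the suffixes seen in the examples; decimal multiples are an assumption.
SIZE_UNITS = {"M": 10**6, "G": 10**9}
LIFETIME_UNITS = {"h": 3600}


def parse_size(text):
    """Turn '20G' or '100M' into a number of bytes."""
    if text[-1] in SIZE_UNITS:
        return int(text[:-1]) * SIZE_UNITS[text[-1]]
    return int(text)


def parse_lifetime(text):
    """Turn '1h' into seconds; 'Inf' means the reservation never expires."""
    if text == "Inf":
        return math.inf
    if text[-1] in LIFETIME_UNITS:
        return int(text[:-1]) * LIFETIME_UNITS[text[-1]]
    return int(text)


def user_request_allowed(size, lifetime, max_size, max_lifetime):
    """Dynamic (user) reservations are limited in both size and duration."""
    return size <= max_size and lifetime <= max_lifetime
```

With hypothetical site limits of 20G and 24 hours, a user request for 5G during 1h passes the check, while a request with an infinite lifetime is refused.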

VOMS & Virtual Ids

How to support VOMS ?

• Lightweight VOMS handling in DPM
  – Check that VOMS proxy signature comes from a trusted host
  – For scalability reasons, we didn’t want to contact another server for every authorization

• Why virtual ids ?
  – We didn’t want to use local users / groups that admins would need to create a priori
  – DPM instead uses virtual ids
      Stored in the DPM Name Server database
      Created automatically when user first connects with a valid proxy

DPM virtual ids

• Each user’s DN
  – Is mapped to a unique virtual uid

• Each VOMS group, each VOMS role
  – Is mapped to a unique virtual gid

• Virtual uids / gids are created automatically
  – the first time a given user / group contacts the DPM

[Diagram: the DPM Name Server returns a (uid1, gid1) pair.]
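
The "created automatically on first contact" behaviour can be pictured as a lookup-or-create table. The following Python sketch is an illustration under stated assumptions, not DPM's implementation: the class name, the starting id and the example DN are hypothetical, and the real mappings live in the DPM Name Server database rather than in memory.

```python
class VirtualIdTable:
    """Toy mapping table: hands out a new numeric id the first time a name is seen."""

    def __init__(self, first_id=101):
        self._ids = {}
        self._next_id = first_id

    def lookup_or_create(self, name):
        if name not in self._ids:
            # First contact: allocate the next free virtual id.
            self._ids[name] = self._next_id
            self._next_id += 1
        return self._ids[name]


uids = VirtualIdTable()   # one table for DNs
gids = VirtualIdTable()   # one table for VOMS groups and roles

dn = "/C=CH/O=CERN/OU=GRID/CN=Example User"   # hypothetical DN
first = (uids.lookup_or_create(dn), gids.lookup_or_create("atlas"))
again = (uids.lookup_or_create(dn), gids.lookup_or_create("atlas"))
role_gid = gids.lookup_or_create("atlas/Role=production")
```

A second contact with the same DN and group returns the same pair, while a new role gets its own gid, so no local accounts have to be created in advance.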

DPM virtual ids

Virtual uids mapping (example)

  /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268    101
  /C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461     102

Virtual gids mapping (example)

  atlas                    101
  atlas/Role=lcgadmin      102
  atlas/Role=production    103

$ grid-proxy-init
$ voms-proxy-init --voms atlas

Simone will be mapped to (uid, gid) = (102, 101)

[Diagram: the DPM Name Server looks up the mappings in its DB and returns (uid1, gid1), e.g. (102, 101).]

DPM secondary groups

Virtual uids mapping (example)

  /C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268    101
  /C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461     102

Virtual gids mapping (example)

  atlas                    101
  atlas/Role=lcgadmin      102
  atlas/Role=production    103

$ voms-proxy-init --voms atlas:/atlas/Role=production

Simone will be mapped to (uid, gid, …) = (102, 103, 101)
Simone still belongs to “atlas”

[Diagram: the DPM Name Server looks up the mappings in its DB and returns (uid1, gid1, …), e.g. (102, 103, 101).]
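
The mapping in this example can be reproduced with a few lines of Python. The tables hold the example values from the slide; the function name is hypothetical, and the sketch assumes that the proxy presents an ordered list of groups and roles in which the first entry becomes the primary group and the rest become secondary groups.

```python
# Example values taken from the mapping tables above.
UID_TABLE = {
    "/C=CH/O=CERN/OU=GRID/CN=Sophie Lemaitre 2268": 101,
    "/C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461": 102,
}
GID_TABLE = {
    "atlas": 101,
    "atlas/Role=lcgadmin": 102,
    "atlas/Role=production": 103,
}


def map_identity(dn, groups):
    """Return (uid, primary gid, secondary gids...) for a DN and its group list.

    Assumption of this sketch: the first entry of `groups` is the primary
    group and all remaining entries are secondary groups.
    """
    uid = UID_TABLE[dn]
    gids = [GID_TABLE[name] for name in groups]
    return (uid, *gids)


simone = "/C=CH/O=CERN/OU=GRID/CN=Simone Campana 7461"
plain = map_identity(simone, ["atlas"])
production = map_identity(simone, ["atlas/Role=production", "atlas"])
```

Requesting the production role changes the primary gid to 103 but keeps 101 as a secondary gid, which is what "still belongs to atlas" means in practice.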

ACLs on files

• DPM supports Posix ACLs based on Virtual Ids
  – Access Control Lists on files and directories
  – Default Access Control Lists on directories: they are inherited by the sub-directories and files under the directory

• Example
  – dpns-mkdir /dpm/cern.ch/home/dteam/jpb
  – dpns-setacl -m d:u::7,d:g::7,d:o:5 /dpm/cern.ch/home/dteam/jpb
  – dpns-getacl /dpm/cern.ch/home/dteam/jpb
      # file: /dpm/cern.ch/home/dteam/jpb
      # owner: /C=CH/O=CERN/OU=GRID/CN=Jean-Philippe Baud 7183
      # group: dteam
      user::rwx
      group::r-x              #effective:r-x
      other::r-x
      default:user::rwx
      default:group::rwx
      default:other::r-x
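
How the specification d:u::7,d:g::7,d:o:5 turns into the default: lines of the listing can be sketched in Python. This is not the dpns-setacl parser: the function names are hypothetical and only the entry forms that appear in the example are handled.

```python
TAGS = {"u": "user", "g": "group", "o": "other"}


def perm_string(digit):
    """Render an octal permission digit (0-7) as 'rwx' style text."""
    bits = int(digit)
    return "".join(flag if bits & mask else "-"
                   for flag, mask in (("r", 4), ("w", 2), ("x", 1)))


def render_acl_spec(spec):
    """Turn a spec such as 'd:u::7,d:g::7,d:o:5' into getacl-style lines."""
    lines = []
    for entry in spec.split(","):
        fields = entry.split(":")
        prefix = ""
        if fields[0] == "d":              # 'd' marks a default ACL entry
            prefix = "default:"
            fields = fields[1:]
        tag, perm = fields[0], fields[-1]
        # A qualifier (user or group name) is present only in the 3-field form.
        qualifier = fields[1] if len(fields) == 3 else ""
        lines.append(f"{prefix}{TAGS[tag]}:{qualifier}:{perm_string(perm)}")
    return lines


rendered = render_acl_spec("d:u::7,d:g::7,d:o:5")
```

The three rendered entries match the default: lines printed by dpns-getacl in the example; these are the entries that new sub-directories and files under the directory inherit.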

ACLs on pools

• DPM terminology
  – A DPM pool is a set of filesystems on DPM disk servers

• By default, pools are generic

• Possibility to dedicate a pool to one or several groups
  – dpm-addpool --poolname poolA --group alice
  – dpm-addpool --poolname poolB --group atlas,cms,lhcb

• Easy to add or remove groups
  – dpm-modifypool --poolname poolA --group +atlas,-alice
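
The effect of a group specification such as +atlas,-alice can be sketched in Python. The function is hypothetical and handles only the +group and -group forms shown above; whether dpm-modifypool accepts other forms is not covered here.

```python
def modify_groups(current, spec):
    """Apply a spec such as '+atlas,-alice' to a set of group names."""
    groups = set(current)
    for item in spec.split(","):
        if item.startswith("+"):
            groups.add(item[1:])
        elif item.startswith("-"):
            groups.discard(item[1:])
        else:
            # Simplification of this sketch: anything else is rejected.
            raise ValueError(f"expected +group or -group, got {item!r}")
    return groups


pool_a = modify_groups({"alice"}, "+atlas,-alice")
```

Starting from poolA dedicated to alice, the specification leaves the pool dedicated to atlas only.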

Authorization models

• Follow the UNIX model
  – Namespace: primary and secondary groups
  – Space reservation: primary group only
      For disk space accounting (and quotas later)
      Who actually uses the space gets to pay the bill…
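
The difference between the two models can be shown with a deliberately simplified Python sketch. The names are hypothetical, and the namespace check here looks at group membership only, ignoring owner and "other" permissions and ACL entries.

```python
from dataclasses import dataclass


@dataclass
class Identity:
    uid: int
    gids: tuple   # gids[0] is the primary group, the rest are secondary


def in_group_for_namespace(identity, entry_gid):
    """Namespace checks consider the primary and all secondary groups."""
    return entry_gid in identity.gids


def may_use_reservation(identity, space_gid):
    """Space reservations are matched against the primary group only."""
    return identity.gids[0] == space_gid


# Virtual ids from the secondary groups example: uid 102, primary gid 103, secondary gid 101.
user = Identity(uid=102, gids=(103, 101))
```

With these ids, the user is treated as a member of group 101 for namespace access, but space reserved for group 101 is not charged to or usable by this proxy, because its primary group is 103.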

What’s next ?

• Next release
  – DPM Name Server as local LFC

• Short term (autumn 2007)
  – Quotas
  – srmCopy daemon
  – Medical data management
      Encryption
      DICOM backend

• Medium term (beginning 2008)
  – NFSv4.1

Local LFC

• DPM Name Server
  – Can act as a local LFC (LCG File Catalog)

• Advantages
  – Only one service to run instead of two (LFC + DPM)
  – Transparent to the users

• Available in next release

DPM quotas

• DPM terminology
  – A DPM pool is a set of filesystems on DPM disk servers

• Unix-like quotas
  – Quotas are defined per disk pool
  – Usage in a given pool is per DN and per VOMS FQAN
  – Primary group gets charged for usage
  – Quotas in a given pool can be defined/enabled per DN and/or per VOMS FQAN
  – Quotas can be assigned by admin
  – Default quotas can be assigned by admin and applied to new users/groups contacting the DPM
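
Since quotas were still planned work at the time of the talk, the following Python sketch is only one possible reading of the rules above, not DPM's design: the class and method names are hypothetical, and a single default limit stands in for the admin-defined defaults.

```python
from collections import defaultdict


class PoolQuota:
    """Toy per-pool accounting: usage and limits keyed by DN or by VOMS FQAN."""

    def __init__(self, default_limit=None):
        self.default_limit = default_limit   # used when no explicit limit is set
        self.limits = {}
        self.usage = defaultdict(int)

    def limit_for(self, name):
        return self.limits.get(name, self.default_limit)

    def charge(self, dn, primary_fqan, nbytes):
        """Charge a write to the DN and to the primary group; refuse if over quota."""
        for name in (dn, primary_fqan):
            limit = self.limit_for(name)
            if limit is not None and self.usage[name] + nbytes > limit:
                raise RuntimeError(f"quota exceeded for {name}")
        # Both checks passed: record the usage against user and primary group.
        for name in (dn, primary_fqan):
            self.usage[name] += nbytes


quota = PoolQuota(default_limit=100)   # default applied to new users/groups
quota.limits["atlas"] = 150            # explicit limit set by the admin
quota.charge("dnA", "atlas", 80)
quota.charge("dnB", "atlas", 60)
```

After these two writes the atlas group has used 140 of its 150 units, so a further write of 20 by either user would be refused even though each user is still under the per-DN default.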

DPM quotas

• Unix-like quota interfaces
  – User interface
      dpns-quota gives quota and usage information for a given user/group (restricted to the user’s own information)
  – Administrator interface
      dpns-quotacheck to compute the current usage on an existing system
      dpns-repquota to list the usage and quota information for all users/groups
      dpns-setquota to set or change quotas for a given user/group

DPM with NFSv4.1

• NFSv4.1 and DPM have similar architectures
  – Separate metadata server
  – Direct access to physical files
  – Easy NFSv4.1 integration

Encrypted Storage

Medical community as the principal user
• large numbers of images are produced in DICOM
• privacy concerns vs. processing needs
• ease of use (image production and application)

Strong security requirements
• anonymity (patient data is separate)
• fine grained access control
• privacy (even storage administrator cannot read)

Data is encrypted (DICOM-SE) and decrypted (client) in memory

[Diagram: a client interacting with the AMGA metadata service, the Hydra KeyStores and the DICOM-SE (SRMv2, gridftp, I/O, with a DICOM plug-in in front of DICOM). Steps: 1. patient look-up; 2. keys; 3. get TURL; 3.1 get enc. image; 3.1.1 keys; 3.1.2 image; 4. read enc. image; 5. decrypt.]

Issues

• DPM is a stable and reliable service, but…

• No NFS support yet
  – For several sites, reason for not moving from Classic SE to DPM

• Lack of experience with big sites

• Lack of internal monitoring
  – Ex1: automatically disable a file system that is down
  – Ex2: automatically limit the number of transfers to a disk server

• Different VO types (HEP, BIOMED, etc.)
  – Need to develop different features for different needs

Summary

• DPM service
  – Manages space on distributed disks
  – Easy to configure and administer
  – Easy and transparent to use
  – Stable and reliable Grid service
  – Widely deployed
      125 DPM instances in EGEE
      138 VOs supported

• Short term
  – Quotas
  – NFSv4 support

[Chart: number of Storage Element instances published in EGEE top BDII, for DPM, dCache and CASTOR.]

Help ?

• DPM online documentation
  – https://twiki.cern.ch/twiki/bin/view/LCG/DataManagementDocumentation

• Support
  – [email protected]

• General questions
  – [email protected]