28
EMI INFSO-RI- 261611 EMI INFSO-RI- 261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

Embed Size (px)

Citation preview

Page 1: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

EMI proposal for a Storage Accounting Record standard

Jon Kerr NilsenDept. of Physics, Univ. of Oslo

On behalf of the EMI StAR team

Page 2: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• EMI StAR team– Paul Millar– Ralph Müller-Pfefferkorn– Zsolt Molnár– Riccardo Zappi– Jon Kerr Nilsen

• Henrik Thostrup Jensen (NDGF)

Who?

Jon Kerr Nilsen 2

Page 3: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• Why storage accounting?• Storage accounting vs. job accounting• Processing model• Storage accounting work in EMI and OGF• StAR proposal• Prototype and first feedback• Way forward

Outline

Jon Kerr Nilsen 3

Page 4: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Why storage accounting?

• Storage resources are shared between organizations – more than one owner

• Developing a storage infrastructure– We need to know how much

storage space is used, by which group/user, on which storage media

– To know where to put the money when increasing the storage space

– To know who to ask for the money to increase the storage space

• Basis for billing– Storage is expensive – Some non-academic resource

owners may not like to give it away for free

Jon Kerr Nilsen 4

Page 5: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Accounting

• Accounting is the recording and summarizing of the consumption of a resource by an individual user or a group of users

• Typically used to find out who uses how much of a set of resources

• Typically not used to find out how individual resource components are used

Jon Kerr Nilsen 5

Page 6: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Storage vs. job accounting

Job accounting• Job accounting is the recording

and summarizing of the consumption of computing resources

• Recording wall-time, cpu-time, etc. – strictly increasing values.

• When a job finishes it stays finished

• Typically measured per job – information readily available from local batch system

• Standardized through UR1.0

Storage accounting• Storage accounting is the recording and

summarizing of the consumption of a storage resource

• Storage usage varies in time – recorded usage is only relevant in a given time frame

• Data is rarely in finished state and can live for a long time

• Typically recorded through (in)frequent usage snapshots – high fluctuation rate

• No standardized way to record storage usage in a distributed environment

Jon Kerr Nilsen 6

Page 7: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Storage vs. job summary

Jon Kerr Nilsen 7

• Storage records must be treated different than compute records wrt. aggregation

• Storage records need special treatment– Only valid between measurement time and

expiration time– Can be invalidated if newer record is

registered for same resource– Nontrivial processing model

Page 8: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 8

Page 9: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 9

Page 10: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 10

Page 11: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 11

Page 12: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 12

Page 13: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 13

Page 14: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Processing model

• Records can overlap in time – cannot simply sum up to identify total consumption

• Must ask for a certain point in time and sum up the most recent, still-valid records

• To get usage per time, you must choose a set of timestamps and repeat process per time stamp – numerical integration

Jon Kerr Nilsen 14

Page 15: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• From EMI Description of Work:– EMI will address known UR issues and extend

accounting records to include VO-aware storage usage accounting

– The refined and extended standard UR format, service interface and data transport protocol will be implemented in gLite, ARC, dCache and UNICORE

– EMI will thus contribute to the definition of UR 2.0 of OGF

Storage accounting in EMI

Jon Kerr Nilsen 15

Page 16: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• “Next Generation Usage Records” OGF session at OGF31– Discussed the way forward for UR– A properly defined storage record and an improved

compute record are good steps towards UR2.0– Lessons learned in storage records can be ported

back into compute records (e.g. long-lived services)– EMI will work actively to have standard ready before

end of EMI• Being a highly active participant, EMI will co-chair

the OGF UR-WG

Storage accounting in OGF

Jon Kerr Nilsen 16

Page 17: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• EMI is finalizing a proposal for a storage accounting record (StAR)

• Contacted and received input from organizations and potential user groups– NDGF, OGF, EGI, OSG, INFN, CMS, ATLAS, …

• Current draft: StAR-EMI-tech-doc-v7.pdf• Final version to be proposed to OGF as a

standard• Will be implemented by EMI storage providers

StAR

Jon Kerr Nilsen 17

Page 18: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• To be able to arrive to a useful record in record time we need to limit the scope

• Many interesting storage-related features to record– I/O (like in Amazon S3)– Metadata access (stat,ls)– Available space– File name– ...

• Scope of StAR is limited to consumption of storage space

What StAR is not

Jon Kerr Nilsen 18

Page 19: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• Resource– Fields describing the system the resource was consumed on– StorageSystem, StorageShare

• Resource consumption– How much of the described resource has been used– ResourceCapacityUsed, LogicalCapacityUsed

• Consumption details– Fields describing what is consumed– StorageMedia, StorageClass, DirectoryPath, …

• Identity– Describes the person or group accountable for the consumption– SubjectIdentity block (UserIdentity, Group, GroupAttribute, …)

• Record identity and duration

StAR structure

Jon Kerr Nilsen 19

Page 20: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Local example

Jon Kerr Nilsen 20

<sr:StorageUsageRecord xmlns:sr="http://eu-emi.eu/namespaces/2011/02/storagerecord"> <sr:RecordIdentity sr:createTime="2010-11-09T09:06:52Z" sr:recordId="host.example.org/sr/87912469269276"/> <sr:StorageSystem>host.example.org</sr:StorageSystem> <sr:SubjectIdentity> <sr:LocalUser>johndoe</sr:LocalUser> </sr:SubjectIdentity> <sr:StorageMedia>ssd</sr:StorageMedia> <sr:FileCount>42</sr:FileCount> <sr:MeasureTime>2010-10-11T09:31:40Z</sr:MeasureTime> <sr:ValidDuration>PT3600S</sr:ValidDuration> <sr:ResourceCapacityUsed>913617</sr:ResourceCapacityUsed></sr:StorageUsageRecord>

Page 21: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

<sr:StorageUsageRecord xmlns:sr="http://eu-emi.eu/namespaces/2011/02/storagerecord"> <sr:RecordIdentity sr:createTime="2010-11-09T09:06:52Z" sr:recordId="host.example.org/sr/87912469269276"/> <sr:StorageSystem>host.example.org</sr:StorageSystem> <sr:SubjectIdentity> <sr:Group>binarydataproject.example.org</sr:Group> <sr:GroupAttribute sr:attributeType="subgroup">ukusers </sr:GroupAttribute> </sr:SubjectIdentity> <sr:StorageMedia>tape</sr:StorageMedia> <sr:FileCount>42</sr:FileCount> <sr:MeasureTime>2010-10-11T09:31:40Z</sr:MeasureTime> <sr:ValidDuration>PT3600S</sr:ValidDuration> <sr:ResourceCapacityUsed>913617</sr:ResourceCapacityUsed></sr:StorageUsageRecord>

Grid example

Jon Kerr Nilsen 21

Page 22: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• NDGF already has working prototype– Using SGAS accounting service and data from NDGF

dCache• Records inserted in SGAS, parsed and put into

PostgreSQL DB• Querying data model with measure points and valid

duration doable for anyone comfortable with putting joins in queries

• Hackish record generation (simple SQL into dCache DB)

• Still limited graphing possibilities

Prototype implemented

Jon Kerr Nilsen 22

Page 23: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Prototype implemented

Jon Kerr Nilsen 24

3/30/1

1

3/31/1

1

4/1/1

1

4/2/1

1

4/3/1

1

4/4/1

1

4/5/1

1

4/6/1

1

4/7/1

1

4/8/1

10

500

1000

1500

2000

2500

3000

aliceatlas-dkatlas-noatlas-prodatlas-userbehrmanndteamops

Usa

ge (

TB

)

Page 24: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• Information available from records has limitations (we knew that)

• Good for measuring actual usage• Good for measuring logical usage/VO• Logical usage/site/VO is more tricky• Measuring and aggregating logical usage is

not straight-forward• Free space could be useful as well but this

is tricky for same reasons as logical usage

First experiences from NDGF

Jon Kerr Nilsen 25

Page 25: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• The NDGF dCache system is composed of seven sites, each storing a lot of data

• NDGF replicates all data which has been accessed within the last 30 days to two sites

• NDGF needs usage per VO and per site

Local usage issue

Jon Kerr Nilsen 26

Page 26: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

Local usage issue

• If a file is stored on site A and site B for a VO C, then what exactly is reported as logical usage for any of the sites for that VO?

• Cannot say from record if file* is a replica or a logical file

• May need hierarchy in the record to get clear answer– More complex, less intuitive and

more error prone record– Different storage systems may have

different hierarchies – generic data model will be REALLY complicated

• We probably don’t want to go there

Jon Kerr Nilsen 27

Page 27: EMI INFSO-RI-261611 EMI proposal for a Storage Accounting Record standard Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo On behalf of the EMI StAR team

EMI I

NFS

O-R

I-261

611

EMI I

NFS

O-R

I-261

611

• Definition of StAR is a first step to establish common storage accounting record

• A step towards a common usage record (UR2.0)• Next jump (two steps in one go)

– Propose StAR to OGF to start standardization process• Will take time• Will include further discussions• May lead to changes in StAR• EMI will take active part in this process• Draft has been made available through UR-WG mailing list• Will be made available in SourceForge

– Implement StAR record into EMI storage middleware• EMI ends in two years• Want to gain user experience before end of EMI• Work already started outside EMI

• Comments and input more than welcome throughout the process!

Way forward

Jon Kerr Nilsen 28