[Figure: number of update cycles (y-axis, 0 to 350) against number of entries modified (x-axis, 0 to 16000).]
Information System Evolution
Enabling Grids for E-sciencE
EGEE-III INFSO-RI-222667
[Figure: BDII v5 update process. Diagram labels: LDAP (2170), LDAP_ADD, LDAP_MODIFY, Query, Merge, Update, Provider, Plugin, LDIF, New LDIF, LDIF DIFF, Update LDIF.]
The information system is a mission-critical component of the EGEE production infrastructure. It provides the detailed information about Grid services that is needed to discover, select and use them during Grid activities such as job and data management. Information system components are found throughout the infrastructure and are especially sensitive to information volume and query rate. It must therefore be ensured that the current components can meet the scalability requirements that come with the growth of the infrastructure. An improved Berkeley Database Information Index (BDII) [1] architecture is presented that has the potential to meet these future requirements.
Changes in the information system were monitored by recording the entries modified during each BDII update. Over a period of 9 days the changes for 1932 update cycles were recorded, which corresponds to approximately one update cycle every 7 minutes. A graph of the number of changes per cycle can be seen above. The average number of entries modified per update cycle was 12771, which corresponds to 21.8% of the total number of entries. A further investigation was conducted to find out how often each attribute type changed, and the results can be found in the table above. 97.8% of the changes are confined to 14 attributes, which is only 4% of the total attributes used. In the current implementation all entries are transported and updated during each cycle, which is inefficient given that only about a fifth of them change.
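The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the derived totals (entries and attributes) are implied by those numbers rather than reported on the poster.

```python
# Figures quoted in the text.
days_monitored = 9
update_cycles = 1932
avg_entries_modified = 12771
modified_fraction = 0.218      # 21.8% of all entries
hot_attributes = 14
hot_attribute_fraction = 0.04  # 4% of all attributes

# Average time between update cycles.
minutes_per_cycle = days_monitored * 24 * 60 / update_cycles

# Totals implied by the quoted percentages (derived, not reported).
implied_total_entries = avg_entries_modified / modified_fraction
implied_total_attributes = hot_attributes / hot_attribute_fraction

print(f"{minutes_per_cycle:.1f} minutes per update cycle")
print(f"about {implied_total_entries:.0f} entries in total")
print(f"about {implied_total_attributes:.0f} attributes in total")
```

This gives roughly 6.7 minutes per cycle, consistent with the "approximately 7 minutes" quoted above.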
The new architecture for the BDII consists of a standard LDAP database which is updated by an external process. The update process obtains LDIF from a number of sources and merges them. It then compares the result with the contents of the database and creates an LDIF file of the differences, which is used to update the database. The aim of this approach is to reduce complexity within the BDII and to speed up the update cycle, thereby enabling more data to be handled in a given time period. The increased efficiency can be seen in the graph below, which shows the one-minute load average before and after upgrading from BDII v4 to BDII v5.
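The merge, compare and diff steps described above can be sketched as follows. This is a minimal illustration and not the BDII v5 code: entries are modelled as plain dictionaries, the function names `merge` and `diff_to_ldif` and the example DNs are hypothetical, and emitting delete records for entries that have disappeared is an assumption (the text only mentions updating the database with the differences). The output uses standard LDIF change records (`changetype: add`, `modify`, `delete`).

```python
from typing import Dict, List

Entry = Dict[str, List[str]]   # attribute name -> values
Directory = Dict[str, Entry]   # DN -> entry


def merge(sources: List[Directory]) -> Directory:
    """Combine entries from several sources; later sources win on DN clashes."""
    merged: Directory = {}
    for source in sources:
        merged.update(source)
    return merged


def diff_to_ldif(new: Directory, old: Directory) -> str:
    """Return LDIF change records that turn `old` into `new`."""
    records = []
    for dn, entry in sorted(new.items()):
        if dn not in old:
            # Entry is new: emit a full add record.
            lines = [f"dn: {dn}", "changetype: add"]
            for attr, values in sorted(entry.items()):
                lines += [f"{attr}: {v}" for v in values]
            records.append("\n".join(lines))
            continue
        # Entry exists: emit only the attributes whose values differ.
        changes = []
        for attr in sorted(set(entry) | set(old[dn])):
            new_vals = entry.get(attr, [])
            old_vals = old[dn].get(attr, [])
            if sorted(new_vals) == sorted(old_vals):
                continue
            if not new_vals:
                changes.append(f"delete: {attr}\n-")
            else:
                body = "\n".join(f"{attr}: {v}" for v in new_vals)
                changes.append(f"replace: {attr}\n{body}\n-")
        if changes:
            records.append("\n".join([f"dn: {dn}", "changetype: modify"] + changes))
    # Assumption: entries no longer published by any source are deleted.
    for dn in sorted(set(old) - set(new)):
        records.append(f"dn: {dn}\nchangetype: delete")
    return "\n\n".join(records) + ("\n" if records else "")


# Hypothetical database contents and freshly merged provider output.
database = {
    "cn=a,o=grid": {"GlueCEStateRunningJobs": ["5"], "GlueCEStateWaitingJobs": ["1"]},
}
fresh = merge([
    {"cn=a,o=grid": {"GlueCEStateRunningJobs": ["7"], "GlueCEStateWaitingJobs": ["1"]}},
    {"cn=b,o=grid": {"objectClass": ["top"]}},
])
print(diff_to_ldif(fresh, database))
```

Only the attribute that changed appears in the modify record, so the amount of data written to the database scales with the number of changes rather than with the total number of entries.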
With the information being inserted into the resource BDIIs as modifications to the database, a number of possibilities open up. One is to use LDAP replication mechanisms to propagate these changes automatically to the higher levels of the system. This would be possible for the site-level BDIIs and would reduce the latency between the update of the resource BDII and the site-level BDII. Due to the use of the Freedom of Choice for Resources (FCR) [4] mechanism, it may not be possible to use LDAP replication technologies at every level. To improve efficiency in this case, a compressed content exchange mechanism could be employed, or the FCR mechanism may need to be re-evaluated.
The Glue [2] information model version 2.0 is an official recommendation from the Open Grid Forum [3]. It consolidates over 4 years of production experience with the Glue 1.x series. A common information model is required to facilitate interoperation between Grid infrastructures, and the definition of version 2.0 in an open forum will increase its adoption by other infrastructures. Migrating the EGEE information system from Glue 1.3 to 2.0 will occur in three stages. Firstly, the information system will be updated to support both versions. Secondly, the information providers will be updated to produce both 1.3 and 2.0 information. Finally, applications can start migrating from version 1.3 to 2.0. Glue 1.3 information will only be removed once applications have migrated to version 2.0.
[Figure: GLUE 2.0 model. Entities: User Domain, Admin Domain, Resource, Manager, Share, End Point, Activity, Access Policy, Mapping Policy, Service. Relationships: Negotiates Share with, Provides, Manages, Runs, Defined on, Contacts, Maps User to, Has.]
Attribute                           Share of changes
GlueCEStateTotalJobs                 9.41%
GlueCEStateFreeCpus                  9.52%
GlueSAStateUsedSpace                 5.38%
GlueCEStateFreeJobslots             19.36%
GlueCEStateWorstResponseTime        11.79%
GlueSAStateAvailableSpace            6.57%
GlueCEStateEstimatedResponseTime    12.50%
GlueCEStateRunningJobs               7.90%
GlueCEInfoTotalCpus                  4.67%
GlueCEStateWaitingJobs               6.37%
GlueCEPolicyAssignedJobSlots         0.90%
GlueServiceStartTime                 0.71%
GlueSAUsedOnlineSize                 1.34%
GlueSAFreeOnlineSize                 1.37%
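As a quick consistency check, the shares in the table above can be summed and compared with the 97.8% figure quoted in the text. The sketch below simply transcribes the table; the variable names are illustrative.

```python
# Share of all recorded changes per attribute, in percent (from the table).
change_share = {
    "GlueCEStateTotalJobs": 9.41,
    "GlueCEStateFreeCpus": 9.52,
    "GlueSAStateUsedSpace": 5.38,
    "GlueCEStateFreeJobslots": 19.36,
    "GlueCEStateWorstResponseTime": 11.79,
    "GlueSAStateAvailableSpace": 6.57,
    "GlueCEStateEstimatedResponseTime": 12.50,
    "GlueCEStateRunningJobs": 7.90,
    "GlueCEInfoTotalCpus": 4.67,
    "GlueCEStateWaitingJobs": 6.37,
    "GlueCEPolicyAssignedJobSlots": 0.90,
    "GlueServiceStartTime": 0.71,
    "GlueSAUsedOnlineSize": 1.34,
    "GlueSAFreeOnlineSize": 1.37,
}

total = sum(change_share.values())
print(f"{len(change_share)} attributes account for {total:.2f}% of changes")

# Attributes ranked by how often they change, most volatile first.
for name, share in sorted(change_share.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(f"{name}: {share:.2f}%")
```

The 14 shares add up to 97.79%, which matches the 97.8% stated in the text.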
[Figure: number of cores, sites and jobs from Sep 2003 to Jun 2008, plotted on a logarithmic axis from 1 to 1,000,000. Series: No. Cores, No. Sites, No. Jobs.]
The graph above shows that the rate at which new sites join the infrastructure is slowing, whereas the growth in the number of cores and in the number of jobs per day is accelerating. Assuming a growth rate of 50 sites per year, by 2015 there could potentially be 550 sites. Each new site would contribute more fundamental services, users and resources. Assuming an exponential growth rate for the number of cores and computing activities (jobs), by 2015 the number of cores in the EGEE infrastructure could reach 500,000 and the number of jobs per day could reach 2 million.
Overview
Investigation into the frequency of changes
BDII v5
[Figure: Improved Performance! One-minute load average before and after upgrading.]
Future Directions
GLUE 2.0
Infrastructure Growth
[Figure caption: The growth of the number of sites, cores and jobs per day.]
References:
[1] http://twiki.cern.ch/twiki//bin/view/EGEE/BDII
[2] http://forge.gridforum.org/sf/projects/glue-wg
[3] http://www.ogf.org
[4] https://lcg-fcr.cern.ch:8443/fcr/fcr.cgi
[Figure note: the infrastructure growth graph uses a log scale.]
Authors: M. W. Schulz and L. Field, CERN-IT