Introduction to Grid Computing:
Ding Qing
SSE USTC
2
Overview
1. Background
2. Globus Toolkit
3. Future directions
4. Related tools
3
1. Background
Introduction
Towards global (Grid) computing
Grid Challenges and Technologies
Grid Architectures
Grid Applications
4
Introduction
5
Computing and Communication Technologies Evolution
[Figure: timeline from 1960 to 2010 with two axes, COMPUTING and Communication.]
Items shown: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, XML, W3C, Web Services, XEROX PARC worm, Mainframes, Minicomputers, PCs, Crays, MPPs, Workstations, PC Clusters, WS Clusters, PDAs, HTC, P2P, Grids, e-Science, e-Business, Computing Utility, SocialNet
6
Scalable Computing
[Figure: PERFORMANCE + QoS rises across Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, and Inter Planet Grid.]
Administrative Barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe
7
Cluster of Clusters
[Figure: Cluster 1, Cluster 2, and Cluster 3 connected over a LAN/WAN. Each cluster has a Scheduler, a Master Daemon, an Execution Daemon, and Submit, Graphical, and Control Clients.]
8
Towards global (Grid) computing
Metaphor: Applications draw computing power from a Computational Grid in the same way electrical devices draw power from an electrical grid.
http://www.sun.com/hpc/
Grid enables:
Resource Sharing
Selection
Aggregation
Grid: An Internet Computing model for coordinated resource sharing
9
A Typical Grid Computing Environment
[Figure: an Application, a Resource Broker, Grid Resource Brokers, Grid Information Services, a database, and resources R1, R2, R3, R4, R5, R6, …, RN.]
10
What is a Grid? (there are several definitions)
A type of parallel and distributed system that enables the sharing, selection, & aggregation of geographically distributed (wide-area) “autonomous” resources:
Computers – PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDA, etc;
Software – e.g., ASPs renting expensive special purpose applications on demand;
Catalogued data and databases – e.g. transparent access to human genome database;
Special devices/instruments – e.g., radio telescope – SETI@Home searching for life in the galaxy.
People/collaborators.
depending on their availability, capability, cost, and user QoS requirements.
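The selection step in this definition can be sketched as a small filter-and-rank routine. This is only an illustration under stated assumptions: the `Resource` fields, the `select_resources` function, and the sample values are hypothetical and do not come from any Grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical description of one Grid resource."""
    name: str
    available: bool       # availability: can the resource accept work now?
    mips: float           # capability, in an illustrative unit
    cost_per_hour: float  # price asked by the resource owner


def select_resources(resources, min_mips, budget_per_hour):
    """Keep available resources that satisfy the user's QoS limits, cheapest first."""
    candidates = [
        r for r in resources
        if r.available and r.mips >= min_mips and r.cost_per_hour <= budget_per_hour
    ]
    return sorted(candidates, key=lambda r: r.cost_per_hour)


# Made-up sample pool, for illustration only.
pool = [
    Resource("R1", True, 500.0, 2.0),
    Resource("R2", False, 900.0, 1.0),
    Resource("R3", True, 800.0, 1.5),
    Resource("R4", True, 200.0, 0.5),
]
chosen = select_resources(pool, min_mips=400.0, budget_per_hour=2.0)
chosen_names = [r.name for r in chosen]
```

Aggregation would then combine several of the selected resources for one application; a real broker would also re-check availability at dispatch time.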
11
Various Types of Grid Services
Computational Services (Computational Grid) – CPU cycles
  SETI@Home, NASA IPG, TeraGrid, I-Grid, …
Data Services (Data Grid)
  Data replication, management, secure access: LHC Grid/Napster
Application Services (ASP Grid)
  Access to remote software/libraries and license management: NetSolve
Interaction Services (Interaction Grid)
  eLearning, Virtual Tables, Group Communication (Access Grid), Gaming
Knowledge Services (Knowledge Grid)
  The way knowledge is acquired and managed: data mining.
Utility Computing Services (Utility Grid)
  Towards a market-based Grid computing: Leasing and delivering Grid services as ICT utilities.
12
Prominent Grid Drivers: Emerging e-Science and e-Business Apps
Next generation experiments, simulations, sensors, satellites, even people and businesses are creating a flood of data. They all involve numerous experts/resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation.
Life Sciences
Digital Biology
Finance: Portfolio analysis
Newswire & data mining: Natural language engineering
Astronomy
Internet & Ecommerce
High Energy Physics
Brain Activity Analysis
Quantum Chemistry
~PBytes/sec
13
E-Science Elements
Distributed instruments
Distributed computation
Distributed data
Peers sharing ideas and collaborative interpretation of data/results
E-Scientist
Remote Visualization
Data & Compute Service
14
Molecular Docking for Drug Design
It involves screening millions of chemical compounds (molecules) in the Chemical Databases to identify those having potential to serve as drug candidates.
Protein
Molecules
Chemical Databases (legacy, in .MOL2 format)
[Collaboration with WEHI for Medical Science, Melbourne]
15
LHC – High Energy Physics Collaboration
(fundamental investigation on the origin of mass)
16
LHC Grid Computing Model
[Figure: tiered LHC computing model.]
Tier 0: Online System, Offline Processor Farm (~20 TIPS), CERN Computer Centre
Tier 1: France Regional Centre, US Regional Centre, Italy Regional Centre, Asia Pacific Centre (~4 TIPS)
Tier 2: Tier2 Centres (~1 TIPS each), Australian Centre (~1 TIPS)
Tier 3: Institutes, including Melbourne (~0.25 TIPS); Physics data cache
Tier 4: Physicist desktop computers
Link rates shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec, ~10 to 100 Mbits/sec
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second.
Each triggered event is ~1 MByte in size.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
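The rates quoted on this slide are consistent with each other, which a few lines of arithmetic make visible. The sketch below uses only the three numbers given above; the daily volume additionally assumes continuous running for 24 hours and decimal units (1 TByte = 10^6 MBytes), both of which are assumptions made here for illustration.

```python
# Figures taken from the slide.
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100           # events kept by the trigger
EVENT_SIZE_MBYTES = 1.0             # ~1 MByte per triggered event

# Crossings per second: 1 / 25 ns = 40 million.
crossings_per_second = 1.0 / BUNCH_CROSSING_INTERVAL_S

# Recorded data rate: 100 events/s * 1 MByte = 100 MBytes/s,
# matching the ~100 MBytes/sec link in the figure.
recorded_rate_mbytes_per_s = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Fraction of crossings that survive the trigger.
fraction_kept = TRIGGERS_PER_SECOND / crossings_per_second

# Assumption: 24 hours of continuous running, 1 TByte = 1e6 MBytes.
SECONDS_PER_DAY = 24 * 60 * 60
daily_volume_tbytes = recorded_rate_mbytes_per_s * SECONDS_PER_DAY / 1e6
```

Under those assumptions roughly 8.6 TBytes would be recorded per day, and only about 2.5 crossings in every million are kept.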
17
Enterprise Computing Applications
Traditional Model
  Separate Email server, Web server, Database server, and Apps server
  Upgrade to a new server to handle more users
Grid Based Model
  Horizontal integration of Email, Web, Data, and Apps servers
  Service Virtualization Layer & Load Balancing
18
Oracle 10g: Towards Enterprise Grid Model
Traditional (e.g., Oracle 9i)
  Tight/Vertical Integration of Storage, Database, Application Hosting Server, and Application Elements
  They reside on a single computing resource.
  Enhancing capability means a new investment:
    Replace a machine with a new one or upgrade it.
    Can’t leverage existing resources.
    Expensive approach.
Grid Based (e.g., Oracle 10g)
  Disintegration of Storage, Database, Application Hosting Server, and Application Elements
  They reside on different resources in a Grid environment.
  Enhancing capability means:
    Leveraging existing resources
    Dynamic provisioning
    Cost-effective approach
19
Grid Challenges and Technologies
Security
Resource Allocation & Scheduling
Data locality
Network Management
System Management
Resource Discovery
Uniform Access
Application Construction
Realizing the Grid
The Grid Architecture
22
Virtual Organizations
• Distributed resources and people
• Linked by networks, crossing admin domains
• Sharing resources, common goals
• Dynamic
• Fault tolerant
[Figure: resources (R) grouped into two virtual organizations, VO-A and VO-B.]
23
Grid Realization Steps
The integration of individual s/w & h/w components into a combined networked resource (single system image cluster).
Low-level middleware to provide a secure and uniform access to services provided by different resources.
User-level middleware to support application development and aggregation of distributed resources.
The construction of distributed applications.
24
25
Layered Grid Architecture
APPLICATIONS
  Applications and Portals: Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
USER LEVEL MIDDLEWARE
  Development Environments and Tools: Languages/Compilers, Libraries, Debuggers, Monitors, Web tools, …
  Resource Management, Selection, and Aggregation (BROKERS)
CORE MIDDLEWARE
  Distributed Resources Coupling Services: Security, Information, Data, Process, Trading, QoS, …
SECURITY LAYER
FABRIC
  Local Resource Managers: Operating Systems, Queuing Systems, Libraries & App Kernels, Internet Protocols, …
  Networked Resources across Organizations: Computers, Networks, Storage Systems, Data Sources, Scientific Instruments, …
Major Grid Projects and Initiatives
27
Some Grid Projects & Initiatives
Australia: Nimrod-G, Gridbus, GridSim, Virtual Lab, DISCWorld, GrangeNet, etc.
Europe: UK eScience, EU Data Grid, Cactus, XtremeWeb, etc.
India: I-Grid
Japan: Ninf, DataFarm
Korea: N*Grid
Singapore: NGP
USA: AppLeS, Globus, Legion, Sun Grid Engine, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more
Cycle Stealing & .com Initiatives: Distributed.net, SETI@Home, …, Entropia, UD, SCS, …
Public Forums: Global Grid Forum, Australian Grid Forum, IEEE TFCC, CCGrid conference, P2P conference
http://www.gridcomputing.com
28
[Figure labels: mix-and-match; Object-oriented; Internet/partial-P2P; Network enabled Solvers; Economic-based Utility / Service-Oriented Computing; Nimrod-G.]
29
Overview
1. Background
2. Globus Toolkit
3. Future directions
4. Related tools
30
The Role of the Globus Toolkit
A collection of solutions to problems that come up frequently when building collaborative distributed applications
Heterogeneity
  A focus, in particular, on overcoming heterogeneity for application developers
Standards
  We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF)
  GT also includes reference implementations of new/proposed standards in these organizations
31
Layers in the Grid
32
A Typical eScience Use of Globus: Network for Earthquake Eng. Simulation
Links instruments, data, computers, people
33
Without the Globus Toolkit
[Figure: an eScience deployment assembled without the Globus Toolkit, drawn in four layers.]
Users work with client applications
Application services organize VOs & enable access to other services
Collective services aggregate &/or virtualize resources
Resources implement standard access & management interfaces
Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate authority, Data Catalog, Compute Servers, Database services, Cameras
Component counts by source:
  Application Developer: 10
  Off the Shelf: 12
  Globus Toolkit: 0
  Grid Community: 0
34
With the Globus Toolkit
[Figure: the same eScience deployment assembled with the Globus Toolkit, drawn in four layers.]
Users work with client applications
Application services organize VOs & enable access to other services
Collective services aggregate &/or virtualize resources
Resources implement standard access & management interfaces
Components shown: Web Browser, Data Viewer Tool, CHEF Chat Teamlet, Simulation Tool, Telepresence Monitor, CHEF, MyProxy, Certificate Authority, Globus MCS/RLS, Globus Index Service, Globus GRAM, Globus DAI, Compute Servers, Database services, Cameras
Component counts by source:
  Application Developer: 2
  Off the Shelf: 9
  Globus Toolkit: 4
  Grid Community: 4
35
The Globus Toolkit: “Standard Plumbing” for the Grid
Not turnkey solutions, but building blocks & tools for application developers & system integrators
  Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance
Easier to reuse than to reinvent
  Compatibility with other Grid systems comes for free
Today the majority of the GT public interfaces are usable by application developers and system integrators
  Relatively few end-user interfaces
  In general, not intended for direct use by end users (scientists, engineers, marketing specialists)
36
The Application-Infrastructure Gap
[Figure: a gap between Dynamic and/or Distributed Applications and the Shared Distributed Infrastructure.]
37
Bridging the Gap: Grid Infrastructure
Service-oriented applications
  Wrap applications as services
  Compose applications into workflows
Service-oriented Grid infrastructure
  Provision physical resources to support application workloads
[Figure labels: Users, Workflows, Composition, Invocation, Appln Service, Provisioning.]
38
Grid Infrastructure
Distributed management
  Of physical resources
  Of software services
  Of communities and their policies
Unified treatment
  Build on Web services framework
  Use WS-RF, WS-Notification (or WS-Transfer/Man) to represent/access state
  Common management abstractions & interfaces
39
Globus is Open Source
Grid Infrastructure
Implement key Web services standards
  State, notification, security, …
Software for Grid infrastructure
  Service-enable new & existing resources
  E.g., GRAM on computer, GridFTP on storage system, custom application services
  Uniform abstractions & mechanisms
Tools to build applications that exploit Grid infrastructure
  Registries, security, data management, …
Enabler of a rich tool & service ecosystem
40
An eBusiness Use of Globus: SAP Demonstration @ GlobusWorld
3 Globus-enabled applns:
  CRM: Internet Pricing Configurator (IPC)
  CRM: Workforce Management (WFM)
  SCM: Advanced Planner & Optimizer (APO)
Applications modified to:
  Adjust to varying demand & resources
  Use Globus to discover & provision resources
[Figure: SAP AG R/3 Internet Pricing & Configurator (IPC), with Web Browsers / Batch Processes (typically several thousand requests), an IPC Dispatcher, and IPC Servers.]
  1. Request: Price Query
  2. Delegation of Request
  3. Response: Pricelist, depending on Time, Discount, Number of Items, …
41
Overview
Background and Globus approach
Globus Toolkit
Future directions
Related tools
42
The Globus Toolkit is a Collection of Components
A set of loosely-coupled components, with:
  Services and clients
  Libraries
  Development tools
GT components are used to build Grid-based applications and services
  GT can be viewed as a Grid SDK
GT components can be categorized across two different dimensions
  By broad domain area
  By protocol support
43
GT Domain Areas
Core runtime: Infrastructure for building new services
Security: Apply uniform policy across distinct systems
Execution management: Provision, deploy, & manage services
Data management: Discover, transfer, & access large data
Monitoring: Discover & monitor dynamic services
44
GT Protocols
Web service protocols
  WSDL, SOAP
  WS Addressing, WSRF, WSN
  WS Security, SAML, XACML
  WS-Interoperability profile
Non Web service protocols
  Standards-based, such as GridFTP
  Custom
45
“Stateless” vs. “Stateful” Services
Without state, how does client:
  Determine what happened (success/failure)?
  Find out how many files completed?
  Receive updates when interesting events arise?
  Terminate a request?
Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state
[Figure: a Client calls move (A to B) on the FileTransferService.]
46
FileTransferService (without WSRF)
Developer reinvents wheel for each new service
  Custom management and identification of state: transferID
  Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
  Custom lifetime operation (cancel)
[Figure: the Client calls move (A to B), which returns a transferID, then uses whatHappen, tellMeWhen, and cancel against the state held by the FileTransferService.]
47
WSRF in a Nutshell
Service
State representation
  Resource
  Resource Property
State identification
  Endpoint Reference
State Interfaces
  GetRP, QueryRPs, GetMultipleRPs, SetRP
Lifetime Interfaces
  SetTerminationTime
  ImmediateDestruction
Notification Interfaces
  Subscribe
  Notify
ServiceGroups
[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over Resources and their RPs, each addressed by an EPR.]
48
FileTransferService (w/ WSRF)
Developer specifies custom method to createResource and leaves the rest to WSRF standards:
  State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
  State inspected by standard interfaces (GetRP, QueryRPs)
  Lifetime management by standard interfaces (Destroy)
[Figure: the Client calls createResource (A to B), which returns an EPR, then uses getRP, queryRPs, and destroy on the Transfer resource and its RPs.]
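The division of labour described here can be sketched in plain Python. This is not the Globus or WSRF API: every class, method, and property name below is hypothetical, the endpoint reference is reduced to an (address, key) tuple, and time is passed in as a plain number. The point is only that `create_resource` is the one service-specific operation, while property access and lifetime management are generic.

```python
import itertools


class TransferResource:
    """State of one transfer, exposed as named resource properties."""

    def __init__(self, source, destination):
        self.properties = {
            "source": source,
            "destination": destination,
            "status": "Pending",
            "filesCompleted": 0,
        }
        self.termination_time = None  # None: no scheduled destruction


class FileTransferService:
    """Hypothetical service following the resource pattern."""

    def __init__(self, address):
        self.address = address
        self._resources = {}
        self._keys = itertools.count(1)

    # Service-specific operation.
    def create_resource(self, source, destination):
        key = str(next(self._keys))
        self._resources[key] = TransferResource(source, destination)
        return (self.address, key)  # stand-in for an Endpoint Reference

    def _lookup(self, epr):
        address, key = epr
        if address != self.address or key not in self._resources:
            raise KeyError("unknown resource: %r" % (epr,))
        return self._resources[key]

    # Generic state inspection.
    def get_rp(self, epr, name):
        return self._lookup(epr).properties[name]

    # Generic lifetime management.
    def set_termination_time(self, epr, when):
        self._lookup(epr).termination_time = when

    def destroy(self, epr):
        self._lookup(epr)
        del self._resources[epr[1]]

    def sweep(self, now):
        """Destroy resources whose termination time has passed; return how many."""
        expired = [key for key, res in self._resources.items()
                   if res.termination_time is not None and res.termination_time <= now]
        for key in expired:
            del self._resources[key]
        return len(expired)


service = FileTransferService("https://example.org/FileTransferService")
epr = service.create_resource("A", "B")
initial_status = service.get_rp(epr, "status")
```

A second service built the same way would reuse `get_rp`, `set_termination_time`, `destroy`, and `sweep` unchanged, which is the reuse the slide attributes to the standard interfaces.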
Globus Toolkit version 4 (GT4)
www.globus.org
[Figure: GT4 components by domain area, split into Web Services Components and Non-WS Components, with a legend of Core, Contrib/Preview, and Deprecated.]
Security: Authentication Authorization, Community Authorization, Delegation, Credential Mgmt, Pre-WS Authentication Authorization
Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol, Pre-WS Grid Resource Alloc. & Mgmt
Info Services: Index, Trigger, WebMDS, Pre-WS Monitoring & Discovery
Common Runtime: Java WS Core, C WS Core, Python WS Core, C Common Libraries, eXtensible IO (XIO)
50
Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 (www.globus.org)
Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
Security: Authentication Authorization, Community Authorization, Delegation, Credential Mgmt
Common Runtime: Java Runtime, C Runtime, Python Runtime
Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
Info Services: Index, Trigger, WebMDS
51
GT4 Components
SERVER
  Java Services in Apache Axis Plus GT Libraries and Handlers: Your Java Service, RFT, GRAM, Delegation, Index, Trigger, Archiver
  Python hosting, GT Libraries: Your Python Service, pyGlobus WS Core
  C Services using GT Libraries and Handlers: Your C Service, C WS Core
  Other components shown: RLS, Pre-WS MDS, CAS, Pre-WS GRAM, SimpleCA, MyProxy, OGSA-DAI, GTCP, GridFTP
CLIENT
  Your Java Client, Your C Client, Your Python Client
Interoperable WS-I-compliant SOAP messaging
X.509 credentials = common authentication
52
Goals for GT4
Usability, reliability, scalability, …
  Web service components have quality equal or superior to pre-WS components
  Documentation at acceptable quality level
Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform
  WS-I Basic Profile compliant
  WS-I Basic Security Profile compliant
New components, platforms, languages
  And links to larger Globus ecosystem
53
54
GT4 Web Services Runtime
Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services
Redesign to enhance scalability, modularity, performance, usability
Leverages existing WS standards
  WS-I Basic Profile: WSDL, SOAP, etc.
  WS-Security, WS-Addressing
Adds support for emerging WS standards
  WS-Resource Framework, WS-Notification
Java, Python, & C hosting environments
  Java is standard Apache
55
GT4 WS Core in a Nutshell
[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over Resources and their RPs, each addressed by an EPR.]
Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
Operation Providers: pre-built implementations of WSRF operations
Notification implementation: Topics, TopicSet, Embedded Notification Consumer service
Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)
57
GT4 WS Core in a Nutshell
[Figure: a Service Container hosting three services, each with a ResourceHome, Resources, RPs, and EPRs.]
Service Container: host multiple services in container; one JVM process
…more details: based on AXIS service container, processes SOAP messages, ResourceContext extension.
58
GT4 WS Core in a Nutshell
[Figure: a Service Container hosting three services (each with a ResourceHome, Resources, RPs, and EPRs), guarded by a PIP and a PDP.]
Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
Configurable Security Policies: Policy Information Points (PIPs), Policy Decision Points (PDP), chained
Example authorization PDPs: GridMap, SAML implementations, XACML policies
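The chaining of Policy Information Points and Policy Decision Points can be sketched as follows. This is a plain-Python illustration, not the GT4 security API: the function names, the request dictionary, the operation name, and the combining rule (every PDP in the chain must permit) are assumptions made for the example.

```python
PERMIT, DENY = "Permit", "Deny"


def gridmap_pip(request, gridmap):
    """PIP: enrich the request with the local account mapped to the caller's subject."""
    enriched = dict(request)
    enriched["local_user"] = gridmap.get(request["subject"])
    return enriched


def gridmap_pdp(request):
    """PDP: permit only callers that have a grid-map entry."""
    return PERMIT if request.get("local_user") else DENY


def operation_pdp(allowed_operations):
    """Build a PDP that permits only the listed operations (per-operation policy)."""
    def decide(request):
        return PERMIT if request["operation"] in allowed_operations else DENY
    return decide


def authorize(request, pips, pdps):
    """Run every PIP, then the PDP chain; assumed rule: the first Deny wins."""
    for pip in pips:
        request = pip(request)
    for pdp in pdps:
        if pdp(request) != PERMIT:
            return DENY
    return PERMIT


# Hypothetical grid-map and operation names.
gridmap = {"/O=Grid/CN=Alice": "alice"}
pips = [lambda req: gridmap_pip(req, gridmap)]
pdps = [gridmap_pdp, operation_pdp({"createManagedJob"})]

decision = authorize(
    {"subject": "/O=Grid/CN=Alice", "operation": "createManagedJob"}, pips, pdps)
```

Because PIPs only collect attributes and PDPs only decide, a SAML or XACML based decision point could be appended to `pdps` without touching the other links of the chain.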
59
GT4 WS Core in a Nutshell
[Figure: a Service Container hosting three services (each with a ResourceHome, Resources, RPs, and EPRs), a PIP and a PDP, and a WorkManager, DB Conn Pool, and JNDI Directory.]
WorkManager: “thread pool”, site independent “work” manager
Apache Database Connection Pool library (JDBC “DataSource” implementation)
JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, Configuration objects, …)
60
GT4 WS Core in a Nutshell
[Figure: a Service Container (three services with ResourceHomes, Resources, RPs, and EPRs; PIP and PDP; WorkManager, DB Conn Pool, JNDI Directory) shown inside Apache Tomcat.]
Deploy Service Container “standalone” or within Apache Tomcat
61
GT4 Web Services Runtime
[Figure: layered stack inside the GT4 Container, with Registry and Administration alongside.]
User Applications
Custom Web Services; Custom WSRF Web Services; GT4 WSRF Web Services
WS-Addressing, WSRF, WS-Notification
WSDL, SOAP, WS-Security
62
Modeling State in Web Services
[Figure: a Service requestor (e.g., user application), a Factory service, a Registry, and Stateful Entities.]
Factory service: Create Stateful Entity, returning a State Address; Resource allocation; Register Stateful Entity
Registry: Discovery
Stateful Entities: State inspection, Lifetime mgmt, Notifications
Interactions standardized using WSDL and SOAP
Authentication & Authorization are applied to all requests
63
WSRF & WS-Notification
Naming and bindings (basis for virtualization)
  Every resource can be uniquely referenced, and has one or more associated services for interacting with it
Lifecycle (basis for fault resilient state mgmt)
  Resources created by services following factory pattern
  Resources destroyed immediately or scheduled
Information model (basis for monitoring, discovery)
  Resource properties associated with resources
  Operations for querying and setting this info
  Asynchronous notification of changes to properties
Service groups (basis for registries, collective svcs)
  Group membership rules & membership management
Base Fault type
64
65
Globus Security
Control access to shared services
  Address autonomous management, e.g., different policy in different work-groups
Support multi-user collaborations
  Federate through mutually trusted services
  Local policy authorities rule
Allow users and application communities to set up dynamic trust domains
  Personal/VO collection of resources working together based on trust of user/VO
66
Virtual Organization (VO) Concept
VO for each application or workload
Carve out and configure resources for a particular use and set of users
[Figure: Organization A and Organization B contribute people and resources to Virtual Community C.]
Organizations: Compute Server C1, Compute Server C2, Compute Server C3, File server F1 (disks A and B), Person A (Faculty), Person B (Staff), Person C (Student), Person D (Staff), Person E (Faculty), Person F (Faculty)
Virtual Community C: Person A (Principal Investigator), Person B (Administrator), Person D (Researcher), Person E (Researcher), Compute Server C1', File server F1 (disk A)
67
GT4 Security
[Figure: Users, a VO, Services (running on user’s behalf), and a Compute Center, connected by Rights, Rights’, and Access.]
Local policy on VO identity or attribute authority
CAS or VOMS issuing SAML or X.509 ACs
SSL/WS-Security with Proxy Certificates
Authz Callout: SAML, XACML
KCA
MyProxy
68
GT4 Security
Public-key-based authentication
Extensible authorization framework based on Web services standards
  SAML-based authorization callout
    As specified in GGF OGSA-Authz WG
  Integrated policy decision engine
    XACML policy language, per-operation policies, pluggable
Credential management service
  MyProxy (One time password support)
Community Authorization Service
Standalone delegation service
69
GT4’s Use of Security Standards
[Table residue, column annotations only: Supported, but slow; Supported, but insecure; Fastest, so default]
70
GT-XACML Integration
eXtensible Access Control Markup Language
  OASIS standard, open source implementations
XACML: sophisticated policy language
Globus Toolkit ships with XACML runtime
  Included in every client and server built on GT
  Turned-on through configuration
… that can be called transparently from runtime and/or explicitly from application …
… and we use the XACML-“model” for our Authz Processing Framework
71
GT Authorization Framework
72
Other Security Services Include …
MyProxy
  Simplified credential management
  Web portal integration
  Single-sign-on support
KCA & kx.509
  Bridging into/out-of Kerberos domains
SimpleCA
  Online credential generation
PERMIS
  Authorization service callout
73
Example: Globus Security Architecture
Diagram of Globus security architecture.
74
75
GT4 Data Management
Stage/move large data to/from nodes
  GridFTP, Reliable File Transfer (RFT)
  Alone, and integrated with GRAM
Locate data of interest
  Replica Location Service (RLS)
Replicate data for performance/reliability
  Distributed Replication Service (DRS)
Provide access to diverse data sources
  File systems, parallel file systems, hierarchical storage: GridFTP
  Databases: OGSA DAI
76
GridFTP in GT4
100% Globus code
  No licensing issues
  Stable, extensible
IPv6 Support
XIO for different transports
Striping: multi-Gb/sec wide area transport
  27 Gbit/s on 30 Gbit/s link
Pluggable
  Front-end: e.g., future WS control channel
  Back-end: e.g., HPSS, cluster file systems
  Transfer: e.g., UDP, NetBLT transport
[Figure: Bandwidth Vs Striping, disk-to-disk on TeraGrid. Bandwidth (Mbps, 0 to 20000) against Degree of Striping (0 to 70), with one curve each for # Stream = 1, 2, 4, 8, 16, and 32.]
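The idea behind striping is that one file is cut into blocks which several data movers transfer in parallel. The sketch below shows one way to plan such a split. The round-robin assignment, the function name `partition_blocks`, and the block size are assumptions chosen for illustration; they are not a description of how GridFTP lays out data.

```python
def partition_blocks(file_size, block_size, num_stripes):
    """Assign fixed-size blocks to stripes round-robin.

    Returns one list per stripe; each entry is an (offset, length) pair in bytes.
    """
    if file_size < 0 or block_size <= 0 or num_stripes <= 0:
        raise ValueError("sizes must be positive (file_size may be zero)")
    plan = [[] for _ in range(num_stripes)]
    offset = 0
    index = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # last block may be short
        plan[index % num_stripes].append((offset, length))
        offset += length
        index += 1
    return plan


# A 10-byte "file" cut into 4-byte blocks over 2 stripes.
example_plan = partition_blocks(file_size=10, block_size=4, num_stripes=2)
```

Each stripe can then move its blocks independently, and the receiver reassembles the file from the offsets, which is why aggregate bandwidth can grow with the degree of striping until the link or the disks saturate.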
77
Reliable File Transfer: Third Party Transfer
Fire-and-forget transfer
Web services interface
Many files & directories
Integrated failure recovery
Has transferred 900K files
[Figure: an RFT Client exchanges SOAP Messages and optional Notifications with the RFT Service, which works with two GridFTP Servers. Each GridFTP Server contains a Protocol Interpreter, a Master DSI, a Slave DSI, an IPC Receiver, an IPC Link, and Data Channels.]
78
Replica Location Service
Identify location of files via logical to physical name map
Distributed indexing of names, fault tolerant update protocols
GT4 version scalable & stable
Managing ~40 million files across ~10 sites
[Figure: Index nodes and Local DBs.]
Local DB    Update send (secs)    Bloom filter (secs)    Bloom filter (bits)
10K         <1                    2                      1 M
1 M         2                     24                     10 M
5 M         7                     175                    50 M
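The logical-to-physical mapping and the Bloom filter summaries mentioned above can be sketched as follows. This is an illustrative model only: the class names, the SHA-256 based hashing, the filter size, and the sample file names are assumptions, and the real Replica Location Service differs in protocol and storage.

```python
import hashlib


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class LocalReplicaCatalog:
    """Per-site map from logical file names to physical locations."""

    def __init__(self, site):
        self.site = site
        self.mappings = {}

    def register(self, logical_name, physical_location):
        self.mappings.setdefault(logical_name, []).append(physical_location)

    def summary(self, num_bits=1024, num_hashes=3):
        bloom = BloomFilter(num_bits, num_hashes)
        for logical_name in self.mappings:
            bloom.add(logical_name)
        return bloom


class ReplicaIndex:
    """Index that holds only a Bloom filter summary per site."""

    def __init__(self):
        self.sites = {}

    def update(self, catalog):
        self.sites[catalog.site] = (catalog, catalog.summary())

    def lookup(self, logical_name):
        locations = []
        for catalog, bloom in self.sites.values():
            if bloom.might_contain(logical_name):  # may be a false positive
                locations.extend(catalog.mappings.get(logical_name, []))
        return locations


site_a = LocalReplicaCatalog("site-a")
site_a.register("lfn://run42/events.dat", "gsiftp://a.example.org/data/events.dat")
index = ReplicaIndex()
index.update(site_a)
found = index.lookup("lfn://run42/events.dat")
```

The index never stores the full name list, so an update only ships the bit array; that trade of exactness for size is what the update and filter columns in the table measure.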
79
Reliable Wide Area Data Replication
LIGO Gravitational Wave Observatory
Replicating >1 Terabyte/day to 8 sites
>30 million replicas so far
MTBF = 1 month
Sites labelled in the figure: Cardiff, AEI/Golm, Birmingham
www.globus.org/solutions
80
OGSA-DAI
Provide service-based access to structured data resources as part of Globus
Specify a selection of interfaces tailored to various styles of data access—starting with relational and XML
81
The OGSA-DAI Framework
[Figure: Applications use the Client Toolkit to call the OGSA-DAI service, whose Engine runs Activities against Data Resources.]
Activities: SQLQuery, XPath, readFile, XSLT, GZip, GridFTP
Data Resources: JDBC, XMLDB, File
Databases: MySQL, DB2, SQLServer, XIndice, SWISSPROT
82
Extensibility Example
[Figure: the OGSA-DAI service Engine with the SQLQuery activity over JDBC to MySQL, extended with a MultipleSQL GDS that issues SQL over several JDBC connections.]
83
OGSA-DAI: A Framework for Building Applications
Supports data access, insert and update
  Relational: MySQL, Oracle, DB2, SQL Server, Postgres
  XML: Xindice, eXist
  Files – CSV, BinX, EMBL, OMIM, SWISSPROT, …
Supports data delivery
  SOAP over HTTP
  FTP; GridFTP
  E-mail
  Inter-service
Supports data transformation
  XSLT
  ZIP; GZIP
Supports security
  X.509 certificate based security
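The access, transformation, and delivery steps listed above compose naturally into a pipeline of activities. The sketch below imitates that shape with the Python standard library only. It is not the OGSA-DAI API: the function names are hypothetical, SQLite stands in for the relational resource, and a CSV conversion stands in for an XSLT transformation because XSLT is not available in the standard library.

```python
import csv
import gzip
import io
import sqlite3


def sql_query(connection, statement):
    """Access activity: run a query and return (column names, rows)."""
    cursor = connection.execute(statement)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


def to_csv(result):
    """Transformation activity: render a query result as CSV text."""
    columns, rows = result
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def gzip_text(text):
    """Transformation activity: compress text before delivery."""
    return gzip.compress(text.encode("utf-8"))


def run_pipeline(value, activities):
    """Feed the output of each activity into the next one."""
    for activity in activities:
        value = activity(value)
    return value


# In-memory database with made-up rows, for illustration only.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE protein (id TEXT, length INTEGER)")
connection.executemany("INSERT INTO protein VALUES (?, ?)",
                       [("P1", 120), ("P2", 340)])

payload = run_pipeline(
    "SELECT id, length FROM protein ORDER BY id",
    [lambda statement: sql_query(connection, statement), to_csv, gzip_text],
)
```

Swapping `gzip_text` for another transformation, or appending a delivery step, changes the pipeline without touching the query activity, which is the mix-and-match property the framework is built around.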
84
OGSA-DAI: Other Features
A framework for building data clients
  Client toolkit library for application developers
A framework for developing functionality
  Extend existing activities, or implement your own
  Mix and match activities to provide functionality you need
Highly extensible
  Customise our out-of-the-box product
  Provide your own services, client-side support, and data-related functionality
85
86
Execution Management (GRAM)
Common WS interface to schedulers
  Unix, Condor, LSF, PBS, SGE, …
More generally: interface for process execution management
  Lay down execution environment
  Stage data
  Monitor & manage lifecycle
  Kill it, clean up
A basis for application-driven provisioning
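The lifecycle steps above (stage data, run, monitor, kill, clean up) can be modelled as a small state machine. The state names and allowed transitions below are an illustrative assumption, not the authoritative GRAM state model, and the `ManagedJob` class is hypothetical.

```python
from enum import Enum


class JobState(Enum):
    UNSUBMITTED = "Unsubmitted"
    STAGE_IN = "StageIn"
    PENDING = "Pending"
    ACTIVE = "Active"
    STAGE_OUT = "StageOut"
    CLEAN_UP = "CleanUp"
    DONE = "Done"
    FAILED = "Failed"


# Assumed transitions: every non-terminal state may jump to CLEAN_UP when killed.
TRANSITIONS = {
    JobState.UNSUBMITTED: {JobState.STAGE_IN, JobState.CLEAN_UP},
    JobState.STAGE_IN: {JobState.PENDING, JobState.CLEAN_UP},
    JobState.PENDING: {JobState.ACTIVE, JobState.CLEAN_UP},
    JobState.ACTIVE: {JobState.STAGE_OUT, JobState.CLEAN_UP},
    JobState.STAGE_OUT: {JobState.CLEAN_UP},
    JobState.CLEAN_UP: {JobState.DONE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}


class ManagedJob:
    """Tracks one job and records every state it passes through."""

    def __init__(self):
        self.state = JobState.UNSUBMITTED
        self.history = [self.state]

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)

    def kill(self):
        """Kill a running job: clean up first, then record the failure."""
        if self.state in (JobState.DONE, JobState.FAILED):
            return
        if self.state is not JobState.CLEAN_UP:
            self.move_to(JobState.CLEAN_UP)
        self.move_to(JobState.FAILED)


job = ManagedJob()
for step in (JobState.STAGE_IN, JobState.PENDING, JobState.ACTIVE,
             JobState.STAGE_OUT, JobState.CLEAN_UP, JobState.DONE):
    job.move_to(step)
```

Recording the history gives a monitor something to report, and routing every exit through `CLEAN_UP` keeps staged files from being left behind after a kill.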
87
GT4 WS GRAM
2nd-generation WS implementation optimized for performance, flexibility, stability, scalability
Streamlined critical path
  Use only what you need
Flexible credential management
  Credential cache & delegation service
GridFTP & RFT used for data operations
  Data staging & streaming output
  Eliminates redundant GASS code
88
GT4 WS GRAM Architecture
[Figure: Service host(s) and compute element(s), plus Remote storage element(s).]
Client: Job functions, Delegate
GT4 Java Container: GRAM services, Delegation, RFT File Transfer (Transfer request)
Compute element: sudo, GRAM adapter, Local job control, Local scheduler, User job, GridFTP, SEG (Job events)
Remote storage element(s): GridFTP, reached by FTP control and FTP data
89
GT4 WS GRAM Architecture
[Figure: GT4 WS GRAM architecture diagram.]
Delegated credential can be: Made available to the application
90
GT4 WS GRAM Architecture
[Figure: GT4 WS GRAM architecture diagram.]
Delegated credential can be: Used to authenticate with RFT
91
GT4 WS GRAM Architecture
[Figure: GT4 WS GRAM architecture diagram.]
Delegated credential can be: Used to authenticate with GridFTP
92
WS GRAM Performance
Time to submit a basic GRAM job
  Pre-WS GRAM: < 1 second
  WS GRAM: 2 seconds
Concurrent jobs
  Pre-WS GRAM: 300 jobs
  WS GRAM: 32,000 jobs
Various studies are underway to test latest software
93
GT4 WS GRAM Performance
Columns: Number of Client Threads (M) = 1, 2, 4, 8, 16, 32, 64, 128
Rows: Sustained Job Load Per Client Thread (N); values listed in the order extracted
N = 1: 7 15 29 57 80 69 69 70
N = 2: 15 29 58 79 74 70 70 64
N = 4: 29 58 78 77 68 69 52 69
N = 8: 59 77 77 72 65 27 69
N = 16: 77 77 75 64 27 50
N = 32: 76 75 68 64 67
N = 64: 75 73 70 66 65
N = 128: 80 72 64 63 71
All numbers are simple jobs/minute, no delegation or staging
94
Workspace Service: The Hosted Activity
[Figure: a Client, an Interface, a Policy, a Resource provider, an Environment, and an Activity.]
Environment: Allocate/provision, Configure
Activity: Initiate activity, Monitor activity, Control activity
95
Activities Can Be Nested
[Figure: nested hosted activities. Labels: policy; client (three instances); environment; interface; resource provider.]
96
For Example …
Physical machine: procure hardware
Hypervisor/OS: deploy hypervisor/OS
VM, VM: deploy virtual machine
JVM: deploy container
JVM: deploy service
Provisioning, management, and monitoring at all levels
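The layering in this example can be sketched as a small tree of nested activities. This is only an illustration of the nesting idea: the `Activity` class and its `host` and `walk` methods are hypothetical names, not part of the Workspace Service or any Globus API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Activity:
    """A hosted activity that can itself host further activities (hypothetical model)."""
    name: str
    action: str  # what has to be done to bring this layer up
    children: List["Activity"] = field(default_factory=list)

    def host(self, child: "Activity") -> "Activity":
        """Nest a child activity inside this one and return the child."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Activity"]]:
        """Yield (depth, activity) pairs, outermost layer first."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Layers and actions are taken from the slide; the structure is an assumption.
machine = Activity("Physical machine", "procure hardware")
hypervisor = machine.host(Activity("Hypervisor/OS", "deploy hypervisor/OS"))
vm = hypervisor.host(Activity("VM", "deploy virtual machine"))
container = vm.host(Activity("JVM (container)", "deploy container"))
container.host(Activity("JVM (service)", "deploy service"))

for depth, activity in machine.walk():
    print("  " * depth + f"{activity.name}: {activity.action}")
```

Each layer is provisioned, managed, and monitored through the same small interface, which is the point the slide makes about "all levels".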
97
Dynamic Service Deployment
Community A … Community Z
• Community scheduling logic
• Data distribution
• Community management
• Science services
• ...
Requirements:
• Community control
• Persistence
• Resource guarantees
• Non-interference
98
Virtual Machine Costs
[Figure: cost comparison of three cases: GRAM job; GRAM job in paused VM; job in booted VM.]
99
Virtual OSG Clusters
[Figure: virtual OSG clusters. Labels: OSG cluster; Xen hypervisors; TeraGrid cluster; OSG.]
100
Globus Toolkit: Open Source Grid Infrastructure
Globus Toolkit v4 (www.globus.org)
[Figure: GT4 component map, grouped by area]
Security: Authentication Authorization; Community Authorization; Delegation; Credential Mgmt
Data Mgmt: GridFTP; Reliable File Transfer; Data Access & Integration; Data Replication; Replica Location
Execution Mgmt: Grid Resource Allocation & Management; Community Scheduling Framework; Workspace Management; Grid Telecontrol Protocol
Info Services: Index; Trigger; WebMDS
Common Runtime: Java Runtime; C Runtime; Python Runtime
101
Monitoring and Discovery
“Every service should be monitorable and discoverable using common mechanisms”
• WSRF/WSN provides those mechanisms
A common aggregator framework for collecting information from services, thus:
• MDS-Index: XPath queries, with caching
• MDS-Trigger: perform action on condition
• (MDS-Archiver: XPath on historical data)
Deep integration with Globus containers & services: every GT4 service is discoverable
• GRAM, RFT, GridFTP, CAS, …
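To make the Index and Trigger roles concrete, here is a minimal sketch of an aggregator that answers XPath queries from a cache and runs an action when a condition matches. It uses only the Python standard library; the class `IndexSketch`, the function `run_trigger`, and the element names `Entry`, `Status`, and `QueuedJobs` are assumptions made for illustration and do not reflect the actual MDS4 schema or API.

```python
import xml.etree.ElementTree as ET


class IndexSketch:
    """Toy aggregator: collects per-service properties and answers XPath queries."""

    def __init__(self):
        self._entries = {}  # service name -> property dict
        self._cache = {}    # XPath expression -> list of matching service names

    def register(self, service, properties):
        self._entries[service] = dict(properties)
        self._cache.clear()  # cached answers may be stale after a registration

    def _document(self):
        # Build one aggregated XML document from all registered services.
        root = ET.Element("Index")
        for service, props in self._entries.items():
            entry = ET.SubElement(root, "Entry", {"service": service})
            for name, value in props.items():
                ET.SubElement(entry, name).text = str(value)
        return root

    def query(self, xpath):
        # Evaluate the XPath only on a cache miss.
        if xpath not in self._cache:
            matches = self._document().findall(xpath)
            self._cache[xpath] = [m.get("service") for m in matches]
        return self._cache[xpath]


def run_trigger(index, xpath, action):
    """Perform `action` when the condition (an XPath query) matches anything."""
    matches = index.query(xpath)
    if matches:
        action(matches)
    return matches


index = IndexSketch()
index.register("GRAM", {"Status": "up", "QueuedJobs": 12})
index.register("RFT", {"Status": "down"})
index.register("GridFTP", {"Status": "up"})

up_services = index.query(".//Entry[Status='up']")
alerts = []
run_trigger(index, ".//Entry[Status='down']", alerts.append)
print(up_services, alerts)
```

The cache is cleared on every registration, which keeps the sketch simple; a real aggregator would need a finer-grained invalidation policy.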
102
GT4 Monitoring & Discovery
[Figure: monitoring and discovery architecture. Labels: GT4 containers hosting GRAM, RFT, and MDS-Index, with automated registration in the container; GridFTP connected through an adapter; user; clients (e.g., WebMDS); registration & WSRF/WSN access; custom protocols for non-WSRF entities; WS-ServiceGroup.]
103
Index Server Performance
As the MDS4 Index grows, query rate drops and response time rises, although sublinearly
Response time rises because the data transfer size increases
• The full Index is returned
• The response is rebuilt for every query
Real question: how much over simple WS-N performance?
104
Information Providers
GT4 information providers collect information from some system and make it accessible as WSRF resource properties
Growing number of information providers
• Ganglia, CluMon, Nagios
• SGE, LSF, OpenPBS, PBSPro, Torque
Many opportunities to build additional ones
• E.g., network monitoring, storage systems, various sensors
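As a rough illustration of what an information provider does, the sketch below reads disk usage from the local system and renders it as a small XML document of named properties. The function names and the element names (`ResourceProperties`, `TotalBytes`, `FreeBytes`) are hypothetical; a real GT4 provider would publish through the WSRF resource property mechanism and an agreed schema rather than this ad hoc format.

```python
import shutil
import xml.etree.ElementTree as ET


def collect_storage_info(path="."):
    """Collect raw values from the underlying system (here: local disk usage)."""
    usage = shutil.disk_usage(path)
    return {"TotalBytes": usage.total, "FreeBytes": usage.free}


def to_resource_properties(source, values):
    """Render collected values as an XML document of named properties."""
    root = ET.Element("ResourceProperties", {"source": source})
    for name, value in values.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


document = to_resource_properties("storage", collect_storage_info())
print(document)
```

Splitting collection from rendering mirrors the slide's description: the collector is specific to the monitored system, while the publishing side stays the same for every provider.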
105
[Figure: GT4 server-side and client-side components.
SERVER hosting environments: Java Services in Apache Axis Plus GT Libraries and Handlers; Python hosting, GT Libraries (pyGlobus WS Core); C WS Core; C Services using GT Libraries and Handlers.
SERVER components: Your Java Service; Your Python Service; Your C Service; RFT; GRAM; Delegation; Index; Trigger; Archiver; RLS; Pre-WS MDS; CAS; Pre-WS GRAM; SimpleCA; MyProxy; OGSA-DAI; GTCP; GridFTP.
CLIENT: Your Java Client; Your C Client; Your Python Client.
Interoperable WS-I-compliant SOAP messaging. X.509 credentials = common authentication.]
GT4 Summary
GT4 Documentation is Much Improved!
107
Overview
1. Background
2. Globus Toolkit
3. Future directions
4. Related tools
108
The Future: Content
We now have a solid and extremely powerful Web services base
Next, we will build an expanded open source Grid infrastructure
• Virtualization
• New services for provisioning, data management, security, VO management
• End-user tools for application development
• Etc., etc.
And of course responding to user requests for other short-term needs
110
Short-Term Priorities: Security
• Improve GSI error reporting & diagnostics
• Secure password, one-time password, Kerberos support for initial log on
• Trust roots, use of GridLogon
• Identity/attribute assertions in GT auth. callouts (e.g., Shib, PERMIS, VOMS, SAML)
• Extend CAS admin & policy support
• Security logging with management control for audit purposes
111
Short-Term Priorities: Data Management
• Space & bandwidth management in GridFTP
• Concurrency in globus-url-copy
• Priorities in RFT
• Data replication service
• Enhance policy support in data services
• Physical file name creation service
• Scalable & distributed metadata manager
112
Short-Term Priorities: Execution Management
• Implement GGF JSDL once finalized
• Advance reservation support
• Policy-driven restart of “persistent” jobs
• Improved information collection for jobs
• Improved management of job collections
• Credential refresh
• Development of workspace service
• Integration of virtual machines (Xen, VMware) and associated services
• Windows port of WS GRAM
113
Short-Term Priorities: Information Services
• Many more information sources, including gateways to other systems
• Automated configuration of monitoring
• Specialized monitoring displays
• Performance optimization of registry
• Archiver service
• Helper tools to streamline integration of new information sources
114
Short-Term Priorities: WS Core
• Streamlined container configuration
• Remote management interface
• Dynamic service deployment
• Service isolation: multiple service instances
• WS-Notification, subscription performance
• Full functionality in C WS Core
• Optimized WS-ServiceGroup support
• WS-SecureConversation support
115
Overview
1. Background
2. Globus Toolkit
3. Future directions
4. Related tools
116
The Globus Ecosystem
Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc.
• GT4 being the latest version
A larger Globus ecosystem of open source and proprietary components provides complementary functionality
• A growing list of components
These components can be combined to produce solutions to Grid problems
• We’re building a list of such solutions
117
Many Tools Build on, or Can Contribute to, GT4-Based Grids
• Condor-G, DAGMan
• MPICH-G2
• GRMS
• Nimrod-G
• Ninf-G
• Open Grid Computing Env.
• Commodity Grid Toolkit
• GriPhyN Virtual Data System
• Virtual Data Toolkit
• GridXpert Synergy
• Platform Globus Toolkit
• VOMS
• PERMIS
• GT4IDE
• Sun Grid Engine
• PBS scheduler
• LSF scheduler
• GridBus
• TeraGrid CTSS
• NEES
• IBM Grid Toolbox
• …
118
Documenting The Grid Ecosystem
The Grid Ecosystem: Software Components for Grid Systems and Applications
www.grids-center.org
119
Example Solutions
• Portal-based User Reg. System (PURSE)
• VO Management Registration Service
• Service Monitoring Service
• TeraGrid TGCP Tool
• Lightweight Data Replicator
• GriPhyN Virtual Data System
120
Condor-G
The Condor Project at the University of Wisconsin-Madison develops software for high-throughput computing on collections of distributed compute resources
Condor-G is an interface to GRAM created by the Condor team that allows users to submit jobs to GRAM servers
121
GridShib
Allows the use of Shibboleth-transported attributes for authorization in GT4 deployments
• And, more generally, SAML support
2-year project started December 1, 2004
Participants
• Von Welch, UIUC/NCSA (PI)
• Kate Keahey, UChicago/Argonne (PI)
• Frank Siebenlist, Argonne
• Tom Barton, UChicago
Beta software released September 16, 2005
122
Handle System
The Handle System from CNRI (http://www.handle.net) is a general-purpose global name service enabling secure name resolution over the internet
The Handle System-GT Integration Project leverages the Handle System for identifier and resolution services through tight integration with GT4’s Web services protocols
123
MPICH-G2
MPICH-G2, developed at Northern Illinois University and Argonne National Lab, is a grid-enabled implementation of the MPI v1.1 standard
MPICH-G2 is implemented using the pre-WS GRAM component in GT4; integration with GT4 WS GRAM is expected in the near future
124
Nimrod/G
Nimrod is a specialized parametric modeling system from Monash University
Nimrod/G uses a simple declarative parametric modeling language to express parameter sweep experiments. Based on GT4 WS services, Nimrod/G enables the formulation, execution and monitoring of multiple individual parametric experiments
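A parameter sweep expands a declarative description of parameter ranges into one experiment per combination. The sketch below shows that expansion in plain Python. It is not Nimrod/G's modeling language: the parameter names, the `expand_sweep` helper, and the stand-in `run_experiment` model are all hypothetical.

```python
import itertools


def expand_sweep(parameters):
    """Yield one settings dict per combination of parameter values."""
    names = list(parameters)
    for combo in itertools.product(*(parameters[name] for name in names)):
        yield dict(zip(names, combo))


def run_experiment(settings):
    """Stand-in for a real model run; on a Grid each call would be a separate job."""
    return settings["pressure"] * settings["temperature"]


# Declarative description of the sweep: each parameter with its values.
sweep = {"pressure": [1, 2, 3], "temperature": [280, 300]}

results = [(settings, run_experiment(settings)) for settings in expand_sweep(sweep)]
for settings, value in results:
    print(settings, value)
```

Because the individual runs are independent, they can be dispatched to different resources and monitored separately, which is what makes sweeps a natural fit for Grid execution.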
125
Ninf-G4
Ninf-G4, from AIST, is a reference implementation of the GGF standard GridRPC API
Ninf-G4 provides higher-level programming APIs for the development and execution of parallel applications on the Grid
126
PERMIS
PERMIS is an EU-funded Privilege Management service that implements Role-Based Access Control
Thanks to the work of the UK Grid Engineering Task Force, services running in a Java WS Core container can use PERMIS via GT4’s SAML authorization callouts
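Role-Based Access Control decides on a request by looking at the roles a user holds rather than at the user directly. The following is a minimal sketch of that decision step only; the role names, permission strings, and the `is_authorized` function are invented for illustration and are unrelated to PERMIS's actual policy format or to the GT4 authorization callout interface.

```python
# Hypothetical policy: which permissions each role grants.
ROLE_PERMISSIONS = {
    "analyst": {"job:submit", "data:read"},
    "admin": {"job:submit", "job:cancel", "data:read", "data:write"},
}

# Hypothetical role assignments: which roles each user holds.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"admin"},
}


def is_authorized(user, permission):
    """Grant access if any role held by the user includes the permission."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_authorized("alice", "job:submit"))
print(is_authorized("alice", "data:write"))
```

Keeping users, roles, and permissions in separate tables means a policy change (for example, granting analysts write access) touches one entry instead of every user.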
127
SRB
SRB is a package from SDSC providing a uniform interface for connecting to network-based heterogeneous data resources
GT4’s GridFTP includes an interface to SRB data sources, and vice versa
128
Sun Grid Engine
Sun Grid Engine is an open source distributed resource management system from Sun Microsystems
In a collaboration between the London e-Science Centre, Gridwise and MCNC, the Sun Grid Engine has been integrated with GT4
129
Thank you?