Upload
sheryl-jennings
View
217
Download
0
Tags:
Embed Size (px)
Citation preview
The Gridand the Future of Business
Ian FosterMathematics and Computer Science Division
Argonne National Laboratory
and
Department of Computer Science
The University of Chicago
http://www.mcs.anl.gov/~foster
Grid Computing
Technical Universe, Circa 1890
• Ubiquitous communication infrastructure– The telegraph
• Local power generation– Electric power generators serve
at most city blocks
Technical Universe, Circa 2000
• Ubiquitous comms infrastructure– Internet, email, Web
• Ubiquitous power distribution– (Reliable?) standard access– Tremendous variety of devices
• Local computing– Most computing and storage on
internal enterprise computers
Exponentials (and Coefficients)• Network vs. computer performance
– Computer speed doubles every 18 months– Network speed doubles every 9 months– Difference = order of magnitude per 5 years
• 1986 to 2000– Computers: x 500– Networks: x 340,000
• 2001 to 2010– Computers: x 60– Networks: x 4000
Scientific American (Jan-2001)
Therefore: A Computing Grid
• On-demand, ubiquitous access to computing, data, and services
• New capabilities constructed dynamically and transparently from distributed services
“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances”
(George Gilder)
My Presentation
• The emergence of the Grid concept– Origins in eScience, and the Globus Toolkit
• Grids and e-business– Opportunities & requirements
• Technology convergence– Open Grid Services Architecture
• Summary
My Presentation
• The emergence of the Grid concept– Origins in eScience, and the Globus Toolkit
• Grids and e-business– Opportunities & requirements
• Technology convergence– Open Grid Services Architecture
• Summary
E-Science: The Original Grid Driver• Pre-electronic science
– Theorize &/or experiment, in small teams
• Post-electronic science– Construct and mine very large databases– Develop computer simulations & analyses– Access specialized devices remotely– Exchange information within distributed
multidisciplinary teams Need to manage dynamic, distributed
infrastructures, services, and applications
And Thus: The Grid “Resource sharing & coordinated
problem solving in dynamic, multi-institutional virtual organizations”
• Early 90s–Gigabit testbeds, metacomputing
• Mid to late 90s–Early experiments (e.g., I-WAY), academic
software projects (e.g., Globus), applications
• 2002–Dozens of application communities & projects–Significant technology base (Globus ToolkitTM)–Global Grid Forum: ~500 people, 20+ countries
The Grid:A Brief History
Sloan Digital Sky Survey Analysis
Size distribution ofgalaxy clusters?
Sloan Digital Sky Survey Analysis
1
10
100
1000
10000
100000
1 10 100
Num
ber
of C
lust
ers
Number of Galaxies
Galaxy clustersize distribution
Chimera Virtual Data System+ iVDGL Data Grid (many CPUs)
A Large Virtual Organization: CERN’s Large Hadron Collider
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 50,000 CPUs?
Data Grids for High Energy Physics
Tier2 Centre ~1 TIPS
Online System
Offline Processor Farm
~20 TIPS
CERN Computer Centre
FermiLab ~4 TIPSFrance Regional Centre
Italy Regional Centre
Germany Regional Centre
InstituteInstituteInstituteInstitute ~0.25TIPS
Physicist workstations
~100 MBytes/sec
~100 MBytes/sec
~622 Mbits/sec
~1 MBytes/sec
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second
Each triggered event is ~1 MByte in size
Physicists work on analysis “channels”.
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
Physics data cache
~PBytes/sec
~622 Mbits/sec or Air Freight (deprecated)
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Caltech ~1 TIPS
~622 Mbits/sec
Tier 0Tier 0
Tier 1Tier 1
Tier 2Tier 2
Tier 4Tier 4
1 TIPS is approximately 25,000
SpecInt95 equivalents
•Lift Capabilities•Drag Capabilities•Responsiveness
•Deflection capabilities•Responsiveness
•Thrust performance•Reverse Thrust performance•Responsiveness•Fuel Consumption
•Braking performance•Steering capabilities•Traction•Dampening capabilities
Crew Capabilities- accuracy- perception- stamina- re-action times- SOPs
Engine Models
Airframe Models
Wing Models
Landing Gear Models
Stabilizer Models
Human Models
Grids at NASA: Aviation Safety
NETWORK
IMAGINGINSTRUMENTS
COMPUTATIONALRESOURCES
LARGE DATABASES
DATA ACQUISITIONPROCESSING,
ANALYSISADVANCED
VISUALIZATION
Life Sciences: Telemicroscopy
Underlying Technical Requirements
• Dynamic formation and management of virtual organizations
• Online negotiation of access to services: who, what, why, when, how
• Configuration of applications and systems able to deliver multiple qualities of service
• Autonomic management of distributed infrastructures, services, and applications
State of the Art:Globus ToolkitTM (since 1996)
• Small, standards-based set of protocols– Authentication, delegation; resource
discovery; reliable invocation; etc.
• Information-centric design– Data models; publication, discovery protocols
• Open source implementation & community– With commercial support
• Enabler of services and applications
Grid Projects in eScience
My Presentation
• The emergence of the Grid concept– Origins in eScience, and the Globus Toolkit
• Grids and e-business– Opportunities & requirements
• Technology convergence– Open Grid Services Architecture
• Summary
And What About Business?• Fragmentation of enterprise infrastructure
– Specialized platforms -> commodity servers– “Intelligence” embedded in networks
• The rise of the “eUtility” (IBM, HP, …)– Outsourcing, economies of scale
• Business-to-business computing– Especially complex virtual organizations
• Ever more challenging QoS requirements– “Green-screen” -> “ubiquitious web presence”
Today’s EnterpriseComputing Environment
The Business Opportunity
• On-demand computing, storage, services– Significant savings due to reduced build-out,
economies of scale, reduced admin costs– Greater flexibility => greater productivity
• Entirely new applications and services– Based on high-speed resource integration
• Solution to enterprise computing crisis– Render distributed infrastructures manageable
Grids and Industry: Early Examples
Butterfly.net: Grid for multi-player
games
Entropia: Distributed computing (BMS, Novartis, …)
Realizing the PromiseRequires Significant Innovation
• Automation of infrastructure operation to achieve economies of scale
• Management and component models for distributed service provisioning
• New applications and tools powered by distributed services and resources
• Business and service models to support specialization of function
My Presentation
• The emergence of the Grid concept– Origins in eScience, and the Globus Toolkit
• Grids and e-business– Opportunities & requirements
• Technology convergence– Open Grid Services Architecture
• Summary
Grid Evolution:Open Grid Services Architecture• Refactor Globus protocol suite to enable
common base and expose key capabilities
• Service orientation to virtualize resources and unify resources/services/information
• Embrace key Web services technologies: standard IDL, leverage commercial efforts
• Result: standard interfaces & behaviors for distributed system management
Transient Service Instances• “Web services” address discovery & invocation
of persistent services– Interface to persistent state of entire enterprise
• In Grids, must also support transient service instances, created/destroyed dynamically– Interfaces to the states of distributed activities– E.g. workflow, video conf., dist. data analysis
• Significant implications for how services are managed, named, discovered, and used
Open Grid Services Architecture
• Defines fundamental (WSDL) interfaces and behaviors that define a Grid Service– Required + optional interfaces = WS “profile”
• Defines WSDL extensibility elements– E.g., serviceType (a group of portTypes)
• Open source Globus Toolkit 3.0– Leverage GT experience, code, community
• And also commercial implementations
The Grid Service =Interfaces/Behaviors + Service Data
Servicedata
element
Servicedata
element
Servicedata
element
Implementation
GridService(required)Service data access
Explicit destructionSoft-state lifetime
… other interfaces …(optional) Standard:
- Notification- Authorization- Service creation- Service registry- Manageability- Concurrency
+ application-specific interfaces
Binding properties:- Reliable invocation- Authentication
Hosting environment/runtime(“C”, J2EE, .NET, …)
Example: Database Service• DBaccess Grid service supports at least two
portTypes– GridService– DBaccess
• Each has service data– GridService: basic introspection, lifetime, …– DBaccess: database type, current load, …, …
• Maybe other portTypes as well– E.g., NotificationSource (SDE = subscribers)
GridService DBaccess
DB info
Name, lifetime, etc.
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider
“I want to createa personal databasecontaining data one.coli metabolism”
.
.
.
DatabaseFactory
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
“Find me a data mining service, and somewhere to store
data”
DatabaseFactory
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Handles for Miningand Database factories
DatabaseFactory
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
“Create a data mining service with initial lifetime 10”
“Create adatabase with initial lifetime 1000”
DatabaseFactory
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
Miner
“Create a data mining service with initial lifetime 10”
“Create adatabase with initial lifetime 1000”
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
Miner
Query
Query
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
Miner
Query
Query
Keepalive
Keepalive
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
MinerKeepalive
KeepaliveResults
Results
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
Miner
Keepalive
Data Mining for Bioinformatics
UserApplication
BioDB n
Storage Service Provider
DatabaseFactory
MiningFactory
CommunityRegistry
DatabaseService
BioDB 1
DatabaseService
.
.
.
Compute Service Provider...
Database
Keepalive
GT3: OGSA-Based Globus Toolkit• GT3 Core
– Grid service interfaces– Reference impln of
evolving standard– Multiple hosting envs:
Java/J2EE, C, C#/.NET?
• GT3 Base Services– Globus capabilities
• Many other services
GT3 Core
GT3 Base Services
Other GridServicesGT3
DataServices
OGSA: Current Status• Grid service specification & other documents
moving forward in GGF– www.gridforum.org/ogsi-wg
• Globus Project on track for open source OGSA-based GT3 release end of 2002– www.globus.org/ogsa
• IBM committed to various OGSA-compliant software releases (e.g., WebSphere)
• Other industrial efforts underway
My Presentation
• The emergence of the Grid concept– Origins in eScience, and the Globus Toolkit
• Grids and e-business– Opportunities & requirements
• Technology convergence– Open Grid Services Architecture
• Summary
Summary: The Grid
• Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations– On-demand, ubiquitous access to computing,
data, and services– New capabilities constructed dynamically and
transparently from distributed services
• Evolved to be dominant eScience, now transitioning to industry (think Web in 1994)
Open Grid Services Architecture• Open Grid Services Architecture represents
next step in Grid evolution
• Service orientation enables unified treatment of resources, data, and services
• Standard interfaces and behaviors (the Grid service) for managing distributed state
• Open source Globus Toolkit implementation (and numerous commercial value adds)
For More Information
• Grid Book– www.mkp.com/grids
• Survey articles– www.mcs.anl.gov/~foster
• The Globus Project™– www.globus.org
• Global Grid Forum– www.gridforum.org– Edinburgh, July 22-24– Chicago, Oct 15-17