Programming the Grid
Stefan Jähnichen, Andreas Hoheisel
Grid Computing: Computational power out of the plug
Distributed Computing Grid Computing
(Source: EGEE)
„Computational power out of the plug“ (electric power grid)
– Virtualization of resources
– Interoperability by means of standards
– Resource sharing crossing organizational boundaries
– Resources: hardware, software, services, data
Grid technology provides standardized and reliable access to distributed resources
Hardware Grid / Resource Grid
– Connecting hardware resources of several organizations in order to solve big calculation problems, which are not solvable by single mainframe computers or clusters
– More effective capacity utilization and less cost for acquisition and operation of new resources
– Typical applications:
Simulations, parameter studies, rendering farms, …
– Example climateprediction.net: > 100,000 CPUs
Source: climateprediction.net
Data Grid / Information Grid
– Huge data volumes
– Distributed databases
– Typical applications:
Data archives, post processing
– Example LHC Grid (CERN): 15 Petabyte/Year
Source: CERN
Service Grid / Software Grid
– Sharing of software and services crossing organizational boundaries
– Coupled and distributed data processing
– Convergence with the paradigm of the “Service Oriented Architecture” (SOA)
– Typical applications:
Collaborative working crossing organizational boundaries
On demand computing
Multimedia computing
Automation of processes (workflows)
– Example: MediGRID – Grid Computing for life sciences and medicine
Classes of Computing Grids
Resource / Hardware Grid
– Typical applications: Simulations, parameter studies, rendering farms, …
– Example: climateprediction.net: > 100,000 CPUs
Data Grid / Information Grid
– Huge data volumes, distributed databases
– Typical applications: Data archives, post processing
– Example LHC Grid (CERN): 15 Petabyte/Year
Service Grid / Software Grid
– Convergence with the paradigm of the “Service Oriented Architecture” (SOA), CLOUDS
Source: climateprediction.net
Source: CERN
Grid Computing vs. Computing Cloud
– Grid Computing: Distributed computing in which a "super and virtual computer" is composed as a cluster of networked, loosely coupled computers acting in concert to perform very large tasks. A concept for distributed applications.
– Computing Cloud: Abstraction layer for software resources using the paradigm of “software as a service (SaaS)”. A concept for big resource providers (Google, Amazon, IBM, SUN, …).
Computing Clouds can be used as resources within Grid Computing
Programming the Grid with Workflows
Workflows = Automation of IT processes
– Speed up IT processes (e.g., by distributing the tasks to many resources and optimizing the data and control flow)
– Simplification of the composition and enactment of IT processes
– Increase reusability (e.g., realize similar processes on different systems and infrastructures)
– Increase fault tolerance and dependability of your IT processes
– Fast implementation and adaptation of business processes by means of simple mapping onto IT workflows
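The first and fourth points above (speed-up by distribution, fault tolerance) can be illustrated with a small sketch. It is not Grid middleware: local threads stand in for distributed resources, and the names `run_with_retry` and `run_parameter_study` are hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retry(task, arg, attempts=3):
    """Call task(arg), retrying on failure up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return task(arg)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
    raise RuntimeError(f"task failed after {attempts} attempts") from last_error


def run_parameter_study(task, params, workers=4):
    """Run independent tasks concurrently; results keep the order of params.

    Assumption: a thread pool stands in for a pool of Grid resources.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: run_with_retry(task, p), params))


if __name__ == "__main__":
    print(run_parameter_study(lambda x: x * x, [1, 2, 3, 4]))
```

Because each parameter is processed independently, adding workers shortens the wall-clock time without changing the result, and a transient failure of one task does not abort the whole study.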
Particular challenges for IT processes in the Grid
– Distributed resources
– Abstraction and virtualization
– Dynamics and fault tolerance
– Scalability
– Virtual organizations
– Security concerns
– Persistence and stateful services
– Autonomy
From the orchestration of services to autonomous behavior for solving complex problems – based on process modelling and automation
Vision
Process Description Languages
Web Services: SCUFL (Taverna – myGrid), BPEL4WS (IBM, BEA Systems, Microsoft)
Business Processes: Event-driven processes (AML, EPML), XPDL (Workflow Management Coalition), BPMN
Grid: Condor DAGMan, UNICORE, GSFL
Our solution: GWorkflowDL. Uses approaches from Web Services, business processes, and the Grid in one single language. Based on Petri nets (similar to event-driven process chains and UML2 activity diagrams).
Modeling of processes using High Level Petri Nets
– State and actions are modeled
– Control and data flow are modeled
– Simple and expressive
– Description of distributed (concurrent) processes
– Extensive theory available
– Intuitive visualization available
– International standard: ISO/IEC 15909-1
– Conversion from other process description languages possible (e.g. BPEL, ARIS Toolset, EPC, PNML, ...)
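To make the Petri net idea concrete, here is a minimal sketch of a workflow with a fork and a join. It is deliberately simplified to a plain place/transition net with indistinguishable tokens; High Level Petri Nets additionally attach data to tokens. The class names and the example net are hypothetical and are not GWorkflowDL syntax. The sketch assumes each place appears at most once per transition.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    inputs: list[str]   # places a token is consumed from
    outputs: list[str]  # places a token is produced into


@dataclass
class PetriNet:
    marking: dict[str, int]  # tokens per place: the state of the workflow
    transitions: list[Transition] = field(default_factory=list)

    def enabled(self) -> list[Transition]:
        """A transition is enabled if every input place holds a token."""
        return [t for t in self.transitions
                if all(self.marking.get(p, 0) >= 1 for p in t.inputs)]

    def fire(self, t: Transition) -> None:
        """Consume one token per input place, produce one per output place."""
        if t not in self.enabled():
            raise ValueError(f"transition {t.name!r} is not enabled")
        for p in t.inputs:
            self.marking[p] -= 1
        for p in t.outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

    def run(self) -> list[str]:
        """Fire enabled transitions one at a time until none is left."""
        fired = []
        while ready := self.enabled():
            self.fire(ready[0])
            fired.append(ready[0].name)
        return fired


def build_fork_join_net() -> PetriNet:
    """Hypothetical workflow: split, two concurrent tasks, then join."""
    return PetriNet(
        marking={"start": 1},
        transitions=[
            Transition("split", ["start"], ["a_in", "b_in"]),
            Transition("task_a", ["a_in"], ["a_out"]),
            Transition("task_b", ["b_in"], ["b_out"]),
            Transition("join", ["a_out", "b_out"], ["done"]),
        ],
    )


if __name__ == "__main__":
    net = build_fork_join_net()
    print(net.run(), net.marking)
```

After `split` fires, `task_a` and `task_b` are both enabled at once, which is how the net expresses concurrency; `join` becomes enabled only when both have finished. This sequential runner simply picks the first enabled transition, whereas a distributed engine could fire both tasks in parallel.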
Resource Matching and Scheduling
– Grid resources are dynamic: resources can fail, new resources may appear and active resources may be removed from the Grid at any time
– Therefore abstract (infrastructure-independent) workflows should be composed
– These abstract workflows are mapped onto available resources during runtime
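The following sketch illustrates mapping an abstract workflow onto whatever resources are available at the moment of scheduling. Everything in it is an assumption made for illustration: the `Resource` fields, the capability names, and the greedy "least-loaded matching resource" policy are hypothetical and do not describe the scheduler of any particular Grid system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    name: str
    capabilities: frozenset  # e.g. {"linux", "mpi"}
    load: float              # 0.0 = idle, 1.0 = fully busy


def match(required: set, resources: list) -> Optional[Resource]:
    """Pick the least-loaded resource that offers all required capabilities."""
    candidates = [r for r in resources if required <= r.capabilities]
    return min(candidates, key=lambda r: r.load, default=None)


def schedule(workflow: dict, resources: list) -> dict:
    """Map each abstract activity to a resource name, or None if unmatched."""
    mapping = {}
    for activity, required in workflow.items():
        chosen = match(required, resources)
        mapping[activity] = chosen.name if chosen else None
    return mapping


# Abstract workflow: activities name requirements, never concrete hosts.
WORKFLOW = {
    "preprocess": {"linux"},
    "simulate": {"linux", "mpi"},
    "render": {"gpu"},
}

RESOURCES = [
    Resource("node_a", frozenset({"linux", "mpi"}), load=0.7),
    Resource("node_b", frozenset({"linux", "mpi", "gpu"}), load=0.2),
    Resource("node_c", frozenset({"linux"}), load=0.1),
]

if __name__ == "__main__":
    print(schedule(WORKFLOW, RESOURCES))
    # Simulate node_b leaving the Grid, then map the same workflow again.
    remaining = [r for r in RESOURCES if r.name != "node_b"]
    print(schedule(WORKFLOW, remaining))
```

Because the workflow itself never mentions a host, the same description can be re-mapped when a resource fails or a new one appears; an activity with no matching resource is reported as `None` instead of failing the whole mapping.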
Fraunhofer FIRST: Management of Distributed Workflows
Grid Workflow Execution Service
– Automation of IT processes.
– Enabling workflows on traditional batch processes as well as on service-oriented architectures.
– Automatic mapping of abstract workflows onto available and suitable resources.
– Easy integration into existing IT infrastructures and business processes.
– Application domains: Bioinformatics, traffic management, environmental simulations, risk analysis, resource planning, image processing, …
Application Domains
Environmental Simulation
Drug Design & Health Care
Traffic Management
Media: Cinema and TV
Environmental Simulation
Environmental Risk Analysis and Management System
– Use case: After an accident that releases toxic substances, emergency services need a scientific prediction, within 10 minutes, of which districts should be evacuated
– Solution: Using Grid technology to distribute the simulation and automate the process on available resources in a secure and reliable way
Drug Design
Example:
– Enterprise Grid for the distributed execution of legacy customer applications (Schering AG Berlin)
– Utilization of idle computing power, e.g., desktop PCs at night
Medical Image Processing with > 3200 CPUs
Objective
– Speed-up and simplification of processes in the domain of medical image processing for virtual organizations
Approach
– Formally correct modeling of medical IT processes
– Integration of data formats and standards (DICOM, HL7, PACS)
– Tools for validation, simulation and optimization of workflows regarding fault tolerance and dependability
Application
– Analysis of 3D ultrasound data (Charité Berlin)
– Virtual vascular surgery
– Functional brain image data (fMRI)
Media Workflows
– Postprocessing of digital movies
– Image distortion correction and calibration for digital multi-projector displays
– Transcoding of movies (HDTV to mobile devices)
Programming Swarm Systems: Ensembles
‣ We need global and scalable solutions
‣ autonomous infrastructure for swarm applications -- nano- and picosatellites
‣ dynamics are complicated in space
‣ sensor fusion is a must for data exploitation
‣ only few experiments
‣ swarm ensembles are of major interest
Kayser Threde Auxiliary Payload Carrier (KAP)
[Figure labels: atmosphere; ionosphere; GPS reference signal; refracted, delayed, attenuated GPS signal; GPS satellites; swarm]
Our Vision
Many thanks for your attention!
Any questions?
stefan.jaehnichen@first.fraunhofer.de
andreas.hoheisel@first.fraunhofer.de
http://www.first.fraunhofer.de/
http://www.gridworkflow.org/