Outline
This talk gives a high-level view of application development in GridsContents Review of concepts: grids and grid applications Types of Grid applications Challenges to researchers who write applications General steps of application gridification Practical: Preparing and submitting a job based on a simple
non-grid application.Acknowledgements Gergely Sipos, SZTAKI, Hungary Mike Mineter, University of Edinburgh, UK, “Application
Development and Aspects of gLite 3.0”, 2006 Vladimir Dimitrov, IPP-BAS, Bulgaria, “Practical: Porting applications
to the GILDA grid”, Introduction to Grid Computing, EGEE and Bulgarian Grid Initiatives Plovdiv, 2006
Douglas Thain, The University of Notre Dame GILDA team
DefinitionSoftware that interacts with grid services to achieve requirements that are specific to a particular VO or user.
What is a grid application?
Grids: a foundation for e-Research
Enabling a whole-system approachCollaborative research / engineering / public service …
sensor nets
Shared data archives
computers
software
colleagues
instruments
Grid
Diagram derived fromIan Foster’s slide
The vital layer
Where computer science meets the application communities!
VO-specific developments built on higher-level tools and core services
Makes Grid services useable by non-specialists
Grids provide the compute and data storage resources
Basic Grid services:AA, job submission, info, …
Higher-level grid services (brokering,…)
Application toolkits, …..
Application
Production grids provide these core services.
Complexities of grid applications
1. Simple jobs – submitted to WMS to run in batch mode2. Job invokes grid services
• To read & write files on SE• Monitoring• For outbound connectivity (interactive jobs)• To manage metadata• …
3. Complex jobs• An environment controls multiple jobs on users’ behalf
• High-level services• Portals with workflow • Software written for the VO (or by the user)• …
Invocation of applications
From the UI Command Line Interfaces / Scripts APIs Higher level tools
From desktop Windows applications Use Grids without awareness of them! But gLite not (yet) fully supporting Windows (more info: http://jessica.trigrid.it/grid2win/)
From portals For recurring tasks: “core grid services” as well as application
layer Accessible from any browser Tailored to applications Different portal solutions, and wide range of capabilities.
Second part of this course: P-GRADE Portal
EGEE Gride.g. VOCE
UK NGS
P-GRADE-Portal
LondonRome
Athens
Different jobs of a workflow can be executed in different grids
The portal can be connected to multiple grids
Multi-Grid P-GRADE Portal
Characteristics of VOs
What is being shared? resources of storage and/or compute cycles software and/or data
Distinct groups of developers and of users? Some VOs have distinct groups of developers and users…
Biomedical applications used by clinicians,…. …. Some don’t
Physics application developers who share data but write own analyses
Effect: need to hide complexity from the 1st type of VOs expose functionality to 2nd type of Vos many security issues
Different Goals for App. Development
I need resources for my research I need richer functionality
MPI, parametric sweeps,… Data and compute services together…
I provide an application for (y)our research How!?
Pre-install executables ? Hosting environment? Share data Use it via portal?
We provide applications for (y)our research Also need:
Coordination of development Standards …
En
gin
eering
challen
ges in
creasing
Challenges
Research software is often Created for one
user: the developer
Familiarity makes it useable
Short-term goals: Used until papers are written and then discarded
Grid applications are often used by a VO Without support
from developer In new contexts
and workflows
Grid application developers areIn a research environment Yet their s/w must have:
StabilityDocumentationUsabilityExtendability
i.e. Production quality
Need expertise in:
• software engineering
• application domain
• grid computing
Consequences
Team work!Engaged in world-wide initiatives – reuse, don’t make your own! Cross disciplines for solutions.From research to production software: ~5 times the effort. “80% of the time for last 10% of the functionality & reliability”
Standardisation is key For re-use, for dynamic configuration of services,.. Both for middleware and domain specific
Need to follow a deliberate development process Waterfall? Rapid prototyping? Requirements engineering, design, implementation, validation,
deployment Engaged with the user community
gLite Grid Middleware Services
API
Access
Workload Mgmt Services
ComputingElement
WorkloadManagement
MetadataCatalog
Data Management
StorageElement
DataMovement
File & ReplicaCatalog
Authorization
Security Services
Authentication
Information &Monitoring
Information & Monitoring Services
Application
Monitoring
Connectivity
Accounting
Auditing
JobProvenance
PackageManager
CLI
More about gLite services
During today the focus is on: New functionality in gLite 3.0 Workload Management Accessing data on SEs
Can have massive files, too big to copy How to access these?
Management of metadata May have many thousands of files Need to access and re-use based on characteristics… more
than by their logical file names. Monitoring of applications
May be running many long jobs What’s happening?!
Workload Management System
Helps the user accessing computing resources resource brokering management of input and output management of complex workflows
Support for MPI job even if the file system is not shared between CE and Worker Nodes (WN) – easy JDL extensions
Web Service interface via WMProxy
WMProxy
WMProxy is a SOAP Web service providing access to the Workload Management System (WMS)Job characteristics specified via JDL jobRegister
create id map to local user
and create job dir register to L&B return id to user
input files transfer jobStart
register sub-jobs to L&B
map to local user and create sub-job dir’s
unpack sub-job files deliver jobs to WM
Complex Workflows
Direct Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs A Collection is a group of jobs with no dependencies basically a collection of JDL’s
A Parametric job is a job having one or more attributes in the JDL that vary their values according to parametersUsing compound jobs it is possible to have one shot submission of a (possibly very large, up to thousands) group of jobs Submission time reduction
Single call to WMProxy server Single Authentication and Authorization process Sharing of files between jobs
Availability of both a single Job Id to manage the group as a whole and an Id for each single job in the group
nodeEnodeC
nodeA
nodeD
nodeB
Basic taskswhile Porting applications to the Grid
1. Developing a non-grid application (or inheriting and updating an ancient one);
2. Go/no-Go decision about gridification• Is it suitable for the Grid environment?• “Cost/profit” analysis / feasibility study
Typical Questions Groups: Current structure of the application Dependencies of the application Available resources (manpower, knowledge, etc.) Requirements for the gridified application Expected impact of gridification Requirements for the grid infrastructure
More info: Application Description Template http://www.lpds.sztaki.hu/gasuc/?m=4
Basic tasks while Porting applications to the Grid (contd.)
3. Grid environment access• Requesting Certificates / VO membership• Accessing Grid environment
• Appropriate VO UI machine account for command line• Portal GUI account;
4. Executing, Testing and Debugging the application;• Testing the non-grid application (debugging in Grid
environment is a hard task), creating use cases for single (non-grid) runs;
5. Constructing the job suite – JDL (Job Description Language) files, executables, auxiliary scripts and input/output data files;
6. Submitting the job to the Grid as small-scale pilot application;
Basic tasks while Porting applications to the Grid (contd.)
7. Executing, Testing and Debugging the pilot application;
8. IF something goes wrongTHEN GOTO 4;
9. IF everything seems to workTHEN increase the scale of the application (increase problem size, amount of used resources);
10. Optimizing the grid application;
Code from the past, maintained because it worksOften supports business critical functions
Not Grid enabled
• Port them onto the Grid with minimum user effort
What to do with legacy codes in service Grids?
• Bin them and reimplement them as grid services
• Reengineer them who knows the source code?
Porting legacy code applications
GEMLCA – Grid Execution Management for Legacy Code Architecture
• To deploy legacy code applications as Grid services without reengineering the original code and minimal user effort
• To support the development and execution of “legacy code” grid service workflows
• To make these functions available from a Web Portal
GEMLCA
GEMLCAP-GRADE
Portal
Objectives
Porting legacy code applications with GEMLCA
Any code that correspond to the following model can be exposed as Grid service by GEMLCA:
Legacy codeLegacy codeInput data Input data
(files, command (files, command line params, env. line params, env.
vars)vars)
Output data Output data (files)(files)
Call shared librariesCall shared libraries
Query databasesQuery databasesCommunicate over the networkCommunicate over the network
Perform computationPerform computation
. . .. . .
The GEMLCA-view of a legacy code
Workflow to analyse road traffic
Manhattan road Manhattan road network generatornetwork generator
Traffic simulatorsTraffic simulators
AnalyserAnalyser
Legacy Application example
Molecular Dynamics Study of Water Penetration in Staphylococcal Nuclease using CHARMm
• Analysis of several production runs with different parameters following a common heating and equilibrium phase
Another Legacy Application example
Mercury monitor
- to debug/optimize and visualize parallel jobs.
- both at the workflow and job levels
Practical tools @ gridification
On-Line Monitoring
Practical tools @ gridification (contd.)
Hiding Grid remote storage system from a legacy applicationParrot a handy tool for attaching old programs to new storage
systems EGEE (gLite module) Data Access: GFAL, LFN, GUID, SRM, RFIO,
DCAP, and LFC does not require any special privileges, any recompiling to
existing programs For example, an anonymous FTP service is made available to vi
like so:parrot vi /anonftp/ftp.cs.wisc.edu/RoadMap
Or example with gsiftp:$./parrot app_name /gsiftp/<gsiftp usr without
gsiftp://>
(More info: http://www.cse.nd.edu/~ccl/software/manuals/parrot.html)
Common problems and obstacles
The candidate-applications for porting usually are huge and complex.
Some of them use low-level network functions and/or parallel execution features of a specific non-grid environment.
Usage of non-standard or proprietary communication protocols.
The complete source code might not be available, might not be well documented or its “out-of-host” usage is restricted by a license agreement.
Common obstacles (continued)
The application might be written in many different programming languages – C, C++, C#, Java, FORTRAN etc. or even mixture of them.
Applications may depend on third-party libraries or executables which are not available by default on some Grid worker nodes.
Some application features could cause unintentional violation of Grid Acceptable Use Policies (Grid AUP).
Furthermore, the application can have hidden security weakness which will be very dangerous in case of remote Grid job execution.
Common obstacles (continued)
Some applications are pre-compiled or optimized for using on a machine with particular processor(s) only – Intel, AMD, in 32-bit or 64-bit mode, etc. But the Grid is heterogeneous!
The application may contain serious bugs, which have never been detected while running in a non-grid environment.
Finally, the formal procedure for accepting a new application to be ported to a Grid for production or even experimental purpose is not simple.
Therefore, the porting of an arbitrary application to Grid could be very long, difficult and expensive process!
GASUC Grid Application Support Centre AIM of the Center Provides assistance to current and future grid users and grid application
developers to port legacy algorithms and applications onto large-scale grid infrastructures.
Creates a bridge between Academy/Industry and Grid Supports and develops Grid applications for Grid communities Disseminates knowledge of concepts and tools used on Globus Toolkit 2,
Globus Toolkit 4, LCG-2 and gLite grid environments.Established in January 2007Full EGEE support Announced officially in EGEE Follow up in EGEE3
Cost model For non-commercial applications: free
http://www.lpds.sztaki.hu/gasuc/
Grid Application Development Support Model
8 steps support model Contact phase Pre-selection phase Analysis phase Planning phase Prototyping phase Testing phase Execution phase Dissemination and feedback phase
In the practical session
The application called MatrixDemo will be ported and executed in SGDEMO grid environment. (The program is borrowed from the “EGEE summer school” at MTA SZTAKI, 2006.)
MatrixDemo is written in C programming language
SGDEMO environment (gLite based) is supporting C, so porting the C or C++ programs is easy … hopefully
MatrixDemo program
MatrixDemo program performs some matrix operations – inverting, multiplying, etc.
Usage:MatrixDemo has command line interface which accepts several arguments. Starting the program without any argument will display a short help. Example:
MatrixDemo I V
This will Invert (I) the matrix defined in the file named INPUT1 and will store the result in the file OUTPUT with verbose details (V).
MatrixDemo program (continued)
Prerequisites: File MatrixDemo.c – the source code of the program. Files INPUT1 and INPUT2 – they contain matrix data in
the following text format:rows, columns, cell1, cell2, cell3 …
Where rows is an integer representing the number of rows. columns represents number of columns, and cell1, cell2 etc. are the cells of the matrix, floating point numbers separated by commas (,).
A standard C compiler and linker. In this case we will use GNU C (gcc) already installed.
File MatrixDemo.jdl – a prepared JDL (Job Description Language) file.