Science Gateways Workshop
GGF14
Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways
GGF14, Chicago
June 28, 2005
Science Gateways Workshop June 28, 2005 2
Program Committee
• Nancy Wilkins-Diehr (SDSC, USA) (co-chair)
• Sebastien Goasguen (Purdue University, USA) (co-chair)
• Ariel Oleksiak (Poznan Supercomputing and Networking Center, Poland)
• Jarek Nabrzyski (Poznan Supercomputing and Networking Center, Poland)
• Charlie Catlett (University of Chicago and Argonne National Laboratory, USA)
• Ian Foster (University of Chicago and Argonne National Laboratory, USA)
• Dennis Gannon (Indiana University, USA)
• Satoshi Sekiguchi (AIST, Japan)
• Sang Beom Lim (KISTI, Korea)
• Konstantinos Dolkas (National Technical University of Athens (NTUA))

Welcome and Thank You
• Many fine talks today from researchers or resource providers who are bringing Grid capabilities to a particular science community (atmospheric scientists, chemists, bioinformaticists, etc.)
• Explore and summarize commonalities and differences: system, security, accounting, authentication/authorization and other policies and capabilities needed for production grid support
• Presentations will cover:
  • Services provided and technologies/software used to provide them
  • Configuration or policy issues encountered during deployment and maintenance
  • Authentication and authorization approaches to support a variety of user “types”
  • Practical issues related to supporting workflows
  • Approaches to providing secure web services

Your participation will make the workshop a success
• Five 90-minute sessions
  • Four presentations followed by a discussion
  • Interactive discussions encouraged!
  • Questions from moderators to initiate dialogue
• Detailed notes will be taken
• Workshop proceedings will be available as a GGF informational document
• Peer-reviewed papers to be published in a special issue of Concurrency and Computation: Practice and Experience in early fall

GCE-RG at GGF
• Grid Computing Environments Research Group
• Co-chaired by Geoffrey Fox and Dennis Gannon (IU), Mary Thomas (SDSU)
• Addresses many of the issues presented in this workshop
• Marlon Pierce (IU) is here to discuss current activities
• Meeting 6/29, 7:30-9am
• Next steps from this workshop will be part of ongoing GCE-RG activities

Why a workshop on Science Gateways?
• My day job: TeraGrid Area Director for Science Gateways
• 10 Science Gateway projects in TeraGrid
  • I need to make these successful
• New activity, funding begins this summer
• Interviews conducted with all 10 teams, findings summarized
• Interest in what others are doing in this area

The TeraGrid Strategy
• Building a distributed system of unprecedented scale
  • 40+ teraflops compute
  • 1+ petabyte storage
  • 10-40 Gb/s networking
• Creating a unified user environment across heterogeneous resources
  • User software environment, user support resources
  • Created an initial community of over 500 users, 80 PIs
• Integrating new partners to introduce new capabilities
  • Additional computing, visualization capabilities
  • New types of resources: data collections, instruments
• Make it extensible!

TeraGrid Resource Partners

TeraGrid Resources
Sites: ANL/UC, Caltech, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC
Compute Resources: Itanium2 (0.5 TF); IA-32 (0.5 TF); Itanium2 (0.8 TF); Itanium2 (0.2 TF); IA-32 (2.0 TF); Itanium2 (10 TF); SGI SMP (6.5 TF); IA-32 (0.3 TF); XT3 (10 TF); TCS (6 TF); Marvel (0.3 TF); Hetero (1.7 TF); Itanium2 (4.4 TF); Power4+ (1.1 TF); IA-32 (6.3 TF); Sun (Vis)
Online Storage: 20 TB; 155 TB; 32 TB; 600 TB; 1 TB; 150 TB; 540 TB; 50 TB
Mass Storage: 1.2 PB; 3 PB; 2.4 PB; 6 PB; 2 PB
Data Collections: Yes (5 sites)
Visualization: Yes (5 sites)
Instruments: Yes (3 sites)
Network (Gb/s, Hub): 30 CHI; 30 LA; 10 CHI; 30 CHI; 10 ATL; 30 CHI; 10 CHI; 30 LA; 10 CHI
Partners will add resources and TeraGrid will add partners!

Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities to build their own cyberinfrastructure
• Heterogeneity
  • Resources: different architectures at local, national and international levels
  • Users: from HPC expert to K-12 student… they should all benefit from CI
  • Software stacks, policies
• How can “centers/institutions” provide, operate, maintain in this heterogeneous world?
• Working with Gateways, TeraGrid will start to answer that question by providing generic CI services to communities.
• Integration and interoperability

What are Gateways?
• Gateways will
  • engage communities that are not traditional users of the supercomputing centers
  • by providing community-tailored access to TeraGrid services and capabilities
• Three examples:
  • Web-based portals that front-end Grid services that provide TeraGrid-deployed applications used by a community.
  • Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.
  • Application programs running on users' machines but accessing services in TeraGrid (and elsewhere)
• All take advantage of existing community investment in software, services, education, and other components of cyberinfrastructure.

Grid Portal Gateways
• The Portal accessed through a browser or desktop tools
  • Provides Grid authentication and access to services
  • Provides direct access to TeraGrid-hosted applications as services
• The Required Support Services
  • Searchable metadata catalogs
  • Information space management
  • Workflow managers
  • Resource brokers
  • Application deployment services
  • Authorization services
• Builds on NSF & DOE software
  • Use NMI Portal Framework, GridPort
  • NMI Grid Tools: Condor, Globus, etc.
  • OSG, HEP tools: Clarens, MonALISA
Technical Approach (Biomedical and Biology, Building Biomedical Communities)
• Build standard portals to meet the domain requirements of the biology communities
• Develop federated databases to be replicated and shared across TeraGrid
[Figure: OGCE Science Portal architecture. Labels: OGCE Portlets with Container; Apache Jetspeed Internal Services; Service API; Grid Service Stubs; Remote Content Services; Local Portal Services; Java CoG Kit; Grid Protocols; Grid Services; Remote Content Servers (HTTP); Grid Resources; Open Source Tools; Workflow Composer]

Gateways that Bridge to Community Grids
• Many Community Grids already exist or are being built
  • NEESGrid, LIGO, Earth Systems Grid, NVO, Open Science Grid, etc.
• TeraGrid will provide a service framework to enable access in ways that are transparent to their users.
  • The community maintains and controls the Gateway
• Different communities have different requirements.
  • NEES and LEAD will use TeraGrid to provision compute services
  • LIGO and NVO have substantial data distribution problems.
  • All of them require remote execution of complex workflows.
Technical Approach
• Develop web services interfaces (wrappers) for existing and emerging bioinformatics tools
• Integrate collections of tools into Life Science service bundles that can be deployed as persistent services on TeraGrid resources
• Integrate TG-hosted Life Science services with existing end-user tools to provide scalable analysis capabilities
[Figure: Life Science Gateway. Labels: Existing User Tools (e.g. GenDB); Life Science Gateway Service Dispatcher; Web Services Interfaces for Backend Computing; Life Science Services Bundles; TeraGrid Resource Partners]
[Figure: On-Demand Grid Computing. Labels: Streaming Observations; Forecast Model; Data Mining; Storms Forming]

Initial Focus on 10 Gateways
Science Gateway Prototype | Discipline | Science Partner(s) | TeraGrid Liaison
Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric | Droegemeier (OU) | Gannon (IU), Pennington (NCSA)
National Virtual Observatory (NVO) | Astronomy | Szalay (Johns Hopkins) | Williams (Caltech)
Network for Computational Nanotechnology (NCN) and “nanoHUB” | Nanotechnology | Lundstrom (PU) | Goasguen (PU)
National Microbial Pathogen Data Resource Center (NMPDR) | Biomedicine and Biology | Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA) | Stevens (UC/Argonne)
NSF National Evolutionary Biology Center (NESC), NIH Carolina Center for Exploratory Genetic Analysis, State of North Carolina Bioinformatics Portal project | Biomedicine and Biology | Cunningham (Duke), Magnuson (UNC) | Reed (UNC), Blatecky (UNC)
Neutron Science Instrument Gateway | Physics | Dunning (ORNL) | Cobb (ORNL)
Grid Analysis Environment | High-Energy Physics | Newman (Caltech) | Bunn (Caltech)
Transportation System Decision Support | Homeland Security | Stephen Eubanks (LANL) | Beckman (Argonne)
Groundwater/Flood Modeling | Environmental | Wells (UT-Austin), Engel (ORNL) | Boisseau (TACC)
Science Grid [GriPhyN/iVDGL/Grid3] | Multiple | Pordes (FNAL), Huth (Harvard), Avery (U. Florida) | Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)

Expanding User Base
[Figure: Projected user community size by each science gateway project, 2005 to 2009; vertical axis 0 to 6000; series: OSG, Flood, HEP, SNS, NESC/CCEGA, OLSG, NCN, NVO, LEAD]
• A new generation of “users” that access TeraGrid via Science Gateways, scaling well beyond the traditional “user” with a shell login account.
• Impact on society from gateways enabling decision support is much larger!

LEAD (Linked Environments for Atmospheric Discovery)
• Storm forecasting
• Modeling
• Connection to sensor networks
• LEAD testbed
• Workflows
• Student usage

Harnessing TeraGrid for Education
• Example: nanoHUB is used to complete coursework by undergraduate and graduate students in dozens of courses at 10 universities.

Biomedical and Biology
Building Biomedical Communities (Dan Reed, UNC)
• National Evolutionary Synthesis Center
• Carolina Center for Exploratory Genetic Analysis
• Portals and federated databases for the Biomed research community
[Figure: Genetics and Disease Susceptibility. Labels: Identify Genes; Phenotype 1 to Phenotype 4; Predictive Disease Susceptibility; Physiology; Metabolism; Endocrine; Proteome; Immune; Transcriptome; Biomarker Signatures; Morphometrics; Pharmacokinetics; Ethnicity; Environment; Age; Gender. Source: Terry Magnuson, UNC]

Science Communities and Outreach
Biomedical and Biology, Building Biomedical Communities
• Communities
  • Students and educators
  • Phylogeneticists
  • Evolutionary biologists
  • Biomedical researchers
  • Biostatisticians
  • Computer scientists
  • Medical clinicians
• Partners
  • University of North Carolina
  • Duke University
  • North Carolina State University
  • NSF National Evolutionary Synthesis Center (NESC)
  • NIH Carolina Center for Exploratory Genetic Analysis (CCEGA)

Neutron Science Gateway
• 17 instruments
• Users worldwide get “beam time”
• Need access to their data and post-processing capabilities

Flood Modeling/Homeland Security
Gordon Wells, UT; David Maidment, UT; Budhu Bhaduri, ORNL
Large-scale flooding along Brays Bayou in central Houston triggered by heavy rainfall during Tropical Storm Allison (June 9, 2001) caused more than $2 billion of damage.

OSG / one VO: CMS…
Science Communities and Outreach
• Communities
  • CERN’s Large Hadron Collider experiments
  • Physicists working in HEP and similarly data-intensive scientific disciplines
  • National collaborators and those across the digital divide in disadvantaged countries
• Scope
  • Interoperation between LHC Data Grid Hierarchy and ETF
  • Create and deploy scientific data and services Grid portals
  • Bring the power of ETF to bear on LHC physics analysis: help discover the Higgs boson!
• Partners
  • Caltech
  • University of Florida
  • Open Science Grid and Grid3
  • Fermilab
  • DOE PPDG
  • CERN
  • NSF GriPhyN and iVDGL
  • EU LCG and EGEE
  • Brazil (UERJ,…)
  • Pakistan (NUST,…)
  • Korea (KAIST,…)
[Figure: LHC Data Distribution Model]

So how will we meet all these needs?
• With RATS! (Requirements Analysis Teams)
• Organized RATS
  • Collection, analysis and consolidation of requirements to jumpstart the work
• And milestones

Gateways RAT concludes after 2 months
• RAT team conducted interviews with all 10 Gateways
• Summarized requirements for each TeraGrid working group
• Draft a primer outline for new Gateways
• Organize this workshop to hear from others

RAT summary
• Community allocations
• Group accounts / limited privileges
• Need for portal accounting capabilities, but little development
• On-demand scheduling
• Classifications (3 types)
  • Portals, desktop apps, access point to other grids
• User model (3 modes)
  • Standard, portal, community

Actions for wg’s
tg-acctmgmt
• Support for accounts with differing capabilities
• Ability to associate a compute job with an individual portal user
• Scheme for portal registration and usage tracking
• Support for OSG’s Grid User Management System (GUMS)
• Dynamic accounts?
security-wg
• Define open port ranges
• Firewalls
• Community account privileges
• Need to identify the human responsible for a job for incident response
• Acceptance of other grid certificates
• TG-hosted web servers, cgi-bin code
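
The accounting and security items above share one requirement: when many portal users run jobs under a single community account, the gateway has to remember which individual submitted each job. The following is only a minimal sketch of that idea, not a TeraGrid or GUMS interface. Every name in it (`JobRecord`, `JobAuditLog`, the account and user strings) is a hypothetical illustration, and the log is kept in memory where a real gateway would persist it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class JobRecord:
    """One job submitted under a shared community account (hypothetical schema)."""
    job_id: str
    community_account: str   # shared account the job actually runs under
    portal_user: str         # individual person logged in to the gateway portal
    submitted_at: datetime


class JobAuditLog:
    """In-memory audit trail mapping jobs to portal users (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[str, JobRecord] = {}

    def record_submission(self, job_id: str, community_account: str,
                          portal_user: str) -> JobRecord:
        # Refuse duplicates so each job has exactly one responsible person.
        if job_id in self._records:
            raise ValueError(f"job {job_id!r} already recorded")
        record = JobRecord(job_id, community_account, portal_user,
                           datetime.now(timezone.utc))
        self._records[job_id] = record
        return record

    def responsible_user(self, job_id: str) -> Optional[str]:
        # Incident response: who is the human behind this job?
        record = self._records.get(job_id)
        return record.portal_user if record else None

    def usage_by_portal_user(self) -> Dict[str, int]:
        # Usage tracking: number of jobs per individual portal user.
        counts: Dict[str, int] = {}
        for record in self._records.values():
            counts[record.portal_user] = counts.get(record.portal_user, 0) + 1
        return counts


log = JobAuditLog()
log.record_submission("job-001", "lead_community", "alice")
log.record_submission("job-002", "lead_community", "bob")
log.record_submission("job-003", "lead_community", "alice")
print(log.responsible_user("job-002"))
print(log.usage_by_portal_user())
```

Keying the log by job identifier keeps the incident-response lookup direct, while the per-user count is the kind of figure a portal usage report would need.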

Actions for wg’s (2)
Web Services (currently no wg for this)
• Needs further study
• Some Gateways (LEAD, NMBR) have immediate needs
• Many will build on capabilities offered by GT4, but interoperability could be an issue
• Web Service security
• Interfaces to scheduling and account management are common requirements
software-wg
• Interoperability of CTSS and VDT for OSG
• Software installations across all TG sites
• Community software areas
portals-wg
• Variety of approaches needs further analysis
  • OGCE, In-VIGO, Clarens, Neutron Science Tomcat+Apache
• TG User Portal

Follow on RATs formed
• Web services RAT: Ivan Judson
  • GT4
• Portal technology RAT: John Cobb
  • OGCE
  • Clarens
  • In-VIGO
  • …
• OSG RAT: Stuart Martin
  • OSG/CMS DAC, porting CMS apps to TG resources
  • Job forwarding between gatekeepers
  • Exposing TG resources to OSG
  • …and vice versa !!

Gateways Primer Outline
1. Introduction
2. Science Gateway in Context
   a. Science Gateway (SGW) Definition(s)
   b. Science Gateway user modes
   c. Distinction between SGW and other TeraGrid user modes
3. Components of a Science Gateway
   a. User Model
   b. Gateway targeted community
   c. Gateway Services
   d. Integration with TeraGrid external resources (data collections, services, …)
   e. Organizational and administrative structure
4. TeraGrid services and policies available for Science Gateways
   a. Portal middleware tools (user portal and other portal tools)
   b. Account Management (user models, community accounts)
   c. Security environment (security models)
   d. Web Services
   e. Scheduling services (and meta-scheduling)
   f. Community accounts and allocations
   g. Community Software Areas
   h. All traditional TeraGrid services and resources
   i. Ability to propose additional services and how that would interact with TeraGrid operations
5. Responsibilities and Requirements for Science Gateways
   a. Interaction with and compatibility with TeraGrid communities
   b. Control procedures
      i. Community user identification and tracking (map TeraGrid usage to Portal user)
      ii. Use monitoring and reporting
      iii. Security and trust
      iv. Appropriate use
6. How to get started
   a. Existing resources
      i. Publication references
      ii. Web areas with more details
      iii. Online tutorials
      iv. Upcoming presentations and tutorials
   b. Who to contact for initial discussions
   c. How to propose a new Gateway
   d. How to integrate with TeraGrid Gateways efforts
   e. How to obtain a resource allocation

Timelines - Fall, 2005
• Deploy 3 prototype portals (LEAD, Bioinformatics, Evolutionary Biology)
• Define work plan and application characteristics (NVO, nanoHUB, Neutron Science)
• Port/install software (Homeland Security, Flood Analysis, OSG)
• Analyze Gateway needs, plans for OSG integration (TG)

Spring, 2006
• Explore authentication methods (NVO)
• Integrate TG compute resources, incl. support for large scale computing (LEAD, nanoHUB, Bioinf., Evo. Bio., HEP, OSG)
• Run a workshop (nanoHUB)
• Prototypes
  • Web/grid services (Bioinformatics)
  • Data archive hosting (Neutron Science)
  • Data federation models with compute support (Evo. Bio.)
  • Application hosting services, initial compute resource brokering and data federation. Test for security, scalability (TG)
• Code porting and verification (Homeland, Flood Modeling, OSG)
• TG/OSG security and accounting mechanisms (TG)

Today’s Schedule
• 10-11:30 Session 1 - Science portals (1) - Wilkins-Diehr/Goasguen
  • TG Science Gateways (20 min.), CCLRC, Bioscience, LEAD
• <coffee>
• 12-1:30 Session 2 - Science portals (2) - Oleksiak
  • Rick Stevens talk (45 min.), TG vis portal, nanoHUB
• <lunch>
• 2:30-4 Session 3 - Science portals (3) - Foster
  • Telescience, NAREGI, GridSAT, ORNL
• <break>
• 4:30-6 Session 4 - Job submission portals - Lim/Sekiguchi
  • GENIUS, PROGRESS, DEISA, HPC-Europa
• <break>
• 6:30-8 Session 5 - Enabling technologies - Dolkas
  • GRIA, GridASP, AAAA, InVIGO, OGCE
• <collapse>