User Support: Current Levels and Methods
Ralph Roskies, Scientific Director, PSC
January 9, 2009
User Survey Results
• 2008 user survey satisfaction ratings:
  Helpfulness of TeraGrid user support staff: 83.75%
  Promptness of ticket resolution: 82.25%
  Effectiveness of user support in solving problems: 79.5%
"User support is the most valuable aspect of TeraGrid" (Kelly Gaither: can you dig out the exact quote, if you have it, from the user who said this?)
User Support Overview
Category FTEs (PY4)
Frontline User Support 32.3
Advanced User Support 29.2
Online User Support 5.75
Advanced Support for EOT 4.25
• For Q4 2008: 1,118 PIs and 1,413 users charging SUs.
• In PY4, TG has ~70 FTE involved with user support.
• Managed in concert by the GIG ADs for Operations, User Support Coordination, Advanced Support, User Facing Presence, EOT, and Science Gateways, with guidance from the Science Director.
Note: these figures do not include substantial training efforts in HPC University or work on Common User Environments.
Frontline User Support
• Ticket Resolution and User Engagement
  Provide efficient and effective resolution of trouble tickets by TeraGrid-wide sharing of technical information and best practices.
  Refer issues that require >1 FTE-month to Advanced Support.
  Provide technical content for online information based on recent problems and user feedback.
  Provide ongoing personal contacts via the User Champions, Campus Champions, and Pathways programs.
  (Sergiu: what does this mean?)
  Organize the 2008 and 2009 user satisfaction surveys.
PY4: 32.3 FTE; $12K for external user survey contractor
Dear Dr. Hackworth,
Entire NCCU physics computational group is very grateful for your prompt action and in helping us to resolve for us this very significant problem.
(regarding rapid setup of an account)
Branislav Vlahovic
------------------------------
To: R.F. Costa
Subject: Re: Pople job
Hey Rick, …you are awesome man, thanks for all the help.
(setting up priority queue)
Abhijit Ramachandran, Dept. of Bioengineering, U. Texas at Arlington
------------------------------
…The support staff is extremely patient and helpful.
Keshav Pingali, Cornell
Advanced User Support
• Advanced Support for TeraGrid Applications
  Provide targeted, >1 FTE-month support to users’ application development and optimization efforts.
  Responsible for many of the TG Highlights
  Can be requested as Startup and Supplemental via POPS
  Often results in co-authorship, co-PIs in proposals, …
PY4: 15.25 FTE for ~25 ASTA collaborations
Happy New Year!
I just wanted to say that Roberto has really been working above and beyond the call of duty to get this all going -- we've been getting mail from him on his way to bed and on waking in the morning with his kids... our runs are all going in this morning, and with any luck at all we'll have a nice set of results to discuss at the AAS in San Diego next week.
Thanks for your help!
Mordecai-Mark Mac Low, American Museum of Natural History (Jan 3)
Advanced User Support
• Advanced Support for Projects
  Identify, deploy, harden, optimize and benchmark tools and application packages that benefit large numbers of users in a particular domain or across multiple domains.
  Examples include molecular dynamics (NAMD, AMBER, GROMACS, CHARMM, LAMMPS and DESMOND) and materials codes (CPMD, VASP, SIESTA, ABINIT), heavily used in TG.
PY4: 8.25 FTE for at least 3 cross-TG application infrastructure projects
A million thanks to you, and to all the folks at the PSC for your help with this BIG job. Without it, it could not have been done.
Jacobo Bielak, CMU
------------------------------
Our (consultant) sometimes comes up with his own suggestions before we even have a problem!
Steve Gottlieb (Indiana)
Advanced User Support - Gateways
• Subset of Advanced User Support program
– Same request process, just looking for different expertise
  • Perhaps Grid computing and workflows rather than optimization and scaling
– Some may request support in multiple expertise areas
– Targeted support a hallmark of the Gateway program early on
  • As the program was being formed, all gateway developers were guinea pigs, many received advanced support
• Today, moving toward a more sustainable production environment
PY4: 5.7 FTE for at least 10 SGW projects
Online User Support
• User Information Presentation
  Develop and maintain methods to provide users with current, accurate information from across the TeraGrid in a dynamic environment of resources, software and services.
  PY4: 3.5 FTE
• Information Production and Organization
  Maintain and update documentation content, including a knowledge base of brief answers, with follow-up references, to frequent user questions.
  PY4: 2.25 FTE for 250 new documents
PSC has exemplary organization for its user guides that simplifies migration to a new machine. All the right information is available for machine parameters, compilation and batch script development. It really reduces the barrier to starting out on a new computer.
Steve Gottlieb, Indiana University
Support for EOT
• Advanced Support for Education, Outreach and Training
  Prepare and deliver advanced HPC/CI content for the HPC University, as well as for education and outreach activities.
  In the first 3 quarters of 2008, new content has included:
  • Intro to Multi-Core Programming
  • TeraGrid New User Training
  • Hybrid Programming for Shared-Memory and Clustered SMP Systems
  • Introduction to Data Transfer and File Management on the TeraGrid
  • Introduction to Parallel Programming on Ranger, Clouds and Web 2.0
PY4: 4.25 FTE
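The per-program FTE figures quoted on the preceding slides can be checked against the category totals in the User Support Overview table. The tally below is only a sketch of that arithmetic: the dictionary keys are shorthand labels chosen here, not official program names, and `Decimal` is used so the sums compare exactly.

```python
from decimal import Decimal as D

# Category totals from the User Support Overview table (PY4 FTEs).
overview = {
    "Frontline User Support": D("32.3"),
    "Advanced User Support": D("29.2"),
    "Online User Support": D("5.75"),
    "Advanced Support for EOT": D("4.25"),
}

# Component figures quoted on the individual slides (labels are shorthand).
advanced_parts = {
    "ASTA": D("15.25"),
    "Advanced Support for Projects": D("8.25"),
    "Gateways": D("5.7"),
}
online_parts = {
    "Information Presentation": D("3.5"),
    "Information Production": D("2.25"),
}

advanced_total = sum(advanced_parts.values())
online_total = sum(online_parts.values())
grand_total = sum(overview.values())

print(f"Advanced components: {advanced_total} FTE")
print(f"Online components:   {online_total} FTE")
print(f"All categories:      {grand_total} FTE")
```

The component sums reproduce the 29.2 and 5.75 FTE category totals, and the four categories together come to 71.5 FTE, in line with the "~70 FTE" stated in the overview.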
Backup Slides
Training via HPC University Program
• Support for TeraGrid Training
  Provide a broad range of live, synchronous and asynchronous training opportunities. Work with external organizations to identify and promote all HPC training resources and opportunities for participation.
  Over the first three quarters of 2008, TeraGrid has provided training for 5,306 people through 75 training events and through access to 22 on-line tutorials.
www.hpcuniv.org
Support for EOT
• Support for Education, Outreach and Training
  Prepare current and future generations of STEM practitioners, significantly larger and more diverse than today's, to actively contribute to advancing scientific discovery.
  Over the first three quarters of 2008, EOT has engaged 8,421 people in 190 EOT events, plus use of 22 on-line tutorials, engagement in over 80 tours of facilities, and TG’08.
  AUS staff have contributed significantly in support of these activities.
Current SGW Collaborations
• GIG
– GEON and Navajo Technical College
– PolarGrid
– Computational Infrastructure for Geodynamics
– Social Informatics DataGrid
– Allegheny General Hospital
– TeraDRE
– HUB gateways
– Asteroseismology
• RP
– Community Climate System Model (CCSM)
– Neutron Science Portal
– Earth System Grid
Computational Infrastructure for Geodynamics (CIG)
• Membership-governed organization
– 40 institutional members, 9 foreign affiliates
• Supports and promotes Earth science by developing and maintaining software for computational geophysics
How does CIG use the TeraGrid?
• Seismograms allow scientists to understand the ground motion
• Computationally intensive simulations run on TeraGrid using an assortment of 3D and 1D earth models produce synthetic seismograms
– Necessary input datasets provided via the portal
– Daemon (Python, Pyre) constantly polls the web site looking for work to do
  • Uses GSI-OpenSSH and MyProxy credentials to submit jobs, monitor jobs, and transfer output back to the portal
  • Sends status updates to the web site using HTTP POST
– Users can download results in ASCII and Seismic Analysis Code (SAC) format
  • Visualizations include "beachball" graphics depicting the earthquake's source mechanism, and maps showing the locations of the earthquake and the seismic stations using GMT (http://gmt.soest.hawaii.edu/)
• Researchers quickly receive results and can concentrate on the scientific aspects of the output rather than on the details of running the analysis on a supercomputer
• Future Directions
– Parameter explorations
– Custom earth models for users
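The polling pattern described on this slide (a daemon that looks for queued work, submits it, and reports status by HTTP POST) might look roughly like the sketch below. It is not the CIG daemon: it uses only the Python standard library instead of Pyre, and the portal URL, the `/status` endpoint, the JSON payload shape, and all function names are assumptions made for illustration. Job submission and network I/O are passed in as callables, so the GSI-OpenSSH/MyProxy step is deliberately left out.

```python
import json
import time
import urllib.request

PORTAL_URL = "https://portal.example.org"  # hypothetical; the slides give no URL


def build_status_post(request_id, state):
    """Build (but do not send) an HTTP POST carrying a status update."""
    body = json.dumps({"id": request_id, "state": state}).encode("utf-8")
    return urllib.request.Request(
        f"{PORTAL_URL}/status",  # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def poll_once(fetch_pending, submit_job, send):
    """One pass: fetch queued requests, submit each, then report its status."""
    handled = []
    for request in fetch_pending():
        submit_job(request)  # stands in for the real job submission step
        send(build_status_post(request["id"], "submitted"))
        handled.append(request["id"])
    return handled


def run_daemon(fetch_pending, submit_job, send, interval_s=60.0, max_passes=None):
    """Poll repeatedly, sleeping between passes; max_passes=None runs forever."""
    passes = 0
    while max_passes is None or passes < max_passes:
        poll_once(fetch_pending, submit_job, send)
        passes += 1
        if max_passes is None or passes < max_passes:
            time.sleep(interval_s)
```

In a live setting `send` could be `urllib.request.urlopen`; keeping it injectable lets the loop be exercised without a network.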
Social Informatics Data Grid
• Heavy use of “multimodal” data
– Subject might be viewing a video, while a researcher collects heart rate and eye movement data
• Events must be synchronized for analysis; large datasets result
• Extensive analysis capabilities are not something that each researcher should have to create for themselves
http://www.ci.uchicago.edu/research/files/sidgrid.mov
How does SIDGrid use the TeraGrid?
• Computationally intensive tasks
– Speech, gesture, facial expression, and physiological measurements
  • Media transcoding for pitch analysis of audio tracks
  • Once stored in raw form, data streams converted to formats compatible with software for annotation, coding, integration, analysis
– fMRI image analysis
• Workflows for massive job submissions and data transfers using Virtual Data System (VDS)
• Workflows converted to concrete execution plan via Pegasus Grid planner
– TeraGrid information service (MDS)
– Replica location service (RLS)
– DAGMan and Condor-G/GRAM
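As a rough illustration of what turning a workflow into an execution order involves, the sketch below orders a small dependency graph so that every task runs after the tasks it depends on. It uses Python's standard `graphlib` module, not Pegasus, VDS, or DAGMan, and the task names are hypothetical stand-ins for the transcoding and analysis steps mentioned above.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "stage_in": set(),
    "transcode_audio": {"stage_in"},
    "pitch_analysis": {"transcode_audio"},
    "fmri_analysis": {"stage_in"},
    "stage_out": {"pitch_analysis", "fmri_analysis"},
}


def execution_plan(dag):
    """Return one valid run order in which dependencies come first."""
    return list(TopologicalSorter(dag).static_order())


plan = execution_plan(workflow)
print(plan)
```

Independent branches (here the audio and fMRI paths) may appear in either relative order, which is what leaves a scheduler free to run them in parallel.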
Purdue ASTA - TG-MCA05S015 18
Purdue ASTA Activity – TG-MCA05S015
P. A. Cheeseman ([email protected])
TeraGrid Allocations: TG-MCA05S015, TG-MCA05T015
• Milestones
  • 2006/02 – Adaptation of parameter sweep to Condor began.
  • 2006/05 – Condor adaptation plan reviewed.
    • Reduce job times to avoid preemption.
    • Improve fault tolerance.
    • Incorporate internal, adaptable, time limits.
    • Incorporate script level steps within program (self-checkpoint, seed iteration, etc.).
  • 2006/08 – Program adaptation complete and adapted code in production (see Slide 4).
  • 2007/06 – Presentation at TG07 (http://www.teragrid.org/events/teragrid07/archive/presentations/wednesday/TG07.PD.12)
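The adaptation goals listed under 2006/05 (short jobs, an internal time limit, self-checkpointing, and seed iteration inside the program) can be sketched as follows. This is not the Purdue code: the JSON checkpoint format, the function name, and its parameters are assumptions chosen for illustration. The job iterates over seeds, saves progress after each one, and stops cleanly once its own time budget is spent, so a preempted or restarted job resumes rather than starting over.

```python
import json
import os
import time


def run_with_checkpoint(seeds, work, checkpoint_path, time_limit_s,
                        clock=time.monotonic):
    """Process seeds in order, checkpointing after each one.

    Returns (finished, results). finished is False when the internal
    time limit was reached before all seeds were processed.
    """
    start = clock()
    state = {"next": 0, "results": []}
    if os.path.exists(checkpoint_path):
        # Resume from where a previous (possibly preempted) run stopped.
        with open(checkpoint_path) as fh:
            state = json.load(fh)
    while state["next"] < len(seeds):
        if clock() - start >= time_limit_s:
            return False, state["results"]  # stop cleanly before the limit bites
        state["results"].append(work(seeds[state["next"]]))
        state["next"] += 1
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(state, fh)
        os.replace(tmp_path, checkpoint_path)  # atomic swap of the checkpoint
    return True, state["results"]
```

Writing to a temporary file and renaming it means an interruption mid-write leaves the previous checkpoint intact, which is one simple way to address the fault-tolerance goal.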
• Milestones (cont.)
  • 2007/12/24 – Initial computations complete.
    • ~6M jobs completed.
    • 4M hours delivered.
    • 240+ hours/hour average delivery rate.
    • Peak rates of 2000+ hours/hour.
    • 3,168,459 parameter sets processed (100 seeds per set).
  • 2008/01 – Refinement computations began.
    • Minor code adaptations necessary.
    • Less CPU intensive.
  • 2008/11 – Refinement computations complete.
    • 8M+ inputs processed.
  • Results presently being reviewed by Profs. Deem and Earl.
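The headline figures for the initial computations can be related to one another with a little arithmetic. The sketch below treats the rounded slide values ("4M hours", "240+ hours/hour") as exact, so the derived numbers are rough estimates only.

```python
# Figures as quoted on the slide; "4M" and "240+" are rounded values.
CPU_HOURS_DELIVERED = 4_000_000
AVG_RATE_HOURS_PER_HOUR = 240
PARAMETER_SETS = 3_168_459
SEEDS_PER_SET = 100

# Wall-clock time implied by delivering the total at the average rate.
wall_hours = CPU_HOURS_DELIVERED / AVG_RATE_HOURS_PER_HOUR
wall_days = wall_hours / 24

# Total number of (parameter set, seed) combinations evaluated.
seed_runs = PARAMETER_SETS * SEEDS_PER_SET

print(f"Implied wall-clock time: {wall_days:.0f} days")
print(f"Seed evaluations: {seed_runs:,}")
```

At the quoted average rate, 4M hours corresponds to roughly 694 days of continuous delivery, and 100 seeds for each of the 3,168,459 parameter sets amounts to 316,845,900 individual seed evaluations.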
[Chart: Unadapted Execution Times. Y-axis: Jobs (0 to 700); x-axis: Time (hours), 0 to 16.]
Unadapted Execution Time Breakdown
  20 min. or less: 3%
  20-40 min.: 6%
  40 min. to 1 hr.: 7%
  1-2 hr.: 24%
  2-3 hr.: 21%
  3 hr. or more: 39%
[Chart: Adapted Execution Times, Five Jobs per Set. Y-axis: Jobs (0 to 14000); x-axis: Time (hours), 0 to 3.]
Adapted Execution Time Breakdown, Five Jobs per Set
  1 hr. or less: 93%
  1-2 hr.: 6%
  2-3 hr.: 1%