Biomedical and Bioscience Gateway to National Cyberinfrastructure John McGee Renaissance Computing Institute mcgee@renci.org

Page 1: Biomedical and Bioscience Gateway to National Cyberinfrastructure John McGee Renaissance Computing Institute mcgee@renci.org

Biomedical and Bioscience Gateway to National Cyberinfrastructure

John McGee
Renaissance Computing Institute
mcgee@renci.org

Meeting the needs of individuals

•Many individuals …

•No single science environment fits all patterns of R&E
– Education and entry level research
– Domain scientists with little large scale IT expertise
– Domain scientists with significant IT expertise

•Integrating large scale National CI with science applications is often non-trivial
– Policy, Technology, Security, History
– Even for domain scientists with significant IT expertise

Meeting the needs of individuals

•RENCI is taking a four tier approach to Gateway activities:
– Web Portal environment
– Workflow development environment with supporting deployed infrastructure
– Client applications that consume RENCI hosted TG enabled web services
– Adaptation of a research team’s existing job management scripts

•Working with key research teams to
– Provide usage scenarios to drive the development of the infrastructure
– Demonstrate success and entice collaborations with new teams

Meeting the needs of individuals

•RENCI’s role as a Gateway: Shield or Bootstrap
– Depending on how the research group wants to collaborate
– Use the appropriate tool for the scenario:
  •portal, workflow, web services, job management adaptation
– Example: assist a team with a well developed domain science application to become a self sufficient Science Gateway of their own
  •iFold: A web portal for interactive protein folding simulations
– Example: Shield a team from detailed knowledge of the underlying infrastructure by adapting their job management scripts and maintaining some key infrastructure

•Deploy and maintain infrastructure to support the program
– TG back-ended web services, bio-databases, strongly typed data representations and ontologies, resource selection and meta-scheduling, etc.

>140 Bioscience Applications

Simple form fill and submit to run bioscience jobs on TeraGrid

Guided tour through popular Bioscience applications

Most successful with Education segment. Used in two graduate level Biology courses.

250 registered users. On average, about 10% are active per month.

Completed workflows embedded into the portal are beginning to pique the interest of research teams.

Workflow and Web Service Hosting

Taverna

BioMart

BioMoby

Hosted Web Services

Deploy, configure, and maintain Bioscience backend compute/data components on TG

Job Management Adaptation

Protein Design Application

24 hours to complete 3,000 jobs, roughly 300 jobs active at a time

Used the researchers’ existing job management scripts, modified by RENCI to submit to a RENCI hosted service that managed the workload of jobs out onto the grid

•The compute jobs for this specific example were not serviced by TeraGrid. They were serviced by a grid infrastructure, and the example is representative of the model we are applying to TeraGrid
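The throttling pattern described above (many jobs queued, only a bounded number active at once) can be sketched roughly as follows. This is an illustrative sketch, not RENCI's hosted service: the names `run_throttled`, `mean_job_hours`, and `fake_submit` are hypothetical, local threads stand in for grid submissions, and the duration estimate assumes the number of active jobs stayed near 300 for the whole 24 hours.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(jobs, submit, max_active):
    """Run submit(job) for every job, keeping at most max_active in flight.

    Returns (results in input order, peak number of simultaneously active jobs).
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def tracked(job):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            return submit(job)
        finally:
            with lock:
                active -= 1

    # The pool size is the throttle: remaining jobs wait in the pool's queue.
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        results = list(pool.map(tracked, jobs))
    return results, peak


def mean_job_hours(total_jobs, wall_hours, avg_active):
    """Little's law estimate: mean time per job = average active / throughput."""
    throughput = total_jobs / wall_hours  # jobs completed per hour
    return avg_active / throughput


if __name__ == "__main__":
    def fake_submit(job_id):
        # Hypothetical stand-in for handing one job to the grid.
        time.sleep(0.01)
        return job_id * 2

    results, peak = run_throttled(range(30), fake_submit, max_active=3)
    print(len(results), "jobs finished; peak active:", peak)
    print("estimated mean job time (hours):", mean_job_hours(3000, 24, 300))
```

With the figures from the slide (3,000 jobs, 24 hours, about 300 active), the estimate works out to roughly 2.4 hours per job, which is only as good as the assumption that concurrency stayed near 300 throughout.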

Working with six research teams across three universities

Conclusions

•Enable science by meeting the immediate compute/data needs of research teams, while building meaningful infrastructure that can help transform future activities and enable new modalities of research

•Educate a new generation of biomedical and bioscience researchers

•Reach a diverse community with a diverse set of tools, best practices, and deployed infrastructure

•Leverage success, which is highly viral

•Shield or educate researchers as appropriate on the intricacies of large scale distributed IT and HPC systems in support of science