E-infrastructure shared between Europe and Latin America
www.eu-eela.org
José Manuel Gutiérrez
EELA is a project funded by the European Union under contract 026409
EELA
Applied Meteorology Group
http://www.meteo.unican.es
High-Performance GRID Computing. Activities within EELA Project in
Biomedicine and Climate
2nd International Seminar on Genomics, Proteomics and Bioinformatics
Popayán (Colombia), 25-27 Oct. 2006.
Local clusters in Santander
Applications
Many disciplines have problems that require high-performance computing, through process parallelisation and/or the execution of many jobs.
Figure labels: Surgery planning & visualisation; Flooding control; MIS; HEP data analysis; Weather & pollution modelling; Task 1.0: Co-ordination & management.
HEP data flow: 40 MHz (40 TB/sec) -> level 1 - special hardware -> 75 KHz (75 GB/sec) -> level 2 - embedded processors -> 5 KHz (5 GB/sec) -> level 3 - PCs -> 100 Hz (100 MB/sec) -> data recording & offline analysis
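As a rough check of the HEP data-flow figures quoted above, each bandwidth can be divided by its event rate to get the implied event size, and consecutive rates can be divided to get the reduction achieved at each level. This is only an arithmetic sketch: the stage labels, the pairing of each rate with a stage, and the decimal reading of TB/GB/MB are assumptions made for illustration.

```python
# Event rate (Hz) and bandwidth (bytes/s) as quoted on the slide.
# Assumptions: decimal units (1 TB = 1e12 bytes) and these stage labels.
STAGES = [
    ("input to level 1", 40e6, 40e12),
    ("input to level 2", 75e3, 75e9),
    ("input to level 3", 5e3, 5e9),
    ("data recording", 100.0, 100e6),
]


def event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Average size of one event implied by a rate and a bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


def reduction_factors(stages):
    """Ratio between the event rates of consecutive stages."""
    rates = [rate for _, rate, _ in stages]
    return [rates[i] / rates[i + 1] for i in range(len(rates) - 1)]


if __name__ == "__main__":
    for name, rate, bandwidth in STAGES:
        size_mb = event_size_bytes(rate, bandwidth) / 1e6
        print(f"{name}: {size_mb:.1f} MB per event")
    for factor in reduction_factors(STAGES):
        print(f"rate reduced by a factor of {factor:.1f}")
```

Under these assumptions every stage implies the same event size, so the bandwidth drop comes entirely from discarding events, not from shrinking them.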
CrossGrid - International Testbed Organisation
UCY Nikosia
DEMO Athens
AUTH Thessaloniki
CYFRONET Cracow
ICM & IPJ Warsaw
PSNC Poznan
II SAS Bratislava
FZK Karlsruhe
UvA Amsterdam
CSIC Valencia
UAB Barcelona
CSIC Santander
CSIC Madrid
LIP Lisbon
USC Santiago
TCD Dublin
CrossGrid Project
GRID Computing
• Developed in the mid-1990s.
• Uses distributed, heterogeneous, dynamic and, typically, parallel computing resources.
• Globus Toolkit and OGSA: middleware and standard for building applications.
• Several research projects and commercial products are developing this technology.
"It would be fantastic if computing power were available in the same way as electricity (the grid)" (Ian Foster).
Structure of the GRID
Grid Resource Allocation Manager (GRAM)
Monitoring and Discovery System (MDS)
Grid Resource Information Service (GRIS)
Grid Index Information Service (GIIS)
GridFTP
Example of a "job"
Figure labels: UI; Broker; Optimal Resource Allocation; Replica/Data Manager?; WEB SERVICE FINDER?; Master; Slave; CE; SE (five instances); CACHED.
EELA. Goal and Objectives
• Goal: To build a bridge between consolidated e-Infrastructure initiatives in Europe and emerging ones in Latin America.
• Objectives:
– Establish a human collaboration network between Europe and Latin America.
– Set up a pilot e-infrastructure in Latin America.
– Identify and promote a sustainable framework for e-Science in Latin America.
Partners
EU
Spain: CIEMAT, CSIC, UPV, RED.ES, UC
Italy: INFN
Portugal: LIP
International: CLARA, CERN
Latin America
Venezuela: ULA
Cuba: CUBAENERGIA
Chile: UTFSM, REUNA, UDEC
Peru: SENAMHI
Mexico: UNAM
Argentina: UNLP
Brazil: UFRJ, CNEN, CECIERJ/CEDERJ, RNP, UFF
Structure
WP1. Project administrative and technical management
WP2. Pilot testbed operation and support
GEANT, RedCLARA and European and Latin American NRENs will provide the network infrastructure. The grid infrastructure will be based on the EGEE middleware framework.
WP3. Identification and support of Grid-Enhanced applications
WP4. Dissemination activities
WP3. Applications
Task 3.1. Biomed Applications
Task 3.2. HEP Applications
Task 3.3. Additional Applications: E-Learning, Climate
Deliverable D3.1.1. Selection Report: Biomedicine and HEP Applications
Biomedical Applications
• Context:
– The biomedical applications being deployed on the pilot EELA infrastructure have been identified from existing ones already in use in EGEE, and from the expertise and research activity of the LA and EU partners in EELA.
– The target of the biomedical part of EELA is to deploy Grid applications for the biomedical LA community, to improve their research excellence and to foster the use of Grids in this community.
– Applications are selected from the portfolio of existing and new applications according to their relevance for LA partners.
• Project:
– Two applications from the portfolio of mature EGEE biomedical applications have been selected by LA partners: GATE and WISDOM.
– Two new applications were identified from the specific needs of LA partners: BLAST and Phylogenetics.
– EELA has joined the Ibero-American Portal of Bioinformatics.
GATE
• GATE: Geant4 Application for Tomographic Emission
– GATE is a C++ platform based on the Monte Carlo Geant4 software, designed to model nuclear medicine applications (PET, SPECT). The platform is also suitable for radiotherapy and brachytherapy treatment planning.
– The objective of GATE is to use the Grid environment to reduce the computing time of Monte Carlo simulations, so that higher accuracy can be reached in a reasonable period of time.
– The main benefit of using the Grid is that it gives medical users access to realistic Monte Carlo simulations for their research in radiotherapy planning. The EELA Grid provides enough computational resources to meet the large requirements of this processing.
– GATE is already installed at several EELA partner sites.
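The reason Monte Carlo simulations suit a Grid is that the total number of simulated events can be split into independent jobs, each with its own random seed, and the partial results merged afterwards. The sketch below illustrates that pattern only: the "physics" is a toy stand-in (estimating pi by sampling points in a square), not Geant4 or GATE, and the function names `run_job` and `merge` are hypothetical.

```python
import random


def run_job(seed, n_events):
    """One independent job: simulate n_events with a private random stream.

    Toy physics (an assumption for illustration): count random points that
    fall inside the unit quarter circle.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_events):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, n_events


def merge(results):
    """Combine partial tallies from all jobs into a single estimate."""
    total_hits = sum(hits for hits, _ in results)
    total_events = sum(n for _, n in results)
    return 4.0 * total_hits / total_events


if __name__ == "__main__":
    # On a Grid each call would be a separate job on a worker node;
    # here the jobs simply run one after another.
    results = [run_job(seed, 20_000) for seed in range(10)]
    print("merged estimate:", merge(results))
```

Because the jobs share nothing except the final merge, doubling the number of worker nodes roughly halves the wall-clock time for the same statistical accuracy.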
WISDOM
• WISDOM: Wide In Silico Docking On Malaria
– The objective of WISDOM is the creation of new inhibitors for a family of proteins produced by Plasmodium falciparum, the protozoan parasite that causes malaria.
– The application consists of deploying high-throughput virtual screening for in silico drug discovery for neglected diseases.
– Interest of EELA partners: selection of new targets for malaria; study of new targets for new parasitic diseases; and contribution of resources to the WISDOM data challenge.
– The benefit of Grids is a shorter development cycle for new drugs for neglected diseases: they provide in silico simulations for selecting adequate reactors for specific targets, and the infrastructure needed to supply the required computational power.
PHYLOGENY (MrBayes)
• Phylogeny with the MrBayes program:
– A phylogeny is a reconstruction of the evolutionary history of a group of organisms.
– Bayesian inference is a powerful mathematical method, implemented in the MrBayes program, for estimating phylogenetic trees based on the “a posteriori” probability distribution of the trees.
– Phylogenetic tools are in wide demand in the LA bioinformatics community.
– A Grid service for the parallelised version of MrBayes will be developed and a simple interface will be deployed on the Ibero-American Portal of Bioinformatics. This Grid-enabled service will use EELA resources to run phylogenetic studies at high performance.
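To make the "a posteriori" probability of a tree concrete, the sketch below applies Bayes' rule to a handful of candidate trees: the posterior is proportional to prior times likelihood, normalised over the candidates. This is an illustration only. MrBayes samples tree space with Markov chain Monte Carlo instead of enumerating trees, and the log-likelihood values used here are made-up numbers, not results.

```python
import math


def posterior(log_priors, log_likelihoods):
    """Posterior probability of each candidate tree.

    Works in log space and subtracts the maximum before exponentiating,
    because tree log-likelihoods are typically large negative numbers.
    """
    log_joint = [lp + ll for lp, ll in zip(log_priors, log_likelihoods)]
    largest = max(log_joint)
    weights = [math.exp(value - largest) for value in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three hypothetical tree topologies with a uniform prior.
    log_priors = [math.log(1.0 / 3.0)] * 3
    log_likelihoods = [-1000.0, -1002.0, -1005.0]  # illustrative values
    for tree, prob in enumerate(posterior(log_priors, log_likelihoods), 1):
        print(f"tree {tree}: posterior {prob:.4f}")
```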
BLAST
• BLAST: Basic Local Alignment Search Tool
– BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
– Finding homologous sequences is computationally intensive. The size of the available non-redundant databases increases daily, and since the databases are periodically updated, it is advisable to periodically re-run previous studies.
– Using the Grid will make it possible to analyse more fragments and to update this information periodically.
– A Grid service for running MPIBlast on the EELA grid, accessed through the Ibero-American Portal of Bioinformatics (CECALC-ULA), has been developed.
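"Local similarity" can be illustrated with the Smith-Waterman dynamic-programming algorithm, which computes the best-scoring local alignment exactly. BLAST itself is a faster heuristic that approximates this kind of search, so the sketch below is not what BLAST or MPIBlast runs internally. The scoring values (match, mismatch, gap) are arbitrary choices for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Each cell holds the best score of an alignment ending at that pair of
    positions; scores are floored at 0 so an alignment can start anywhere.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,
                previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                previous[j] + gap,       # gap in b
                current[j - 1] + gap,    # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # The shared region ACGT is found even though the flanks differ.
    print(smith_waterman("GGACGTCC", "TTACGTAA"))
```

The cost grows with the product of the two sequence lengths, which is one way to see why searching large databases is computationally intensive.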
Blast in Grids (BiG)
Ignacio Blanquer
Universidad Politécnica de Valencia
• BLAST (Basic Local Alignment Search Tool) is a Bioinformatics Procedure Applied to Identify Compatible Protein and Nucleotide Sequences in Protein and DNA Databases.
• BLAST can be Applied, Among Other Uses, to Annotate the Estimated Function of Unknown Sequences.
• BLAST is Computationally Intensive.
• BLAST in Grids (BiG)
– Grid Interface to MPI Blast.
– Access Through a Web Portal (http://portal-bio.ula.ve/).
– Access to EELA Grid Through Gate-to-Grid Using a Web Service Resource Framework Interface.
Figure: WEB Environment (Bioinformatics Portal) -> Gate-to-Grid -> EELA Grid Infrastructure (CE ramses.dsic.upv.es, SE aker.dsic.upv.es, WNs).
Inputs: FASTA File (Input Sequence), e.g. AGTACGTAGTAGCTGCTGCTACGTGGCTAGCTAGTACGTCAGACGTAGATGCTAGCTGACTCGA; Execution Parameters; Protein Database (e.g. Non Redundant).
Output: Output Matches.
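The input to the service is a FASTA file: a header line starting with ">" followed by one or more lines of sequence. A minimal reader for that format might look like the sketch below. The function name `parse_fasta` and the record name `query1` are hypothetical, and a real pipeline would more likely rely on an established library.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence.

    The record name is the first word after '>'; sequence lines belonging
    to the same record are joined and upper-cased.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


if __name__ == "__main__":
    example = (
        ">query1 input sequence from the slide\n"
        "AGTACGTAGTAGCTGCTGCTACGTGGCTAGCT\n"
        "AGTACGTCAGACGTAGATGCTAGCTGACTCGA\n"
    )
    for record, sequence in parse_fasta(example).items():
        print(record, len(sequence))
```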
• Design Objectives
– Easy Interface with High Compatibility (Web Service + NCBI Based): Same Parameters as BLAST. User-friendly and Intuitive.
– Support for Searching Simultaneously on Multiple Databases: Parallel Process on Multiple Database Queries.
– Architecture Exportable to Other Common Problems: Modular Structure of the System Components. Fast Capability to Migrate to Other Problems.
– Scalability: Data Partition in Grid Approach Gives Scalability with Huge Quantities of Data.
– High Performance: Grid Computing + MPI Parallel Jobs in Dedicated Clusters.
– Robust: Fault Tolerance on Server and Client.
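The data-partition objective can be sketched as follows: split the database into fragments, search each fragment in parallel, then merge and rank the partial hit lists. This is a toy illustration of the pattern, not the BiG implementation. The similarity score (number of shared 3-letter words) is a deliberately simple stand-in for BLAST scoring, local threads stand in for Grid jobs, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k=3):
    """Set of all length-k words in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def search_partition(query, partition, k=3):
    """Score every (name, sequence) entry in one database fragment."""
    query_words = kmers(query, k)
    return [(name, len(query_words & kmers(seq, k))) for name, seq in partition]


def partition_db(db, n_parts):
    """Split the database round-robin into n_parts fragments."""
    return [db[i::n_parts] for i in range(n_parts)]


def search(query, db, n_parts=2, top=3):
    """Search all fragments in parallel and merge the ranked hits."""
    parts = partition_db(db, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial = list(pool.map(lambda part: search_partition(query, part), parts))
    hits = [hit for part in partial for hit in part]
    hits.sort(key=lambda hit: (-hit[1], hit[0]))  # best score first
    return hits[:top]


if __name__ == "__main__":
    database = [("s1", "ACGTACGT"), ("s2", "TTTTTTTT"), ("s3", "ACGTTTTT")]
    print(search("ACGTAC", database, n_parts=2, top=2))
```

Since each fragment is searched independently, adding fragments and workers lets the same scheme cope with a larger database, which is the scalability argument made above.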
• Hosted by the Ibero-American Portal of Bioinformatics (http://portal-bio.ula.ve), installed at the National Centre for Scientific Computation of the Universidad de Los Andes in Venezuela.
– The Application is Available Through the Bioinformatics Portal of CeCalcULA, Being Accessible for Registered Users. http://www.cecalc.ula.ve/blast/
– This portal also provides several on-line applications for registered users. It currently has almost 600 registered users from 70 countries (although 90% come from 10 countries).
• The Service is Also Being Used by the Genomic Centre of the Valencian Institute of Research on Agriculture (Centro de Genómica, Instituto Valenciano de Investigaciones Agrarias).
• Executions
– 309 Runs Since June 2006.
– 3200 CPU Hours (133 Days) Consumed.
• Alignment with BLAST
demo
msraalrlkipmpatmadfafpslrafsivvaldkqhgigdgesipwrvpedmaffkdqttllrnkkpptekkrnavvmgrktwesvpvkfrplkgrlnivlsskatveellaplpegkraaaaqdvvvvngglaealrllarppycssietaycvggaqvyadamlspcveklqevyltriyttapactrffpfppentttawdlassqgrrkseadglefeickyvprnheerqylel
Vicente Hernández, Ignacio Blanquer
Universidad Politécnica de Valencia
Camino de Vera s/n
46022 Valencia, Spain
Tel: +34-963879743
Fax. +34-963877274
E-mail: [email protected]
Climate Models
Conservation of energy, mass, momentum and water vapour; equation of state for gases.
Grid: 360 x 180 x 32 x nvar
Variables: v = (u, v, w), T, p, ρ = 1/α and q
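The grid dimensions above give a feel for the data volume of a single model state. The sketch below multiplies them out, assuming the three numbers are longitude, latitude and vertical levels, that nvar counts the stored variables, and that each value is an 8-byte float. The choice of nvar = 6 in the example is hypothetical.

```python
# Grid dimensions quoted on the slide; their interpretation as
# longitude x latitude x vertical levels is an assumption.
NLON, NLAT, NLEV = 360, 180, 32


def grid_points():
    """Number of grid points in one three-dimensional field."""
    return NLON * NLAT * NLEV


def state_size_bytes(nvar, bytes_per_value=8):
    """Memory needed to hold nvar fields at one time step."""
    return grid_points() * nvar * bytes_per_value


if __name__ == "__main__":
    nvar = 6  # hypothetical count, e.g. u, v, w, T, p and q
    print("grid points per field:", grid_points())
    print("one time step:", state_size_bytes(nvar) / 1e6, "MB")
```

Multiplying by the number of stored time steps of a long simulation shows why climate output is usually subset on the server side before being transferred.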
ESG Home
Subsetting List
Grid+OpenDAP
Figure (axes: Transparency, Performance):
Typical Application: Application -> netCDF lib -> Data (local)
OpenDAP via http: Application -> OpenDAP Client -> OpenDAP Server -> Data (remote)
ESG (Grid + DODS), OpenDAP via Grid: Distributed Application -> ESG client -> ESG Server -> Big Data (remote)
ESG adds: Security, Resource Mgmt, Analysis functions
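With OpenDAP over http, a client requests only part of a remote array by appending a constraint expression to the dataset URL, with one `[start:stride:stop]` hyperslab per dimension. The helper below builds such a URL as a sketch. The server address and the variable name `T` are hypothetical, the function names are not part of any library, and some clients percent-encode the brackets before sending the request.

```python
def hyperslab(start, stop, stride=1):
    """One dimension of an OpenDAP (DAP2) array subset; stop is inclusive."""
    return f"[{start}:{stride}:{stop}]"


def subset_url(dataset_url, variable, slabs, response="ascii"):
    """Build a URL that requests a subset of one variable.

    slabs holds one (start, stop) or (start, stop, stride) tuple per
    dimension of the variable. The response suffix selects the reply
    format, e.g. 'ascii' for text or 'dods' for binary data.
    """
    constraint = variable + "".join(hyperslab(*slab) for slab in slabs)
    return f"{dataset_url}.{response}?{constraint}"


if __name__ == "__main__":
    # Hypothetical dataset: first time step, ten latitude rows,
    # every second longitude column.
    print(subset_url(
        "http://example.org/dods/model.nc",
        "T",
        [(0, 0), (10, 19), (0, 359, 2)],
    ))
```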