Introduction of NISN and Bioinformatics applications Seok Jong Yu, Ph.D. High-Performance Biocomputing Team / NISN

Introduction of NISN and Bioinformatics applications web/NISN-bioinfo_sjyu.pdf · Activities in NISN • NISN is established and managing the HPC resource and networking services


Introduction of NISN and Bioinformatics applications

Seok Jong Yu, Ph.D.

High-Performance Biocomputing Team / NISN


Contents

• Introduction of KISTI and NISN

• Resource and Services

• Bioinformatics applications

• Conclusion


About KISTI

Korea Institute of Science and Technology Information


About KISTI

• President
• National Nano-Technology Policy Center
• Div. of Advanced Information Research
• Div. of Policy Research
• Information Service Center
• Software Research Center
• Technology Information Analysis Center
• Industry Information Analysis Center
• SMB Knowledge Support Center
• Div. of Planning
• Div. of Administration
• Div. of Information Analysis
• NTIS Center
• National Institute of Supercomputing and Networking: Supercomputing Service Center, Supercomputing R&D Center, KREONET Center, Dept. of Supercomputing Strategy


Employee status

• Total 529 employees (Jan. 2013)
• US$120 million budget

Revenue (unit: US$ million)
• Government General Fund: 72% (US$86)
• Others: 28% (US$34)

Expenses (unit: US$ million)
• Personnel: 21%
• Research: 64%
• Others: 14%
• Amounts shown on the chart: US$20, US$58, US$18

Staff composition
• Researcher: 79%
• Admin.: 12%
• Technician: 6%
• Others: 3%


About NISN

National Institute of Supercomputing and Networking


Organization of NISN

National Institute of Supercomputing and Networking
• Dept. of Supercomputing Strategy
• Supercomputing Service Center
• Supercomputing R&D Center
• KREONET Center

Departments and teams
• Dept. of Supercomputing Infrastructure Operation
• Dept. of Supercomputing Service Integration
• Dept. of Supercomputing User Support
• Dept. of Global Science Experiment Data Hub
• Dept. of Supercomputing Technology Development
• Dept. of Advanced Application Environment Development
• High-Performance Biocomputing Team
• Advanced Visualization Team
• Dept. of KREONET Service
• Dept. of Science & Technology Security
• Networking Service Development Team
• Smart Education Service Team


Personnel: 200 people

Form of Employment
• Permanent: 55%
• Project Based: 43%
• Student & Intern: 3%

Degree Obtained
• Doctor: 36%
• Master: 42%
• Bachelor: 23%

Budget: ~48.2 million USD


Resource and Service


Computing Resource

▣ History of KISTI Supercomputers (timeline figure)

• [KISTI-1] Cray 2S
• [KISTI-2] Cray C90
• [KISTI-2+] Cray T3E
• [KISTI-3] NEC SX-5/6
• [KISTI-3] IBM p690
• [KISTI-4] SUN Blade 6048
• [KISTI-4] IBM p595

Years marked on the timeline: 1988, 1993, 1997, 2001, 2002, 2003, 2008, 2009, 2011

Per-system performance labels: 2 GFlops, 16 GFlops, 115 GFlops, 320 GFlops, 4.4 TFlops, 36 TFlops, 324 TFlops

Aggregate capacity labels: 2 GF, 16 GF, 131 GF, 5.2 TF, 360 TF


Computing Resource

Tachyon (SUN)

▣ Hardware Specification : Tachyon

• Cluster system
• Ranked 14th in the TOP500 list in Nov. 2009

Specification table (columns: Phase 1, Phase 2; the numeric values did not survive extraction)
• Manufacture: SUN Blade
• Architecture: cluster
• Process model: AMD Barcelona (Phase 1), Intel Nehalem (Phase 2)
• # of Nodes
• # of CPU cores (per node)
• Rpeak (Tflops)
• Total Memory
• Disk Storage
• Tape Storage
• Interconnection Network


Computing Resource

Gaia (IBM)

▣ Hardware Specification : Gaia

• Cluster of SMPs
• Memory-intensive computing system for massively parallel jobs
• Ranked 393rd in the TOP500 list in Nov. 2009

Specification table (columns: Phase 1, Phase 2; the values did not survive extraction)
• Manufacture
• Architecture
• Process model
• # of Nodes
• # of CPU cores (per node)
• Rpeak (Tflops)
• Total Memory
• Disk Storage
• Interconnection Network


Networking Resource

• IP Network + Optical Private Network
• 24×365 network monitoring by KREONET-NOC
• Performance Enhancement & Sharing Operating Information
• Security Service by CERT-KREONET


Global Network Service

GLObal RIng Network for Advanced Applications Development

• GLORIAD is the world’s first 10 Gbps global R&D network, connecting the entire world with a ring-shaped optical network across 15 countries. (Secure 2.5G backup link)

• Supports data transfer for international-class collaborative R&D


Bioinformatics Applications


Bioworks

• Bioworks is a workflow system based on a client-server architecture

• The client tool provides a module for constructing workflows

• The Bioworks engine on the server side runs the submitted workflows on HPC resources

Diagram components: Bioworks Client, Workflow XML File, HPC resources, Storage resources
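A minimal sketch of how a server-side engine might read a workflow XML file and decide the run order of its tasks. The slides do not show the Bioworks schema, so the element and attribute names (`workflow`, `task`, `id`, `command`, `depends`) and the command names are hypothetical, and the sketch only orders tasks; it does not submit anything to HPC resources.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical workflow file; the real Bioworks XML format is not shown in the slides.
WORKFLOW_XML = """
<workflow name="rnaseq-demo">
  <task id="map" command="run_mapping"/>
  <task id="assemble" command="run_assembly" depends="map"/>
  <task id="diff" command="run_diff" depends="assemble"/>
</workflow>
"""

def load_tasks(xml_text):
    """Parse the workflow XML into {task_id: {"command": ..., "depends": [...]}}."""
    root = ET.fromstring(xml_text.strip())
    tasks = {}
    for node in root.findall("task"):
        deps = [d.strip() for d in node.get("depends", "").split(",") if d.strip()]
        tasks[node.get("id")] = {"command": node.get("command"), "depends": deps}
    return tasks

def execution_order(tasks):
    """Return task ids ordered so that every task comes after its dependencies."""
    graph = {task_id: set(task["depends"]) for task_id, task in tasks.items()}
    return list(TopologicalSorter(graph).static_order())

if __name__ == "__main__":
    tasks = load_tasks(WORKFLOW_XML)
    for task_id in execution_order(tasks):
        print(task_id, "->", tasks[task_id]["command"])
```

Keeping the dependency graph in the XML file lets the client stay a pure editor while the engine alone decides scheduling.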


Bioworks


Bioworks

• Integrated Result Analysis Browser

• provides plug-in GUI components which allow integrative analysis with visualizing tools


Genomics analysis

RNA-seq pipeline steps
• Input raw data (.fastq)
• Mapping to the reference
• Execute cufflinks
• Execute cuffmerge
• DEG finding with cuffdiff
• Convert BAM to GFF3 format
• Convert GTF to GFF3 format
• Save GFF3 file to MongoDB

Data passed between steps
• Reference indexing db
• Mapped result (BAM)
• Transcription Result File (GTF)
• Transcription Information File (gtf, binary files, annotation file)
• gtf & annotation file
• DEG Result (GTF)

• RNA-seq analysis tools …
• … MongoDB
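The GTF-to-GFF3 conversion and the storage step can be illustrated with a small sketch. It is not the converter used in the pipeline: it assumes well-formed nine-column GTF lines, rewrites only the attribute column (`key "value";` to `key=value`), does not escape special characters as full GFF3 requires, and stops at building a dictionary that a document store such as MongoDB could accept. All function names are made up for this example.

```python
def gtf_attrs_to_gff3(attr_field):
    """Rewrite a GTF attribute column (key "value"; ...) as GFF3 (key=value;...)."""
    pairs = []
    for chunk in attr_field.strip().split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition(" ")
        pairs.append((key, value.strip().strip('"')))
    return ";".join(f"{key}={value}" for key, value in pairs)

def gtf_line_to_gff3(line):
    """Convert one GTF feature line to a GFF3-style line (attribute column only)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    cols[8] = gtf_attrs_to_gff3(cols[8])
    return "\t".join(cols)

def gff3_line_to_record(line):
    """Turn a GFF3 line into a dict, the shape a document database would store."""
    seqid, source, ftype, start, end, score, strand, phase, attrs = line.split("\t")
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": dict(p.split("=", 1) for p in attrs.split(";") if p),
    }

if __name__ == "__main__":
    gtf = 'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
    gff3 = gtf_line_to_gff3(gtf)
    print(gff3)
    print(gff3_line_to_record(gff3))
```

Storing one dictionary per feature makes it straightforward to query by coordinate range or by attribute later.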


insilicoCell

• Client tool supports designing a biological reaction model

• It can import the BioModel database and SBML file

• It simulates the model using ODE tools and Boolean network analysis

• Simulation engine is implemented and installed on server side.

Design of a novel signal transduction pathway
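As an illustration of Boolean network analysis on a signalling model, the sketch below updates all nodes synchronously and reports the attractor (fixed point or cycle) that a starting state falls into. The three-node pathway and its rules are invented for this example and are not taken from insilicoCell.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: rule(state) for node, rule in rules.items()}

def find_attractor(state, rules, max_steps=100):
    """Iterate until a state repeats; return the list of states in the cycle."""
    seen = []
    for _ in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return seen[seen.index(key):]
        seen.append(key)
        state = step(state, rules)
    return None  # no repeat found within max_steps

# Hypothetical pathway: a receptor drives a kinase, and a phosphatase
# induced by the kinase switches the kinase off again (negative feedback).
RULES = {
    "receptor": lambda s: s["receptor"],
    "kinase": lambda s: s["receptor"] and not s["phosphatase"],
    "phosphatase": lambda s: s["kinase"],
}

if __name__ == "__main__":
    start = {"receptor": True, "kinase": False, "phosphatase": False}
    for state in find_attractor(start, RULES):
        print(dict(state))
```

With the receptor on, the negative feedback produces an oscillation; with the receptor off, the network stays in the all-off fixed point.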


insilicoCell

Architecture of insilicoCell


insilicoCell

Examples of kinetic modeling
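A small worked example of kinetic modeling with ODEs, given as a sketch rather than as the insilicoCell simulation engine. It assumes a reversible reaction A ⇌ B with mass-action rates, so dA/dt = -kf·A + kr·B and dB/dt = kf·A - kr·B, and integrates it with a hand-written fourth-order Runge-Kutta step. The rate constants and initial concentrations are arbitrary.

```python
def rk4_step(f, y, t, dt):
    """One classical Runge-Kutta (RK4) step for dy/dt = f(t, y), y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def reversible_reaction(kf, kr):
    """Mass-action kinetics for A <-> B with forward rate kf and reverse rate kr."""
    def rhs(t, y):
        a, b = y
        flux = kf * a - kr * b
        return [-flux, flux]
    return rhs

def simulate(f, y0, t_end, dt):
    """Integrate from t = 0 to t_end with a fixed step and return the final state."""
    y, t = list(y0), 0.0
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(f, y, t, dt)
        t += dt
    return y

if __name__ == "__main__":
    a, b = simulate(reversible_reaction(kf=2.0, kr=1.0), [1.0, 0.0], t_end=10.0, dt=0.01)
    print(f"A = {a:.4f}, B = {b:.4f}")  # approaches the equilibrium ratio B/A = kf/kr
```

At equilibrium the forward and reverse fluxes balance, so B/A tends to kf/kr while the total A + B stays constant.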


Text Mining Tool (BioKnowledge Viewer)

• Constructs biological networks from PubMed data using a text-mining tool.

• Users can generate their own biological interaction networks with the on-demand analysis service in NISN

Pipeline components (from the diagram)
• PubMed abstracts
• Sentence selector
• Information element recognizer: MetaMap, Gene/Protein tagger (ABNER)
• Relation extractor with Rule Sets
• Results
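The stages of such a pipeline can be sketched in a few lines. This is a toy stand-in, not the BioKnowledge Viewer implementation: a fixed dictionary replaces MetaMap and ABNER for entity recognition, and a single regular-expression rule per verb replaces the rule sets. The gene names, verbs, and sample text are chosen only for illustration.

```python
import re

# Hypothetical entity dictionary standing in for a real gene/protein tagger.
ENTITIES = {"TP53", "MDM2", "AKT1"}
# Hypothetical relation verbs standing in for curated rule sets.
VERBS = ("activates", "inhibits", "binds")

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_entities(sentence):
    """Return dictionary entities found in the sentence, in order of appearance."""
    return [tok for tok in re.findall(r"[A-Za-z0-9]+", sentence) if tok in ENTITIES]

def extract_relations(text):
    """Select sentences with two or more entities and apply 'X verb Y' rules."""
    edges = []
    for sentence in split_sentences(text):
        if len(set(tag_entities(sentence))) < 2:
            continue  # sentence selector: skip sentences that cannot hold a relation
        for verb in VERBS:
            match = re.search(rf"\b(\w+)\s+{verb}\s+(\w+)\b", sentence)
            if match and match.group(1) in ENTITIES and match.group(2) in ENTITIES:
                edges.append((match.group(1), verb, match.group(2)))
    return edges

if __name__ == "__main__":
    abstract = "MDM2 inhibits TP53 in many tumours. AKT1 is a kinase. AKT1 activates MDM2."
    for source, relation, target in extract_relations(abstract):
        print(source, relation, target)
```

Each extracted triple is one edge of an interaction network, which is the form a network viewer can display directly.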


Data Mining Tool and Service

Whole network navigation


Data Mining Tool and Service

Interactive network navigation


SimFlu

• Simulates the variation patterns of influenza viruses based on genome data from the last 10 years in NCBI GenBank.

• Creates a variation matrix based on codon variation of influenza viruses

• Predicts a future virus genome based on the simulation
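One way to picture a codon variation matrix is as a table of how often each codon was replaced by another between an older and a newer sequence. The sketch below builds such counts from aligned sequence pairs, normalises them into probabilities, and predicts a next sequence by taking the most frequent successor of each codon. It is an assumption about the general idea, not the SimFlu algorithm, and the sequences are toy data rather than GenBank records.

```python
from collections import Counter

def codons(seq):
    """Split a nucleotide sequence into consecutive codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def variation_matrix(pairs):
    """Count (old codon, new codon) occurrences over aligned (old, new) sequences."""
    counts = Counter()
    for old, new in pairs:
        for a, b in zip(codons(old), codons(new)):
            counts[(a, b)] += 1
    return counts

def transition_probabilities(counts):
    """Normalise counts so the probabilities for each old codon sum to 1."""
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}

def predict_next(seq, probs):
    """Replace each codon by its most probable successor; unseen codons are kept."""
    out = []
    for codon in codons(seq):
        candidates = {b: p for (a, b), p in probs.items() if a == codon}
        out.append(max(candidates, key=candidates.get) if candidates else codon)
    return "".join(out)

if __name__ == "__main__":
    history = [("ATGAAA", "ATGAAG"), ("ATGAAA", "ATGAAG"), ("ATGAAA", "ATGAAA")]
    probs = transition_probabilities(variation_matrix(history))
    print(predict_next("ATGAAACCC", probs))
```

A real simulation would likely sample from the probabilities instead of always taking the most frequent change, so that repeated runs explore different plausible genomes.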


International Collaboration

• NBCR: OPAL development and enhancement job

• HongKong Univ.: Alzheimer’s disease research

• UTM / UI / USM: data mining of signal transduction


Collaboration with Domestic group

• KIOM: constructing an anti-inflammation signaling pathway

• KNIH: text mining for aging research

• Korea Ginseng Corp: constructing ginseng-related biological networks

• Korea Cancer Center: microarray analysis for cancer process

• Amore Pacific Corp.: melanogenesis research

• KAIST: constructing an NGS analysis pipeline


Conclusion


Activities in NISN

• NISN has been established and manages HPC resources and networking services

• The Biocomputing team has been developing a cyber-research environment for biologists who are not comfortable working in a UNIX environment.

• Bioworks, insilicoCell, and SimFlu have been developed at NISN and are used as a cyber-research framework and analysis tools.

• We welcome collaboration on building a novel cyber-research environment for the biology field.


Thank you

for your attention