
1

Bioinformation Technology (BIT) and Biointelligence (BI)

[바이오정보기술(BIT)과 바이오지능(Biointelligence)]

A tutorial to be presented at 2001 Spring Conference of Korea Information Science Society (KISS)

Byoung-Tak Zhang

School of Computer Science and Engineering

Seoul National University

E-mail: [email protected]

http://scai.snu.ac.kr/~btzhang/

This material is available at http://scai.snu.ac.kr/~btzhang/

2

Outline

• Introduction
• Bioinformation Technology (BIT) = BT + IT
• Bioinformatics, Biocomputing, Biochips
• Biointelligence = BT + AI
• Concept, Methodology, Technology
• Applied Biointelligence
• Summary
• Further Information


3

Introduction

4

Biotechnology Revolution

[Figure: economical value plotted against year across four ages: Agricultural Age (from 6000 BC), Industrial Age (from AD 1760), Information Age (from 1950), Biotechnology Age (from 2000)]


5

Human Genome Project

Genome Health Implications
• A new disease encyclopedia
• New genetic fingerprints
• New diagnostics
• New treatments

Goals
• Identify the approximately 100,000 genes in human DNA
• Determine the sequences of the 3 billion bases that make up human DNA
• Store this information in databases
• Develop tools for data analysis
• Address the ethical, legal, and social issues that arise from genome research

6

Bioinformation Technology (BIT) = BT + IT

[Figure labels: BT, IT]

• In silico Biology (e.g. Bioinformatics)
• In vivo Informatics (e.g. Biocomputing)


7

Bioinformation Technology
• Bioinformatics
• Biocomputing
• Biochips

8

Bioinformatics


9

What is Bioinformatics?

• Bioinformatics vs. Computational Biology
• Bioinformatik (in German): biology-based computer science as well as bioinformatics (in the English sense)

Informatics: computer science
Bio: molecular biology

Bioinformatics: solving problems arising from biology using methodology from computer science.

10

What is DNA?

AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTCTATTGTACCCGTTGCTTCG

GCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTGCCCCCCGGGCCCGTGCCCGCCGGA GACCCCAACAC

GAACACTGTCTGAAAGCGTGCAGTCTGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGCATGCAATCAGTCCCGTTGCTTCGGCACTGTCTGAAAGCGCCTTTGGGCCCAACCTCCCATCCGTGTCTATTGTACCCG

TTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGCTATTGTACCCGTTGCTTCGGATCTCTTGGGGATCTCTTGGTTCCGGCATGCAATCAGTCCCGTTGCTTCGGC

ACTGTCTGAAAGCGCCTTTGGGCCCAACCTCCCACCGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGG

CGGCCGCCGGGGGCACTGTCTGAAAGCTCGGCCGCC


11

The Structure of DNA

[Figure labels: sugar-phosphate backbone, hydrogen bonds, base]

• RNA consists of A, C, G, and U, where U plays the same role as T
• Watson-Crick complementary pairs:
  • A and T (or A and U)
  • C and G
• Hybridization: when 2 strands of complementary DNA (or one strand of DNA and one strand of complementary RNA) stick together
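The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of the tutorial, and the sketch assumes perfect matches and ignores real hybridization conditions such as temperature and mismatches.

```python
# Illustrative sketch of Watson-Crick pairing (names are hypothetical).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Complementary strand written in its own 5'->3' direction."""
    return complement(strand)[::-1]


def can_hybridize(strand_a: str, strand_b: str) -> bool:
    """True if two 5'->3' strands are perfect Watson-Crick partners."""
    return strand_b.upper() == reverse_complement(strand_a)


if __name__ == "__main__":
    print(complement("AGCATCCA"))
    print(reverse_complement("AGCATCCA"))
    print(can_hybridize("AGCATCCA", "TGGATGCT"))
```

For RNA, the same idea applies with U in place of T in the lookup table.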

12

Molecular Biology: Flow of Information

DNA → RNA → Protein → Function

[Figure: a DNA double helix with its base pairs and the protein chain Phe-Cys-Lys-Cys-Asp-Cys-Arg-Ser-Ala-Leu]


13

DNA (gene) → RNA → Protein

[Figure: gene layout with labels: control statement, TATA (start), gene, termination (stop), control statement, ribosome binding, 5’ UTR, 3’ UTR]

• Transcription (RNA polymerase): gene → mRNA
• Translation (ribosome): mRNA → protein
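The two steps on this slide can be sketched as follows. This is a minimal illustration rather than a bioinformatics tool: it assumes the standard genetic code, reads the coding strand from its first base, and stops at the first stop codon. The function names are hypothetical.

```python
# Minimal sketch: transcription and translation with the standard genetic code.
BASES = "TCAG"
# Amino acids for codons in TCAG order (first, second, third base); '*' marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """Transcription: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translation: read codons left to right until a stop codon is reached."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGTTTTGTAAATAA")
    print(mrna, translate(mrna))
```

The one-letter output uses F, C, and K for the Phe, Cys, and Lys residues that begin the protein chain in the previous slide's figure.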

14

Nucleotide and Protein Sequence

aacctgcgga aggatcattaccgagtgcgg gtcctttgggcccaacctcc catccgtgtctattgtaccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg ggcgcctctgccccccgggc ccgtgcccgccggagacccc aacacgaacactgtctgaaa gcgtgcagtctgagttgatt gaatgcaatcagttaaaact ttcaacaatggatctcttgg ttccggctgc tattgtaccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg ggcgcctctgccccccgggc ccgtgcccgccggagacccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg cggagacccc

gcgggcccgc cgcttgtcggccgccggggg ggcgcctctgccccccgggc ccgtgcccgcaacctgcgga aggatcattaccgagtgcgg gtcctttgggcccaacctcc catccgtgtctattgtaccc tgttgcttcggcgggcccgc cgcttgtcggagttaaaact ttcaacaatggatctcttgg ttccggctgc tattgtaccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg ggcgcctctgccccccgggc ccgtgcccgccggagacccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg cggagacccc gcgggcccgc cgcttgtcggccgccggggg ggcgcctctg

cgcttgtcgg ccgccgggggccccccgggc ccgtgcccgccggagacccc aacacgaacactgtctgaaa gcgtgcagtctgagttgatt gaatgcaatcagttaaaact ttcaacaatggatctcttgg aacctgcggaccgagtgcgg gtcctttgggcccaacctcc catccgtgtctattgtaccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg ggcgcctctgagttaaaact ttcaacaatggatctcttgg ttccggctgc tattgtaccc tgttgcttcggcgggcccgc cgcttgtcggccgccggggg ggcgcctctgccccccgggc ccgtgcccgccggagacccc tgttgcttcg

SQ sequence 1344 BP; 291 A; 374 C; 401 G; 278 T; 0 other

DNA (Nucleotide) Sequence
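A summary line like the "SQ" line above can be computed directly from a sequence. The sketch below is an assumption about how such a line might be produced, not the actual EMBL tooling. It ignores whitespace and counts anything other than A, C, G, T as "other".

```python
from collections import Counter


def base_composition(sequence: str) -> dict:
    """Count A, C, G, T (case-insensitive, whitespace ignored) plus 'other'."""
    cleaned = "".join(sequence.split()).upper()
    counts = Counter(cleaned)
    summary = {base: counts.get(base, 0) for base in "ACGT"}
    summary["other"] = len(cleaned) - sum(summary.values())
    return summary


def format_sq_line(sequence: str) -> str:
    """Format the counts in the style of the SQ line shown on the slide."""
    c = base_composition(sequence)
    total = sum(c.values())
    return (f"SQ sequence {total} BP; {c['A']} A; {c['C']} C; "
            f"{c['G']} G; {c['T']} T; {c['other']} other")


if __name__ == "__main__":
    print(format_sq_line("aacctgcgga aggatcatta"))
```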

CG2B_MARGL Length: 388 April 2, 1997 14:55 Type: P Check: 9613 .. 1

MLNGENVDSR IMGKVATRAS SKGVKSTLGT RGALENISNV ARNNLQAGAK KELVKAKRGM TKSKATSSLQ SVMGLNVEPM EKAKPQSPEP MDMSEINSAL EAFSQNLLEG VEDIDKNDFD NPQLCSEFVN DIYQYMRKLE REFKVRTDYM TIQEITERMR SILIDWLVQV HLRFHLLQET LFLTIQILDR YLEVQPVSKN KLQLVGVTSM LIAAKYEEMY PPEIGDFVYI TDNAYTKAQI RSMECNILRR LDFSLGKPLC IHFLRRNSKA GGVDGQKHTM AKYLMELTLP EYAFVPYDPS EIAAAALCLS SKILEPDMEW GTTLVHYSAY SEDHLMPIVQ KMALVLKNAP TAKFQAVRKK YSSAKFMNVS TISALTSSTV MDLADQMC

Protein (Amino Acid) Sequence


15

Some Facts

• 10^14 cells in the human body.
• 3 × 10^9 letters in the DNA code in every cell in your body.
• DNA differs between humans by 0.2% (1 in 500 bases).
• Human DNA is 98% identical to that of chimpanzees.
• 97% of DNA in the human genome has no known function.

16

EMBL Database Growth

[Figure: total number of EMBL records (millions, axis from 0 to 10) by year, 1982 to 2000]


17

Bioinformatics Is About:

• Elicitation of DNA sequences from genetic material
• Sequence annotation (e.g., with information from experiments)
• Understanding the control of gene expression (i.e., under what circumstances genes are transcribed and their proteins produced)
• The relationship between the amino acid sequence of proteins and their structure

18

Background of Bioinformatics

• Biological information infrastructure
  • Biological information management systems
  • Analysis software tools
  • Communication networks for biological research
• Massive biological databases
  • DNA/RNA sequences
  • Protein sequences
  • Genetic map linkage data
  • Biochemical reactions and pathways
• Need to integrate these resources to model biological reality and exploit the biological knowledge that is being gathered


19

Extension of Bioinformatics Concept

• Genomics
  • Functional genomics
  • Structural genomics
• Proteomics: large-scale analysis of the proteins of an organism
• Pharmacogenomics: developing new drugs that will target a particular disease
• Microarray: DNA chip, protein chip

20

Applications of Bioinformatics

• Drug design
• Identification of genetic risk factors
• Gene therapy
• Genetic modification of food crops and animals
• Biological warfare, crime, etc.
• Personal Medicine?
• E-Doctor?


21

SNP (Single Nucleotide Polymorphism)

Finding single nucleotide changes at specific regions of genes

• Diagnosis of hereditary diseases
• Personal drug
• Finding more effective drugs and treatments

22

Problems in Bioinformatics

Structure analysis
• Protein structure comparison
• Protein structure prediction
• RNA structure modeling

Pathway analysis
• Metabolic pathway
• Regulatory networks

Sequence analysis
• Sequence alignment
• Structure and function prediction
• Gene finding

Expression analysis
• Gene expression analysis
• Gene clustering


23

The Complete Microarray Bioinformatics Solution

[Figure components: Data Management, Databases, Statistical Analysis, Image Processing, Automation, Data Mining, Cluster Analysis]

24

Bioinformatics as Information Technology

[Figure: Bioinformatics at the center, surrounded by the labels Information Retrieval, GenBank, SWISS-PROT, Hardware, Agent, Machine Learning, Algorithm, Supercomputing, Information filtering, Monitoring agent, Clustering, Rule discovery, Pattern recognition, Sequence alignment, Biomedical text analysis, Database]


25

Bioinformatics on the Web

• The experimental process: sample, array, hybridization, scanner
• Data management: relational database, web interface
• Data analysis and interpretation: image analysis results and summaries, links to other information resources, download of data to other applications

26

Biocomputing


27

Biocomputing vs. Bioinformatics

[Figure labels: BT, IT]

Bioinformatics

Biocomputing

28

Traveling Salesman Problem

The traveling salesman problem: as the number of cities grows, even supercomputers have difficulty finding the shortest path.

[Figure: example graph with seven nodes labeled 0 to 6]
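The growth that defeats conventional machines can be seen in a brute-force solver. The sketch below tries every ordering of the cities, so a problem with n cities needs (n-1)! tours. The distance matrix is a made-up four-city example, not data from the tutorial.

```python
from itertools import permutations
from math import factorial


def tour_length(tour, dist):
    """Length of the closed tour that returns to its starting city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tsp(dist):
    """Try every ordering of cities 1..n-1 with city 0 fixed as the start."""
    n = len(dist)
    tours = ([0] + list(rest) for rest in permutations(range(1, n)))
    best = min(tours, key=lambda tour: tour_length(tour, dist))
    return best, tour_length(best, dist)


# Hypothetical symmetric distances between four cities.
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

if __name__ == "__main__":
    print(brute_force_tsp(DIST))
    for n in (5, 10, 15, 20):
        print(n, "cities:", factorial(n - 1), "tours to check")
```

The loop at the end prints how quickly the number of candidate tours grows, which is the motivation for looking at massively parallel alternatives.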


29

Adleman’s Molecular Computer: A Brute Force Method

Each city (vertex) is represented by a different sequence of nucleotides (6 here, but Adleman used 20).

A DNA linker (edge) joins two city (vertex) strands.
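One way to read the linker design, using the example strands shown on the next slide: the edge strand is the complement of the second half of the source vertex followed by the first half of the target vertex, so it can hold the two vertex strands side by side. The sketch below encodes that reading; the function names are hypothetical, and the linker is written in the same left-to-right order as on the slide.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def edge_linker(from_vertex: str, to_vertex: str) -> str:
    """Linker that pairs with the tail of one vertex and the head of the next."""
    half = len(from_vertex) // 2
    return complement(from_vertex[half:] + to_vertex[:half])


if __name__ == "__main__":
    print(edge_linker("AGCTTAGG", "ATGGCATG"))
```

With the slide's Vertex 1 and Vertex 2 strands as input, the function reproduces the slide's Edge 1→2 strand.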

30

Vertex 1: AGCTTAGG
Vertex 2: ATGGCATG
Edge 1→2: ATCCTACC

Step 1: Hybridization

AGCTTAGG ATGGCATGATCC TACC

AGCTTAGGATCCTACC

Step 2 : Ligation

AGCTTAGGATGGCATGGAATCCGATGCATGGCTCGAATCC ACGTACCG

Vertex 1

ATGGCATG

Vertex 4

Step 3 : PCR

32 bp 16 bp

Step 4 : Gel Electrophoresis

AGCTTAGGATGGCATGGAATCCGA…TCGAATCC

Bead for vertex 1

Step 5 : Magnetic Bead Affinity Separation
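The five laboratory steps can be mimicked in software as "generate everything, then filter". The sketch below is a sequential stand-in for what the DNA does in parallel. The graph is a hypothetical seven-vertex example chosen for illustration, not Adleman's original graph, and walks are enumerated exhaustively rather than formed at random.

```python
# Hypothetical directed graph: vertex -> list of vertices it has an edge to.
EDGES = {0: [1, 3, 6], 1: [2, 3], 2: [1, 3], 3: [2, 4], 4: [1, 5], 5: [2, 6], 6: []}


def all_walks(edges, max_vertices):
    """Steps 1-2 (hybridization, ligation): every walk of up to max_vertices vertices."""
    frontier = [[v] for v in edges]
    walks = list(frontier)
    for _ in range(max_vertices - 1):
        frontier = [w + [nxt] for w in frontier for nxt in edges[w[-1]]]
        walks.extend(frontier)
    return walks


def hamiltonian_paths(edges, start, end):
    n = len(edges)
    pool = all_walks(edges, n)
    # Step 3 (PCR): keep walks that begin at `start` and finish at `end`.
    pool = [w for w in pool if w[0] == start and w[-1] == end]
    # Step 4 (gel electrophoresis): keep walks with exactly n vertices.
    pool = [w for w in pool if len(w) == n]
    # Step 5 (affinity separation): keep walks that contain every vertex.
    return [w for w in pool if set(w) == set(edges)]


if __name__ == "__main__":
    print(hamiltonian_paths(EDGES, start=0, end=6))
```

On a conventional computer the pool grows exponentially with the number of vertices. In the test tube, all candidate strands form at once.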


31

Molecular Operators for DNA Computing

• Hybridization: complementary pairing of two single-stranded polynucleotides

5’-AGCATCCA-3’ + 3’-TCGTAGGT-5’ → 5’-AGCATCCA-3’
                                  3’-TCGTAGGT-5’

• Ligation: attaching sticky ends to a blunt-ended molecule

TGACTACGACTG

ATGCATGCTACG

+ ATGCATGCTGACTACGTACGTGAC

sticky end

32

DNA finds a solution!

A Hamiltonian path with all vertices included is isolated and recovered


33

Why DNA Computing?

• 6.022 × 10^23 molecules/mole
• Immense, brute-force search of all possibilities
  • Desktop: 10^9 operations/sec
  • Supercomputer: 10^12 operations/sec
  • 1 µmol of DNA: 10^26 reactions
• Favorable energetics: Gibbs free energy, ΔG = −8 kcal mol^-1
  • 1 J for 2 × 10^19 operations
• Storage capacity: 1 bit per cubic nanometer
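The energy figure can be checked with a short calculation. The sketch below assumes that one "operation" is a single reaction event releasing 8 kcal per mole of reactions, and uses the usual conversion of 4184 J per kcal. Both are assumptions made for this illustration.

```python
AVOGADRO = 6.022e23           # molecules per mole (from the slide)
JOULES_PER_KCAL = 4184.0      # thermochemical conversion factor (assumed)
DELTA_G_KCAL_PER_MOL = 8.0    # magnitude of the free energy change (from the slide)

# Energy released by one reaction event, in joules.
joules_per_operation = DELTA_G_KCAL_PER_MOL * JOULES_PER_KCAL / AVOGADRO

# Number of such events that one joule can pay for.
operations_per_joule = 1.0 / joules_per_operation

# Number of molecules in one micromole of DNA.
molecules_per_micromole = AVOGADRO * 1e-6

if __name__ == "__main__":
    print(f"{joules_per_operation:.2e} J per operation")
    print(f"{operations_per_joule:.2e} operations per joule")
    print(f"{molecules_per_micromole:.3e} molecules in 1 micromole")
```

Under these assumptions the result is on the order of 10^19 operations per joule, consistent with the slide's figure of 2 × 10^19.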

34

DNA Computers vs. Conventional Computers

DNA-based computers | Microchip-based computers
slow at individual operations | fast at individual operations
can do billions of operations simultaneously | can do substantially fewer operations simultaneously
can provide huge memory in small space | smaller memory
setting up a problem may involve considerable preparations | setting up only requires keyboard input
DNA is sensitive to chemical deterioration | electronic data are vulnerable but can be backed up easily


35

Research Groups

• MIT, Caltech, Princeton University, Bell Labs
• EMCC (European Molecular Computing Consortium) is composed of national groups from 11 European countries
• BioMIP Institute (BioMolecular Information Processing) at the German National Research Center for Information Technology (GMD)
• Molecular Computer Project (MCP) in Japan
• Leiden Center for Natural Computation (LCNC)

36

Applications of Biomolecular Computing

• Massively parallel problem solving
• Combinatorial optimization
• Molecular nano-memory with fast associative search
• AI problem solving
• Medical diagnosis
• Cryptography
• Drug discovery
• Further impact in biology and medicine:
  • Wet biological databases
  • Processing of DNA labeled with digital data
  • Sequence comparison
  • Fingerprinting


37

Biochips

38

DNA Chip


39

DNA Chip Technology

40

Classification of DNA Chip Technology

Photolithography

Inkjetting

Mechanical micro-spotting


41

How DNA Chips Are Made

42

Photolithography Chip

Light-directed Oligonucleotide Synthesis


43

Microarray Robot

44

DNA Chip Applications

• Gene discovery: gene/mutated gene
  • Growth, behavior, homeostasis …
• Disease diagnosis
• Drug discovery: Pharmacogenomics
• Toxicological research: Toxicogenomics


45

Protein Chips

• A new paradigm in protein molecular mapping strategies

46

Bioelectronic Devices

Au Coated Glass

Bio-Memory Device

Au

Cyt c

GFP

Glass

Electron Sensitizer

Electron Acceptor

Patterned Bio-Film


47

History of Lab-on-a-Chip

48

Integrates sample handling, separation and detection and data analysis for: DNA, RNA and protein solutions using LabChip technology.

Lab-on-a-chip Technology


49

Biointelligence

• Concept and History
• Methodology
• Technology
• Applications

50

Concept and History


51

Biointelligence (BI)

• Study of artificial intelligence based on biotechnology
• Biointelligence as a new technology
  • Solving AI problems using biotechnology (BT) or BIT
  • Using BT to solve AI problems
• Biointelligence as a new application
  • Using AI techniques to solve BT problems
• Biointelligence as a new research field
  • Biochemistry = Biology + Chemistry
  • Bioinformatics = Biology + Informatics
  • Biointelligence (BI) = Biology (BT) + Intelligence (AI)

52

Relationships to Existing Research Areas

[Figure: Biointelligence (BI) positioned among Information Technology (IT), AI, Bioinformation Technology (BIT), and Biotechnology (BT)]


53

Related Research Fields

[Figure: Biointelligence in relation to Artificial Intelligence, Bioinformatics, Biocomputing, Biochips, and Bioinformation Technology]

54

Biological AI: History

Symbolic AI

• 1943: Production rules
• 1956: “Artificial Intelligence”
• 1958: LISP AI language
• 1965: Resolution theorem proving
• 1970: PROLOG language
• 1971: STRIPS planner
• 1973: MYCIN expert system
• 1982-92: Fifth generation computer systems project
• 1986: Society of mind
• 1994: Intelligent agents

Biological AI

• 1943: McCulloch-Pitts neurons
• 1959: Perceptron
• 1965: Cybernetics
• 1966: Simulated evolution
• 1966: Self-reproducing automata
• 1975: Genetic algorithm
• 1982: Neural networks
• 1986: Connectionism
• 1987: Artificial life
• 1992: Genetic programming
• 1994: DNA computing


55

Paradigm Shift in AI Research

• Symbolic → Subsymbolic
• Knowledge-based → Learning-based
• Deduction → Induction
• Model-driven → Data-driven
• Top-down → Bottom-up
• High-level → Low-level
• Reflective → Reflexive
• Individual → Collective
• Deep-thought → Reactive behavior
• Syntactic → Semantic
• Discrete → Continuous
• Deterministic → Stochastic
• Logic → Probabilistic

56

Computers and Biosystems

(Moravec, 1988)


57

Biointelligence Methodology

58

Four Levels of Biointelligence

Molecular Intelligence

Cellular Intelligence

Organismic Intelligence

Ecological Intelligence

<= Focus of classical AI


59

Comparison of Biointelligence Technologies

Level | Molecular Intelligence | Cellular Intelligence | Organismic Intelligence | Ecological Intelligence
Basic unit | molecules | cells | organism | population
Biology | molecular biology | cell biology | neurobiology | ecology
Phenomenon | self-assembly | development | learning | evolution
Time (typical) | seconds | days | months | years
Communication | lock-key mechanism | electrochemical signals | neurotransmitters | audiovisual, symbolic
Basic operation | ligation, hybridization | cell division, differentiation | excitation, inhibition | cooperation, competition
Computational models | DNA/molecular computing | cell-automata, immune nets | neural nets, semantic nets | evolutionary algorithms
Chips | lab-on-a-chip, DNA chips, protein chips | embryonic chips | neurochips | evolvable hardware

60

Biomolecular Information Processing

DNA Sequence

mRNA Sequence

Protein Sequence

Folded Protein

Transcription

Translation

Folding


61

Features

• Stochastic (vs. deterministic)
• Massively parallel (vs. sequential)
• Self-assembly (vs. programming)
• Liquid rather than solid-state
• Biochemical (vs. electronic)
• Biomolecule-based (vs. silicon-based)

62

Principles and Theoretical Tools for Biointelligence Research

• Self-Assembly
• Self-Reproduction
• Uncertainty Principle
• Occam’s Razor Principle

• Information Theory
• Probability Theory
• Thermodynamics
• Statistical Physics


63

Biology-Based AI Models: Existing Examples

• Evolutionary Computation: computational method simulating natural selection
• DNA Computing: information processing based on biomolecules
• Neural Networks: computation model imitating brain structure

64

Neural Computation: The Brain as Computer

Brain
1. 10^11 neurons with 10^14 synapses
2. Speed: 10^-3 sec
3. Distributed processing
4. Nonlinear processing
5. Parallel processing

Computer
1. A single processor with complex circuits
2. Speed: 10^-9 sec
3. Central processing
4. Arithmetic operation (linearity)
5. Sequential processing


65

From Biological Neurons to Artificial Neurons

66

“Owing to this struggle for life, any variation, however slight and from whatever cause proceeding, if it be in any degree profitable to an individual of any species, in its infinitely complex relations to other organic beings and to external nature, will tend to the preservation of that individual, and will generally be inherited by its offspring.”

Charles Darwin, Origin of Species (1859)

Evolutionary Computation: Nature as Computer


67

Variation and Selection: The Principle

[Figure: the variation-and-selection cycle. Solutions are encoded as chromosomes (bit strings); crossover and mutation create new strings (for example, 110010|1110 and 101110|1010 cross over to give 101110|1110 and 110010|1010); the new strings are decoded into solutions for evaluation and fitness computation; roulette wheel selection then forms the new population.]
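The cycle of encoding, variation, evaluation, and selection can be sketched as a small genetic algorithm. Everything specific here is an assumption made for illustration: the fitness function simply counts 1-bits, the selection weights are offset by one so that no weight is zero, and the mutation rate is arbitrary.

```python
import random


def fitness(chromosome: str) -> int:
    """Hypothetical fitness: the number of 1-bits in the chromosome."""
    return chromosome.count("1")


def roulette_select(population, rng):
    """Roulette wheel selection: sample with probability proportional to fitness.

    The +1 offset is an assumption that keeps every weight positive.
    """
    weights = [fitness(c) + 1 for c in population]
    return rng.choices(population, weights=weights, k=len(population))


def one_point_crossover(a: str, b: str, point: int):
    """Swap the tails of two chromosomes after the given position."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome: str, rate: float, rng) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )


def next_generation(population, rng, mutation_rate=0.01):
    parents = roulette_select(population, rng)
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        point = rng.randrange(1, len(a))
        children.extend(one_point_crossover(a, b, point))
    if len(parents) % 2:  # odd population size: carry the last parent over
        children.append(parents[-1])
    return [mutate(c, mutation_rate, rng) for c in children]


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["1100101010", "1011101110", "0011011001", "1100110001"]
    for _ in range(5):
        population = next_generation(population, rng)
    print(population)
```

Calling `one_point_crossover("1100101110", "1011101010", 6)` reproduces the crossover example from the slide.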

68

DNA Computing: BioMolecules as Computer

011001101010001 ATGCTCGAAGCT


69

HPP

Flow of DNA Computing

[Figure: solving the Hamiltonian path problem (HPP) on a graph with nodes 0 to 6]

Node codes:
Node 0: ACG
Node 1: CGA
Node 2: GCA
Node 3: TAA
Node 4: ATG
Node 5: TGC
Node 6: CGT

Flow: Encoding → Ligation → PCR (Polymerase Chain Reaction) → Gel Electrophoresis → Affinity Column → Decoding

Example strands in the pool: ATGTGCTAACGAACG, TAAACG, CGACGT, ACGCGT, ACGCGAGCATAAATGTGCACGCGT, ACGCGAGCATAAATGCGATGCACGCGT, ...

Solution: ACGCGAGCATAAATGTGCCGT (nodes 0 → 1 → 2 → 3 → 4 → 5 → 6)
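The decoding step at the end of the flow can be sketched as a lookup from three-letter codes back to node numbers. The node codes are the ones listed on this slide. The helper names are hypothetical, and the sketch assumes a path strand is simply the node codes written one after another.

```python
# Node codes as listed on the slide.
NODE_CODES = {0: "ACG", 1: "CGA", 2: "GCA", 3: "TAA", 4: "ATG", 5: "TGC", 6: "CGT"}
CODE_TO_NODE = {code: node for node, code in NODE_CODES.items()}


def encode_path(nodes):
    """Encoding: concatenate the codes of the visited nodes."""
    return "".join(NODE_CODES[n] for n in nodes)


def decode_strand(strand: str):
    """Decoding: split the strand into 3-mers and map each back to its node."""
    if len(strand) % 3:
        raise ValueError("strand length must be a multiple of 3")
    return [CODE_TO_NODE[strand[i:i + 3]] for i in range(0, len(strand), 3)]


def is_hamiltonian(nodes) -> bool:
    """True if every node appears exactly once (edge validity is not checked)."""
    return sorted(nodes) == sorted(NODE_CODES)


if __name__ == "__main__":
    path = decode_strand("ACGCGAGCATAAATGTGCCGT")
    print(path, is_hamiltonian(path))
```

A longer strand from the same slide, ACGCGAGCATAAATGTGCACGCGT, decodes to a walk that visits node 0 twice, so it fails the check.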

70

Biointelligence Technology


71

Biointelligence on a Chip?

[Figure: the Biointelligence Chip at the meeting point of Information Technology and Biotechnology, with the labels Biological Computer, Molecular Electronics, and Bioinformation Technology]

• Computing models: the limit of conventional computing models
• Computing devices: the limit of silicon semiconductor technology

72

Intelligent Biomolecular Information Processing

[Figure labels: Bio-Memory (GFP, Cytochrome c), Biocomputing, Theoretical Models, Bio-Processor (Input A, Input A, Controller, Reaction Chamber (Calculating), Output)]


73

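Molecular Computer Model

Bio-diode device
• Single-electron device
• Bio-transistor construction
• Bio-memory

Bio-logic gate device
• Single-electron device
• Serial processor
• THz-class processing speed

One-chip application

Molecular computing device (CPU)
• Parallel processor
• THz-class processing speed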

74

Evolvable Biomolecular Hardware

• Sequence-programmable and evolvable molecular systems have been constructed as cell-free chemical systems using biomolecules such as DNA and proteins.


75

Molecular Storage for Massively Parallel Information Retrieval

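Trillions of DNA

Phone book

Name | Phone number | Address
Won Bin | 648-7921 | 246-2 Acheon-dong, Guri-si, Gyeonggi-do
Song Seung-heon | 352-4730 | 23-1 Juan 5-dong, Nam-gu, Incheon
Hong Gil-dong | 419-1332 | 211 Jamsilbon-dong, Songpa-gu, Seoul
Song Hye-kyo | 418-9362 | 11 Singil 2-dong, Yeongdeungpo-gu, Seoul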

76

The ‘Knight Problem’

• Given an n × n chess board, which positions can knights occupy such that no knight can attack another knight?
• An example of SAT
• NP-complete for infinite boards
• Example: 3 × 3 board
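For a board as small as 3 × 3, the problem can be checked exhaustively on a conventional computer. The sketch below enumerates all 2^9 ways of placing knights and keeps those in which no two knights attack each other. It counts the empty board as a valid placement, which is a convention chosen here, so its total is a property of this enumeration and is not meant to reproduce the number reported for the molecular experiment.

```python
from itertools import combinations


def knight_attacks(a, b) -> bool:
    """True if a knight on square a attacks square b (squares are (row, col))."""
    d_row, d_col = abs(a[0] - b[0]), abs(a[1] - b[1])
    return {d_row, d_col} == {1, 2}


def peaceful_placements(n: int):
    """All subsets of an n x n board in which no two knights attack each other."""
    squares = [(r, c) for r in range(n) for c in range(n)]
    solutions = []
    for mask in range(1 << len(squares)):
        occupied = [sq for i, sq in enumerate(squares) if (mask >> i) & 1]
        if not any(knight_attacks(a, b) for a, b in combinations(occupied, 2)):
            solutions.append(occupied)
    return solutions


if __name__ == "__main__":
    print(len(peaceful_placements(3)), "valid placements on a 3 x 3 board")
```

Each square corresponds to one Boolean variable (knight present or absent), and each attacking pair of squares to a clause forbidding both at once, which is how the problem becomes an instance of SAT.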


77

Three Solutions to the ‘Knight Problem’

• Problem solved: 3 of the 31 solutions to the knight conundrum found by the RNA-based machine

78

Solving Logic Problems by Molecular Computing

• Satisfiability Problem
  • Find Boolean values for variables that make the given formula true
• 3-SAT Problem
  • Every NP problem can be seen as the search for a solution that simultaneously satisfies a number of logical clauses, each composed of three variables.

[Formulas: example instances written as conjunctions of clauses over variables x1 to x6, of the form (x1 or x2 or x3) AND (x4 or x5 or x6) AND ...; the negation marks and the exact clause contents are not recoverable from the extracted text]


79

DNA Chips for DNA Computing

I. Make: oligomer synthesis

II. Attach (immobilize): 5’HS-C6-T15-CCTTvvvvvvvvTTCG-3’

III. Mark: hybridization

IV. Destroy: enzyme reaction (e.g., EcoRI)

V. Unmark
* Removes all strands that do not satisfy the problem

VI. Readout: if a solution remains at the end of the N cycles, amplify it by PCR and confirm it

80

Variable Sequences and the Encoding Scheme


81

Three-dimensional Plot and Histogram of the Fluorescence

• S3: w=0, x=0, y=1, z=1
• S7: w=0, x=1, y=1, z=1
• S8: w=1, x=0, y=0, z=0
• S9: w=1, x=0, y=0, z=1

• y=1: satisfies (w ∨ x ∨ y)
• z=1: satisfies (w ∨ ¬y ∨ z)
• x=0 or y=1: satisfies (¬x ∨ y)
• w=0: satisfies (¬w ∨ ¬y)

• Four spots with high fluorescence intensity correspond to the four expected solutions.
• DNA sequences are identified in the readout step via addressed array hybridization.
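The mark, destroy, and unmark cycle can be imitated in software, with one cycle per clause. The sketch below is a simulation under stated assumptions. The four clauses are a reconstruction: the negation marks are missing from the extracted slide text, and the signs used here were chosen because they agree with the slide's satisfaction notes and yield exactly the four listed solutions. All names are hypothetical.

```python
from itertools import product

VARIABLES = ("w", "x", "y", "z")

# Each clause is a list of (variable, required value) literals.
# Reconstructed formula (an assumption):
# (w or x or y) and (w or not y or z) and (not x or y) and (not w or not y)
CLAUSES = [
    [("w", 1), ("x", 1), ("y", 1)],
    [("w", 1), ("y", 0), ("z", 1)],
    [("x", 0), ("y", 1)],
    [("w", 0), ("y", 0)],
]


def surface_sat(variables, clauses):
    """Simulate one mark/destroy/unmark cycle per clause on all candidate strands."""
    # Make + Attach: one strand per possible assignment.
    strands = [dict(zip(variables, bits))
               for bits in product((0, 1), repeat=len(variables))]
    for clause in clauses:
        # Mark: strands that satisfy at least one literal of this clause.
        marked = [s for s in strands if any(s[var] == val for var, val in clause)]
        # Destroy: unmarked strands are removed; Unmark: survivors start the next cycle clean.
        strands = marked
    # Readout: whatever survives every cycle satisfies every clause.
    return strands


if __name__ == "__main__":
    for strand in surface_sat(VARIABLES, CLAUSES):
        print(strand)
```

With these clauses, the surviving assignments are the ones labeled S3, S7, S8, and S9 on the slide.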

82

Applied Biointelligence

Bio-based AI Methods for Solving Bio-problems


83

Spillover of Biointelligence

Understanding information flow in biological construction

Healthcare, Drugs, Foods

Analysis, modeling and management tools

84

Multilayer Perceptrons for Gene Finding and Prediction

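[Figure: multilayer perceptron for exon prediction. Inputs: coding potential value, GC composition, length, donor, acceptor, intron vocabulary. Output: a discrete exon score between 0 and 1, plotted as score against bases along the sequence.]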


85

Self-Organizing Maps for DNA Microarray Data Analysis

[Figure: self-organizing map. The input is linked through a bundle of synaptic connections to a two-dimensional array of postsynaptic neurons, in which the winning neurons are highlighted.]

86

Biological Information Extraction

[Figure labels: Text Data; Information Extraction (Data Analysis & Field Identify, Data Classify & Field Extraction, Field Property Identify & Learning); Database Template Filling; DB Record (Location, Date); DB]


87

Medical Biointelligence

[Table with columns "Goal" and "Key aspects addressed"; entries: Automation of genome expression analysis, Integration of molecular data, Inference and modeling systems, Molecular classification of cancer, Diagnosis systems, Organism modeling, Drug design]

88

E-Doctor

Diagnosis Expert System

Self-diagnosis

Pharmacy

Hospital

Personal Medicine


89

Biorobotics

• Robot = Mechanical + Electronic (+ Biological)
• Biorobot = Biological + (Mechanical + Electronic)
• Biological Robots with Biointelligence
  • Self-reproduction
  • Evolution
  • Learning

90

Conclusions

• IT is growing in importance for the advancement of BT (e.g., bioinformatics).
• IT can benefit much from BT (e.g., biocomputing and biochips).
• Bioinformation technology (BIT) is essential as a next-generation information technology.
• From the AI point of view, biosystems are existence proofs of intelligent systems.
• Biointelligence, defined as the study of artificial intelligence based on biotechnology, is a new technology and application area at the intersection of BT and IT.
• Biological AI technologies can provide a shortcut for building AI machines.


91

“The interface between biological systems and computational systems will become blurred, allowing powerful computational control of biological systems and implantation of computer interfaces into the human brain. Biology will become the dominant metaphor for computer science, providing a framework for understanding and constructing complex computations.”

- Mark Gerstein

92

Further Information


93

Journals & Conferences

• Journals
  • Biological Cybernetics (Springer)
  • BioSystems (Elsevier)
  • Artificial Intelligence in Medicine
  • Bioinformatics (Oxford University Press)
  • Computer Applications in the Biosciences (Oxford University Press)
  • Computers in Biology and Medicine (Elsevier)
  • IEEE Transactions on Biomedical Engineering
  • IEEE Transactions on Evolutionary Computation

• Conferences
  • International Conference on Intelligent Systems for Molecular Biology (ISMB)
  • Pacific Symposium on Biocomputing (PSB)
  • International Conference on Computational Molecular Biology (RECOMB)
  • IBC’s Annual Conference on Biochip Technologies
  • International Meeting on DNA Based Computers
  • IEEE Bioinformatics and Bioengineering Symposium (BIBE)
  • International Symposium on Medical Data Analysis (ISMDA)

94

Web Resources: Bioinformatics

• ANGIS - The Australian National Genomic Information Service: http://morgan.angis.su.oz.au/
• Australian National University (ANU) Bioinformatics: http://life.anu.edu.au/
• BioMolecular Engineering Research Center (BMERC): http://bmerc-www.bu.edu/
• Brutlag bioinformatics group: http://motif.stanford.edu/
• Columbia University Bioinformatics Center (CUBIC): http://cubic.bioc.columbia.edu/
• European Bioinformatics Institute (EBI): http://www.ebi.ac.uk/
• European Molecular Biology Laboratory (EMBL): http://www.embl-heidelberg.de/
• Genetic Information Research Institute: http://www.girinst.org/
• GMD-SCAI: http://www.gmd.de/SCAI/scai_home.html
• Harvard Biological Laboratories: http://golgi.harvard.edu/
• Laurence H. Baker Center for Bioinformatics and Biological Statistics: http://www.bioinformatics.iastate.edu/
• NASA Center for Bioinformatics: http://biocomp.arc.nasa.gov/
• NCSA Computational Biology: http://www.ncsa.uiuc.edu/Apps/CB/
• Stockholm Bioinformatics Center: http://www.sbc.su.se/
• USC Computational Biology: http://www-hto.usc.edu/
• W. M. Keck Center for Computational Biology: http://www-bioc.rice.edu/


95

Web Resources: Biocomputing

• European Molecular Computing Consortium (EMCC): http://www.csc.liv.ac.uk/~emcc/
• BioMolecular Information Processing (BioMip): http://www.gmd.de/BIOMIP
• Leiden Center for Natural Computation (LCNC): http://www.wi.leidenuniv.nl/~lcnc/
• Biomolecular Computation (BMC): http://bmc.cs.duke.edu/
• DNA Computing and Informatics at Surfaces: http://www.corninfo.chem.wisc.edu/writings/DNAcomputing.html
• SNU Molecular Evolutionary Computing (MEC) Project: http://scai.snu.ac.kr/Research/

96

Web Resources: Biochips

• DNA Microarray (Genome Chip): http://www.gene-chips.com/
• Large-Scale Gene Expression and Microarray Links and Resources: http://industry.ebi.ac.uk/~alan/MicroArray/
• The Microarray Centre at The Ontario Cancer Institute: http://www.oci.utoronto.ca/services/microarray/
• Lab-on-a-Chip resources: http://www.lab-on-a-chip.com/
• Mailing List: [email protected]


97

Books: Bioinformatics

• Cynthia Gibas and Per Jambeck, Developing Bioinformatics Computer Skills, O’Reilly, 2001.
• Peter Clote and Rolf Backofen, Computational Molecular Biology: An Introduction, John Wiley & Sons, Inc., 2000.
• Arun Jagota, Data Analysis and Classification for Bioinformatics, 2000.
• Hooman H. Rashidi and Lukas K. Buehler, Bioinformatics Basics: Applications in Biological Science and Medicine, 1999.
• Pierre Baldi and Soren Brunak, Bioinformatics: The Machine Learning Approach, MIT Press, 1998.
• Andreas Baxevanis and B. F. Francis Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, John Wiley & Sons, Inc., 1998.

98

Books: Biocomputing

• Cristian S. Calude and Gheorghe Păun, Computing with Cells and Atoms: An Introduction to Quantum, DNA and Membrane Computing, Taylor & Francis, 2001.
• Gheorghe Păun, Ed., Computing With Bio-Molecules: Theory and Experiments, Springer, 1999.
• Gheorghe Păun, Grzegorz Rozenberg and Arto Salomaa, DNA Computing: New Computing Paradigms, Springer, 1998.
• C. S. Calude, J. Casti and M. J. Dinneen, Unconventional Models of Computation, Springer, 1998.
• Tono Gramss, Stefan Bornholdt, Michael Gross, Melanie Mitchell and Thomas Pellizzari, Non-Standard Computation: Molecular Computation-Cellular Automata-Evolutionary Algorithms-Quantum Computers, Wiley-VCH, 1997.


99

For more information:

http://scai.snu.ac.kr/