DNA computing-based Implementation of Decision tree

Page 1: DNA computing-based Implementation of  Decision tree

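DNA computing-based Implementation of Decision Tree

Advanced AI

School of Computer Science and Engineering: Yeni Lim (임예니)
Interdisciplinary Program in Cognitive Science: Eunseok Lee (이은석)
Interdisciplinary Program in Bioinformatics: Sungbum Cho (조성범)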

Page 2: DNA computing-based Implementation of  Decision tree

Decision Tree using DNA computing

• Input strand organization: for each attribute, the instance value is coupled with the class label.

• After hybridization, the length of a strand corresponds to the number of instances.

            Gene 1   class   Gene 2   class
Patient 1     0        1       0        0
Patient 2     1        0       0        0
Patient 3     1        0       0        0
Patient 4     1        1       0        1

Page 3: DNA computing-based Implementation of  Decision tree

Patient   Gene A   Class   Gene B   Class   Gene C   Class
1           0        0       0        0       1        0
2           0        1       0        1       0        1
3           1        1       1        0       1        1
4           0        1       0        1       1        1
5           0        1       0        1       0        1
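The coupling of each attribute value with its class label can be illustrated by counting how often each (value, class) pair occurs per gene. The following is a minimal sketch using the five-patient table above; the helper name `pair_counts` and the row layout are assumptions made for illustration, not part of the slides.

```python
from collections import Counter

# One row per patient, copied from the table:
# (Gene A, Class, Gene B, Class, Gene C, Class)
rows = [
    (0, 0, 0, 0, 1, 0),
    (0, 1, 0, 1, 0, 1),
    (1, 1, 1, 0, 1, 1),
    (0, 1, 0, 1, 1, 1),
    (0, 1, 0, 1, 0, 1),
]
genes = ["Gene A", "Gene B", "Gene C"]


def pair_counts(rows, genes):
    """Count (value, class) pairs per gene (hypothetical helper)."""
    counts = {g: Counter() for g in genes}
    for row in rows:
        for i, g in enumerate(genes):
            # Each gene occupies two adjacent columns: its value, then the class.
            value, label = row[2 * i], row[2 * i + 1]
            counts[g][(value, label)] += 1
    return counts


counts = pair_counts(rows, genes)
for g in genes:
    print(g, dict(counts[g]))
```

Each counter plays the role of one column in the later count tables, where the rows are the pair types (0,0), (0,1), (1,0) and (1,1).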

Page 4: DNA computing-based Implementation of  Decision tree

[Figure: input strand unit labeled with Cy5. A sequence encoding one of {(00),(01),(10),(11)}, flanked by a 5’ sticky end and a 3’ sticky end]

Page 5: DNA computing-based Implementation of  Decision tree

[Figure: the four pair types (0,0), (0,1), (1,0), (1,1)]

Page 6: DNA computing-based Implementation of  Decision tree

Calculation of Information Gain

• Information Gain(S,A) ≡ Entropy(S) - ∑(|Sv|/|S|)*Entropy(Sv)

• In gene expression data, all attribute values are encoded in binary mode, so v takes only the values 0 and 1:

∑(|Sv|/|S|)*Entropy(Sv) = (|S0|/|S|)*Entropy(S0) + (|S1|/|S|)*Entropy(S1)
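As a minimal sketch of the definition above for a binary attribute, the following computes the entropy and the information gain directly from value and class lists. The function names are illustrative, and the sample data is the Gene A column with its class column from the five-patient table earlier in the deck.

```python
import math


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def information_gain(values, labels):
    """Entropy(S) minus the size-weighted entropies of S0 and S1."""
    total = len(labels)
    remainder = 0.0
    for v in (0, 1):  # binary-encoded attribute
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


gene_a = [0, 0, 1, 0, 0]   # attribute values for patients 1 to 5
classes = [0, 1, 1, 1, 1]  # class labels for patients 1 to 5
gain = information_gain(gene_a, classes)
print(round(gain, 4))
```

Because Entropy(S) is the same for every candidate attribute at a node, ranking attributes by gain is equivalent to ranking them by the weighted sum alone.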

Page 7: DNA computing-based Implementation of  Decision tree
Page 8: DNA computing-based Implementation of  Decision tree

(|S0|/|S|)*Entropy(S0) ≈ (|S0|/|S|)*(n1/|S0|) = n1/|S|

(|S1|/|S|)*Entropy(S1) ≈ (|S1|/|S|)*(n2/|S1|) = n2/|S|

Page 9: DNA computing-based Implementation of  Decision tree

∑(|Sv|/|S|)*Entropy(Sv)

= (|S0|/|S|)*Entropy(S0) + (|S1|/|S|)*Entropy(S1)

≈ (|S0|/|S|)*(n1/|S0|) + (|S1|/|S|)*(n2/|S1|)

= n1/|S| + n2/|S| = (n1+n2)/|S| ∝ n1+n2, since |S| is the same for every candidate attribute
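The slides do not define n1 and n2. One plausible reading, which is an assumption here, is that n1 and n2 are the minority-class counts within S0 and S1. Under that assumption, the sketch below scores each gene by n1 + n2 using the (value, class) pair counts from the five-gene count table in the deck; the attribute with the smallest score would be chosen as the split.

```python
# (value, class) pair counts per gene, copied from the five-gene count table.
counts = {
    "36822": {(0, 0): 13, (0, 1): 49, (1, 0): 47, (1, 1): 11},
    "1894":  {(0, 0): 39, (0, 1): 6,  (1, 0): 21, (1, 1): 54},
    "34836": {(0, 0): 24, (0, 1): 55, (1, 0): 36, (1, 1): 5},
    "1915":  {(0, 0): 46, (0, 1): 14, (1, 0): 14, (1, 1): 46},
    "38982": {(0, 0): 54, (0, 1): 24, (1, 0): 6,  (1, 1): 36},
}


def proxy_score(c):
    """n1 + n2, ASSUMING n1 and n2 are the minority-class counts in S0 and S1."""
    n1 = min(c[(0, 0)], c[(0, 1)])  # minority count among instances with value 0
    n2 = min(c[(1, 0)], c[(1, 1)])  # minority count among instances with value 1
    return n1 + n2


scores = {gene: proxy_score(c) for gene, c in counts.items()}
best = min(scores, key=scores.get)
print(scores)
print("lowest n1 + n2:", best)
```

Under this reading the lowest score falls on 36822, which is the attribute that the rules later in the deck test first.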

Page 10: DNA computing-based Implementation of  Decision tree

         36822    1894   34836    1915   38982
(0,0)       13      39      24      46      54
(0,1)       49       6      55      14      24
(1,0)       47      21      36      14       6
(1,1)       11      54       5      46      36

Page 11: DNA computing-based Implementation of  Decision tree

36822=0

          1894   34836    1915   38982
(0,0)       11       7      13      13
(0,1)        6      46      13      21
(1,0)        2       6       0       0
(1,1)       43       3      36      28

Page 12: DNA computing-based Implementation of  Decision tree

1894=0

         34836    1915   38982
(0,0)        5      11      11
(0,1)        6       1       1
(1,0)        0       0       1
(1,1)        6       5       4

Page 13: DNA computing-based Implementation of  Decision tree

DNA computing vs. Digital computing

• Rules from DNA computing

36822=0
  - 1894=0
      - 1915=0: 0
      - 1915=1: 1
  - 1894=1: 1

• Identical to the rules from the conventional decision tree algorithm
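To make the rule set concrete, the sketch below encodes it as a nested dictionary and walks it to classify a sample. The nesting follows the order in which the attributes are split in the preceding count tables (36822, then 1894, then 1915). The encoding and the names `TREE` and `classify` are illustrative assumptions. The slide lists no rule for 36822=1, so that branch is left undefined.

```python
# Hypothetical encoding: {gene: {value: subtree or class label}}.
# The 36822=1 branch is not given on the slide and is deliberately omitted.
TREE = {"36822": {0: {"1894": {0: {"1915": {0: 0, 1: 1}}, 1: 1}}}}


def classify(sample, node=TREE):
    """Follow the rules; return the class label, or None if no rule applies."""
    while isinstance(node, dict):
        gene, branches = next(iter(node.items()))
        node = branches.get(sample[gene])  # None when the branch is missing
    return node


print(classify({"36822": 0, "1894": 0, "1915": 0}))
print(classify({"36822": 0, "1894": 0, "1915": 1}))
print(classify({"36822": 0, "1894": 1, "1915": 0}))
print(classify({"36822": 1, "1894": 0, "1915": 0}))
```

The last call returns `None` because the slide gives no rule for samples with 36822=1.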

Page 14: DNA computing-based Implementation of  Decision tree

Input Sequence: <00>/<01>/<10>/<11>

        5’ sticky end                              sticky end 3’
<00>    GCATAG    GAAATGAGTT    CTTTACTCAA    CGTATC
<01>    ATAGGC    TGATGCTACA    ACTACGATGT    TATCCG
<10>    AGGCAT    GGTTGTGGCG    CCAACACCGC    TCCGTA
<11>    ATAGGA    CAGTTATTTC    GTCAATAAAG    TATCCT
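In the listed sequences, the second 10-mer of each row is the base-wise Watson-Crick complement of the first, and the right-hand 6-mer is the complement of the left-hand 6-mer. The sketch below checks this property. Assigning the rows to <00>, <01>, <10>, <11> in listing order is an assumption, as are the variable names.

```python
# Translation table for base-wise complementation (A<->T, C<->G), no reversal.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# (left 6-mer, first 10-mer, second 10-mer, right 6-mer), copied from the slide.
# The label-to-row mapping is assumed from the listing order.
rows = {
    "<00>": ("GCATAG", "GAAATGAGTT", "CTTTACTCAA", "CGTATC"),
    "<01>": ("ATAGGC", "TGATGCTACA", "ACTACGATGT", "TATCCG"),
    "<10>": ("AGGCAT", "GGTTGTGGCG", "CCAACACCGC", "TCCGTA"),
    "<11>": ("ATAGGA", "CAGTTATTTC", "GTCAATAAAG", "TATCCT"),
}


def complement(seq):
    """Return the base-wise complement of a DNA string."""
    return seq.translate(COMPLEMENT)


checks = {
    label: (complement(body) == body_c, complement(left) == right)
    for label, (left, body, body_c, right) in rows.items()
}
for label, (body_ok, end_ok) in checks.items():
    print(label, "10-mers complementary:", body_ok, "| 6-mers complementary:", end_ok)
```

A check like this is a quick way to catch transcription errors when sequences are copied from a slide into a simulation.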

Page 15: DNA computing-based Implementation of  Decision tree

Implementation steps

1. Rule-representing sequences
2. Hybridization
3. Construction of random paths
4. Fluorescence detection: check whether a specific rule appears consecutively
5. Repeat steps 3-4
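The steps above can be mimicked in software with a toy Monte Carlo model. This is only a sketch under stated assumptions and not the authors' simulator: hybridization is modeled as drawing rule units at random in proportion to their copy numbers, a path is the sequence of draws, and "consecutive appearance" is taken to mean the same rule drawn twice in a row. The copy numbers 1000, 900, 800, 700 and the 1000 hybridizations come from the first simulation run in the deck; assigning them to <00>, <01>, <10>, <11> in that order is an assumption.

```python
import random


def simulate(copies, n_hybridizations, seed=0):
    """Toy model: weighted random path plus counts of back-to-back repeats."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    labels = list(copies)
    weights = [copies[label] for label in labels]
    # Draw one rule unit per hybridization event, weighted by copy number.
    path = rng.choices(labels, weights=weights, k=n_hybridizations)
    consecutive = {label: 0 for label in labels}
    for prev, cur in zip(path, path[1:]):
        if prev == cur:
            consecutive[cur] += 1
    return path, consecutive


# Assumed mapping of copy numbers to rule labels.
copies = {"<00>": 1000, "<01>": 900, "<10>": 800, "<11>": 700}
path, consecutive = simulate(copies, 1000)
print(consecutive)
```

Repeating the call with different seeds corresponds to repeating the path construction and detection steps.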

Page 16: DNA computing-based Implementation of  Decision tree

Simulation Results

• 1st: copies of each rule sequence: 1000, 900, 800, 700; number of hybridizations: 1000

[Chart "1st": number of consecutively appearing sequences (y-axis, 0 to 900) for <00>, <01>, <10>, <11>]

Page 17: DNA computing-based Implementation of  Decision tree

Simulation Results

• 2nd:

[Chart "2nd": number of consecutively appearing sequences (y-axis, 0 to 1000) for <00>, <01>, <10>, <11>]

Page 18: DNA computing-based Implementation of  Decision tree

Simulation Results

• 3rd:

[Chart "3rd": number of consecutively appearing sequences (y-axis, 0 to 700) for <00>, <01>, <10>, <11>]

Page 19: DNA computing-based Implementation of  Decision tree

Simulation Results

• 4th:

[Chart "4th": number of consecutively appearing sequences (y-axis, 0 to 1000) for <00>, <01>, <10>, <11>]

Page 20: DNA computing-based Implementation of  Decision tree

Summary of Simulation Results

(0,0): 0.85
(0,1): 0.91
(1,0): 0.62
(1,1): 0.87

Page 21: DNA computing-based Implementation of  Decision tree

Calculation of Root Node with Simulation Results

Each cell: original count : count scaled by the summary value for its row

         36822       1894        34836       1915        38982
(0,0)    13:11.05    39:33.15    24:20.4     46:39.1     54:45.9
(0,1)    49:44.59    6:5.46      55:50.05    14:12.74    24:21.84
(1,0)    47:29.14    21:13.02    36:22.32    14:8.68     6:3.72
(1,1)    11:9.57     54:46.98    5:4.35      46:40.02    36:31.3
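The paired numbers on this slide appear to be each original count followed by that count multiplied by the value listed for its pair type in the simulation summary (for example, 13 × 0.85 = 11.05). Assuming that reading, the sketch below reproduces the scaling; the variable names are illustrative.

```python
# Summary values per pair type, copied from the simulation summary slide.
rates = {(0, 0): 0.85, (0, 1): 0.91, (1, 0): 0.62, (1, 1): 0.87}

# Original (value, class) pair counts per gene, copied from the count table.
counts = {
    "36822": {(0, 0): 13, (0, 1): 49, (1, 0): 47, (1, 1): 11},
    "1894":  {(0, 0): 39, (0, 1): 6,  (1, 0): 21, (1, 1): 54},
    "34836": {(0, 0): 24, (0, 1): 55, (1, 0): 36, (1, 1): 5},
    "1915":  {(0, 0): 46, (0, 1): 14, (1, 0): 14, (1, 1): 46},
    "38982": {(0, 0): 54, (0, 1): 24, (1, 0): 6,  (1, 1): 36},
}

# ASSUMPTION: simulated count = original count * summary value for the pair type.
scaled = {
    gene: {pair: n * rates[pair] for pair, n in c.items()}
    for gene, c in counts.items()
}

for gene, c in scaled.items():
    print(gene, {pair: round(v, 2) for pair, v in c.items()})
```

Because the four pair types are scaled by different factors, the relative sizes of the pairs within a gene shift, which can change which attribute scores best at the root.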

Page 22: DNA computing-based Implementation of  Decision tree

Validation of decision trees resulting from DNA computing and digital computing

Number of genes    Digital computing    DNA computing
3                  82%                  70%
5                  90%                  75%

Page 23: DNA computing-based Implementation of  Decision tree

Discussion

• Due to nonspecific hybridization, the simulation results differed from the calculated results

• Lack of a pruning process

• Cost

• A more specific hybridization process is needed